This article provides information and solutions to common errors and warnings you might encounter during indexing and AI enrichment in Azure AI Search.
Indexing stops when the error count exceeds maxFailedItems. If you want indexers to ignore these errors and skip over failed documents, update the maxFailedItems and maxFailedItemsPerBatch parameters as described here.

Each failed document, along with its document key (when available), appears as an error in the indexer execution status. If you set the indexer to tolerate failures, you can use the Index API to manually upload the failed documents later.
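For example, here's a minimal sketch of the parameters section of an indexer definition that tolerates any number of failed documents (a value of -1 means no limit; choose values that match your tolerance for failures):

    "parameters": {
        "maxFailedItems": -1,
        "maxFailedItemsPerBatch": -1
    }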
The error information in this article can help you resolve errors, allowing indexing to continue.
Warnings don't stop indexing, but they do indicate conditions that could result in unexpected outcomes. Whether you take action or not depends on the data and your scenario.
Where can you find specific indexer errors?
To verify an indexer status and identify errors in the Azure portal, follow these steps:

1. Sign in to the Azure portal and find your search service.
2. On the left, expand Search Management > Indexers and select an indexer.
3. Under Execution History, select the status. All statuses, including Success, have details about the execution.
4. If there's an error, hover over the error message. A pane appears on the right side of your screen displaying detailed information about the error.
Transient errors
For various reasons, such as transient network communication interruptions, timeouts from long-running processes, or specific document nuances, it's common to encounter transient errors or warnings during indexer runs. However, these errors are temporary and should be resolved in subsequent indexer runs.
To manage these errors effectively, we recommend putting your indexer on a schedule, for instance, to run every five minutes, where the next run starts five minutes after the previous run completes, subject to the maximum runtime limit on your service. Regularly scheduled runs help rectify transient errors or warnings.
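As a minimal sketch, a schedule like the following in the indexer definition runs the indexer every five minutes (PT5M is the minimum interval):

    "schedule": {
        "interval": "PT5M"
    }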
If an error persists over multiple indexer runs, it's likely not a transient issue. In such cases, refer to the list below for potential solutions.
Error properties
| Property | Description | Example |
| --- | --- | --- |
| Key | The ID of the document impacted by the error or warning. | Azure Storage example, where the default ID is the metadata storage path: https://<storageaccount>.blob.core.windows.net/jfk-1k/docid-32112954.pdf |
| Name | The operation causing the error or warning, generated from the structure [category].[subcategory].[resourceType].[resourceName]. | DocumentExtraction.azureblob.myBlobContainerName, Enrichment.WebApiSkill.mySkillName, Projection.SearchIndex.OutputFieldMapping.myOutputFieldName, Projection.SearchIndex.MergeOrUpload.myIndexName, Projection.KnowledgeStore.Table.myTableName |
| Message | A high-level description of the error or warning. | Could not execute skill because the Web Api request failed. |
| Details | Specific information that might be helpful in diagnosing the issue, such as the Web API response if executing a custom skill failed. | link-cryptonyms-list - Error processing the request record : System.ArgumentNullException: Value cannot be null. Parameter name: source at System.Linq.Enumerable.All[TSource](IEnumerable 1 source, Func 2 predicate) at Microsoft.CognitiveSearch.WebApiSkills.JfkWebApiSkills. ...rest of stack trace... |
| DocumentationLink | A link to relevant documentation with detailed information to debug and resolve the issue. This link often points to one of the sections on this page. | https://go.microsoft.com/fwlink/?linkid=2106475 |
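These properties come from the indexer execution status, which you can also retrieve programmatically through the Get Indexer Status REST API. The following trimmed response is a sketch only; exact property names and values can vary by API version:

    {
      "lastResult": {
        "status": "transientFailure",
        "errors": [
          {
            "key": "https://<storageaccount>.blob.core.windows.net/jfk-1k/docid-32112954.pdf",
            "name": "Enrichment.WebApiSkill.mySkillName",
            "errorMessage": "Could not execute skill because the Web Api request failed.",
            "details": "...",
            "documentationLink": "https://go.microsoft.com/fwlink/?linkid=2106475"
          }
        ],
        "warnings": []
      },
      "executionHistory": []
    }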
Error: Could not read document
The indexer was unable to read the document from the data source. This can happen due to:

| Reason | Details/Example | Resolution |
| --- | --- | --- |
| Inconsistent field types across different documents | "Type of value has a mismatch with column type. Couldn't store '{47.6,-122.1}' in authors column. Expected type is JArray." "Error converting data type nvarchar to float." "Conversion failed when converting the nvarchar value '12 months' to data type int." "Arithmetic overflow error converting expression to data type int." | Ensure that the type of each field is the same across different documents. For example, if the first document's 'startTime' field is a DateTime and in the second document it's a string, this error is hit. |
| Errors from the data source's underlying service | From Azure Cosmos DB: {"Errors":["Request rate is large"]} | Check your storage instance to ensure it's healthy. You might need to adjust your scaling or partitioning. |
| Transient issues | A transport-level error has occurred when receiving results from the server. (provider: TCP Provider, error: 0 - An existing connection was forcibly closed by the remote host) | Occasionally there are unexpected connectivity issues. Try running the document through your indexer again later. |
Error: Could not extract content or metadata from your document
The indexer with a Blob Storage data source couldn't extract the content or metadata from the document (for example, a PDF file). This can happen due to:

| Reason | Details/Example | Resolution |
| --- | --- | --- |
| Blob is over the size limit | Document is '150441598' bytes, which exceeds the maximum size '134217728' bytes for document extraction for your current service tier. | Blob indexing errors |
| Blob has unsupported content type | Document has unsupported content type 'image/png' | Blob indexing errors |
| Blob is encrypted | Document could not be processed - it may be encrypted or password protected. | You can skip the blob with blob settings. |
| Transient issues | "Error processing blob: The request was aborted: The request was canceled." "Document timed out during processing." | Occasionally there are unexpected connectivity issues. Try running the document through your indexer again later. |
Error: Could not parse document
The indexer read the document from the data source, but there was an issue converting the document content into the specified field mapping schema. This can happen due to:

| Reason | Details/Example | Resolution |
| --- | --- | --- |
| The document key is missing | Document key cannot be missing or empty | Ensure all documents have valid document keys. The document key is determined by setting the 'key' property as part of the index definition. Indexers emit this error when the property flagged as the 'key' can't be found on a particular document. |
| The document key is invalid | Invalid document key. Keys can only contain letters, digits, underscore (_), dash (-), or equal sign (=). | Ensure all documents have valid document keys. Review Indexing Blob Storage for more details. If you're using the blob indexer and your document key is the metadata_storage_path field, make sure that the indexer definition has a base64Encode mapping function with parameters equal to null, instead of the path in plain text. See the field mapping sketch after this table. |
| The document key is invalid | Document key cannot be longer than 1024 characters | Modify the document key to meet the validation requirements. |
| Couldn't apply field mapping to a field | Could not apply mapping function 'functionName' to field 'fieldName'. Array cannot be null. Parameter name: bytes | Double-check the field mappings defined on the indexer, and compare them with the data of the specified field of the failed document. It might be necessary to modify the field mappings or the document data. |
| Couldn't read field value | Could not read the value of column 'fieldName' at index 'fieldIndex'. A transport-level error has occurred when receiving results from the server. (provider: TCP Provider, error: 0 - An existing connection was forcibly closed by the remote host.) | These errors are typically due to unexpected connectivity issues with the data source's underlying service. Try running the document through your indexer again later. |
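For the invalid-document-key case above, here's a minimal sketch of an indexer field mapping that base64-encodes metadata_storage_path into the key field (the target field name id is an assumption; use your index's key field):

    "fieldMappings": [
      {
        "sourceFieldName": "metadata_storage_path",
        "targetFieldName": "id",
        "mappingFunction": {
          "name": "base64Encode",
          "parameters": null
        }
      }
    ]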
Error: Could not map output field 'xyz' to search index due to deserialization problem while applying mapping function 'abc'
The output mapping might have failed because the output data is in the wrong format for the mapping function you're using. For example, applying the base64Encode mapping function to binary data would generate this error. To resolve the issue, either rerun the indexer without specifying a mapping function, or ensure that the mapping function is compatible with the output field data type. See Output field mapping for details.
Error: Could not execute skill
The indexer wasn't able to run a skill in the skillset.
| Reason | Details/Example | Resolution |
| --- | --- | --- |
| Transient connectivity issues | A transient error occurred. Try again later. | Occasionally there are unexpected connectivity issues. Try running the document through your indexer again later. |
| Potential product bug | An unexpected error occurred. | This indicates an unknown class of failure and can indicate a product bug. File a support ticket to get help. |
| A skill has encountered an error during execution | (From Merge Skill) One or more offset values were invalid and couldn't be parsed. Items were inserted at the end of the text | Use the information in the error message to fix the issue. This kind of failure requires action to resolve. |
Error: Could not execute skill because the Web API request failed
The skill execution failed because the call to the Web API failed. Typically, this class of failure occurs when custom skills are used, in which case you need to debug your custom code to resolve the issue. If instead the failure is from a built-in skill, refer to the error message for help with fixing the issue.
While debugging this issue, be sure to pay attention to any skill input warnings for this skill. Your Web API endpoint might be failing because the indexer is passing it unexpected input.
Error: Could not execute skill because Web API skill response is invalid
The skill execution failed because the call to the Web API returned an invalid response. Typically, this class of failure occurs when custom skills are used, in which case you need to debug your custom code to resolve the issue. If instead the failure is from a built-in skill, file a support ticket to get assistance.
Error: Type of value has a mismatch with column type. Couldn't store in 'xyz' column. Expected type is 'abc'
If your data source has a field with a different data type than the field you're trying to map in your index, you might encounter this error. Check your data source field data types and make sure they're mapped correctly to your index data types.
Error: Skill did not execute within the time limit
There are two cases under which you might encounter this error message, each of which should be treated differently. Follow the instructions below depending on what skill returned this error for you.
Built-in Azure AI services skills
Many of the built-in cognitive skills, such as language detection, entity recognition, or OCR, are backed by an Azure AI services API endpoint. Sometimes there are transient issues with these endpoints and a request will time out. For transient issues, there's no remedy except to wait and try again. As a mitigation, consider setting your indexer to run on a schedule. Scheduled indexing picks up where it left off. Assuming transient issues are resolved, indexing and cognitive skill processing should be able to continue on the next scheduled run.

If you continue to see this error on the same document for a built-in cognitive skill, file a support ticket to get assistance, as this isn't expected.
Custom skills
If you encounter a timeout error with a custom skill, there are a couple of things you can try. First, review your custom skill and ensure that it's not getting stuck in an infinite loop and that it's returning a result consistently. Once you have confirmed that a result is returned, check the duration of execution. If you didn't explicitly set a timeout value on your custom skill definition, then the default timeout is 30 seconds. If 30 seconds isn't long enough for your skill to execute, you can specify a higher timeout value on your custom skill definition. Here's an example of a custom skill definition where the timeout is set to 90 seconds:

    {
      "@odata.type": "#Microsoft.Skills.Custom.WebApiSkill",
      "uri": "<your custom skill uri>",
      "batchSize": 1,
      "timeout": "PT90S",
      "context": "/document",
      "inputs": [
        {
          "name": "input",
          "source": "/document/content"
        }
      ],
      "outputs": [
        {
          "name": "output",
          "targetName": "output"
        }
      ]
    }
The maximum value that you can set for the timeout parameter is 230 seconds. If your custom skill is unable to execute consistently within 230 seconds, you might consider reducing the batchSize of your custom skill so that it has fewer documents to process within a single execution. If you have already set your batchSize to 1, you need to rewrite the skill to be able to execute in under 230 seconds, or otherwise split it into multiple custom skills so that the execution time for any single custom skill is a maximum of 230 seconds. Review the custom skill documentation for more information.
Error: Could not 'MergeOrUpload' | 'Delete' document to the search index
The document was read and processed, but the indexer couldn't add it to the search index. This can happen due to:
| Reason | Details/Example | Resolution |
| --- | --- | --- |
| A field contains a term that is too large | A term in your document is larger than the 32-KB limit | You can avoid this restriction by ensuring the field isn't configured as filterable, facetable, or sortable. See the field definition sketch after this table. |
| Document is too large to be indexed | A document is larger than the maximum API request size | How to index large data sets |
| Document contains too many objects in collection | A collection in your document exceeds the maximum elements across all complex collections limit. "The document with key '1000052' has '4303' objects in collections (JSON arrays). At most '3000' objects are allowed to be in collections across the entire document. Remove objects from collections and try indexing the document again." | We recommend reducing the size of the complex collection in the document to below the limit to avoid high storage utilization. |
| Trouble connecting to the target index (that persists after retries) because the service is under other load, such as querying or indexing | Failed to establish connection to update index. Search service is under heavy load. | Scale up your search service |
| Search service is being patched for service update, or is in the middle of a topology reconfiguration | Failed to establish connection to update index. Search service is currently down/Search service is undergoing a transition. | Configure service with at least three replicas for 99.9% availability per SLA documentation |
| Failure in the underlying compute/networking resource (rare) | Failed to establish connection to update index. An unknown failure occurred. | Configure indexers to run on a schedule to pick up from a failed state. |
| An indexing request made to the target index wasn't acknowledged within a timeout period due to network issues | Couldn't establish connection to the search index in a timely manner. | Configure indexers to run on a schedule to pick up from a failed state. Additionally, try lowering the indexer batch size if this error condition persists. |
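For the oversized-term case above, here's a minimal sketch of an index field definition that keeps the field searchable and retrievable but not filterable, facetable, or sortable (the field name description is an assumption; adjust the attributes to your query needs):

    {
      "name": "description",
      "type": "Edm.String",
      "searchable": true,
      "retrievable": true,
      "filterable": false,
      "facetable": false,
      "sortable": false
    }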
Error: Could not index document because some of the document's data was not valid
The document was read and processed by the indexer, but due to a mismatch between the configuration of the index fields and the data extracted and processed by the indexer, it couldn't be added to the search index. This can happen due to:

| Reason | Details/Example |
| --- | --- |
| Data type of one or more fields extracted by the indexer is incompatible with the data model of the corresponding target index field. | The data field '_data_' in the document with key '888' has an invalid value 'of type 'Edm.String''. The expected type was 'Collection(Edm.String)'. |
| Failed to extract any JSON entity from a string value. | Could not parse value 'of type 'Edm.String'' of field '_data_' as a JSON object. Error:'After parsing a value an unexpected character was encountered: ''. Path '_path_', line 1, position 3162.' |
| Failed to extract a collection of JSON entities from a string value. | Could not parse value 'of type 'Edm.String'' of field '_data_' as a JSON array. Error:'After parsing a value an unexpected character was encountered: ''. Path '[0]', line 1, position 27.' |
| An unknown type was discovered in the source document. | Unknown type '_unknown_' cannot be indexed |
| An incompatible notation for geography points was used in the source document. | WKT POINT string literals are not supported. Use GeoJson point literals instead |

In all these cases, refer to Supported data types and Data type map for indexers to make sure that you build the index schema correctly and have set up appropriate indexer field mappings. The error message includes details that can help track down the source of the mismatch.
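As an illustration of the first row, if the source provides a JSON array of strings, the target field has to be declared as a collection. Here's a minimal sketch of such a field in the index definition (the field name tags is an assumption):

    {
      "name": "tags",
      "type": "Collection(Edm.String)",
      "searchable": true,
      "filterable": true
    }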
Error: Integrated change tracking policy cannot be used because table has a composite primary key
This error applies to SQL tables, and usually happens when the key is defined as a composite key or when the table has a unique clustered index (as in a SQL index, not an Azure AI Search index). The main reason is that the key attribute is modified to be a composite primary key in a unique clustered index. In that case, make sure that your SQL table doesn't have a unique clustered index, or that you map the key field to a field that is guaranteed not to have duplicate values.
Error: Could not process document within indexer max run time
This error occurs when the indexer is unable to finish processing a single document from the data source within the allowed execution time.
Maximum running time is shorter when skillsets are used. When this error occurs, if you have maxFailedItems set to a value other than 0, the indexer bypasses the document on future runs so that indexing can progress. If you can't afford to skip any document, or if you're seeing this error consistently, consider breaking documents into smaller documents so that partial progress can be made within a single indexer execution.
Error: Could not project document
This error occurs when the indexer attempts to project data into a knowledge store and the attempt fails. This failure could be consistent and fixable, or it could be a transient failure with the projection output sink that you might need to wait and retry to resolve. Here's a set of known failure states and possible resolutions.
| Reason | Details/Example | Resolution |
| --- | --- | --- |
| Couldn't update projection blob 'blobUri' in container 'containerName' | The specified container doesn't exist. | The indexer checks whether the specified container has been previously created and creates it if necessary, but it only performs this check once per indexer run. This error means that something deleted the container after this step. To resolve this error, leave your storage account information alone, wait for the indexer to finish, and then rerun the indexer. |
| Couldn't update projection blob 'blobUri' in container 'containerName' | Unable to write data to the transport connection: An existing connection was forcibly closed by the remote host. | This is expected to be a transient failure with Azure Storage and thus should be resolved by rerunning the indexer. If you encounter this error consistently, file a support ticket so it can be investigated further. |
| Couldn't update row 'projectionRow' in table 'tableName' | The server is busy. | This is expected to be a transient failure with Azure Storage and thus should be resolved by rerunning the indexer. If you encounter this error consistently, file a support ticket so it can be investigated further. |
Error: The cognitive service for skill '<skill-name>' has been throttled
Skill execution failed because the call to Azure AI services was throttled. Typically, this class of failure occurs when too many skills are executing in parallel. If you're using the Microsoft.Search.Documents client library to run the indexer, you can use the SearchIndexingBufferedSender to get automatic retry on failed steps. Otherwise, you can reset and rerun the indexer.
Error: Expected IndexAction metadata
An 'Expected IndexAction metadata' error means that when the indexer attempted to read the document to identify what action should be taken, it didn't find any corresponding metadata on the document. Typically, this error occurs when the indexer has an annotation cache added or removed without resetting the indexer. To address this, reset and rerun the indexer.
Warning: Skill input was invalid
An input to the skill was missing, it has the wrong type, or it's otherwise invalid. You might see the following information:

Could not execute skill
Skill executed but may have unexpected results

Cognitive skills have required inputs and optional inputs. For example, the Key phrase extraction skill has two required inputs, text and languageCode, and no optional inputs. Custom skill inputs are all considered optional inputs.

If necessary inputs are missing or if the input isn't the right type, the skill gets skipped and generates a warning. Skipped skills don't generate outputs. If downstream skills consume the outputs of the skipped skill, they can generate other warnings.

If an optional input is missing, the skill still runs, but it might produce unexpected output due to the missing input.

In both cases, this warning is due to the shape of your data. For example, if you have a document containing information about people with the fields firstName, middleName, and lastName, you might have some documents that don't have an entry for middleName. If you pass middleName as an input to a skill in the pipeline, then it's expected that this skill input is missing some of the time. You need to evaluate your data and scenario to determine whether or not any action is required as a result of this warning.

If you want to provide a default value for a missing input, you can use the Conditional skill to generate a default value and then use the output of the Conditional skill as the skill input:

    {
      "@odata.type": "#Microsoft.Skills.Util.ConditionalSkill",
      "context": "/document",
      "inputs": [
        { "name": "condition", "source": "= $(/document/language) == null" },
        { "name": "whenTrue", "source": "= 'en'" },
        { "name": "whenFalse", "source": "= $(/document/language)" }
      ],
      "outputs": [ { "name": "output", "targetName": "languageWithDefault" } ]
    }
| Reason | Details/Example | Resolution |
| --- | --- | --- |
| Skill input is the wrong type | "Required skill input was not of the expected type String. Name: text, Source: /document/merged_content." "Required skill input wasn't of the expected format. Name: text, Source: /document/merged_content." "Cannot iterate over non-array /document/normalized_images/0/imageCelebrities/0/detail/celebrities." "Unable to select 0 in non-array /document/normalized_images/0/imageCelebrities/0/detail/celebrities" | Certain skills expect inputs of particular types. For example, the Sentiment skill expects text to be a string. If the input specifies a nonstring value, then the skill doesn't execute and generates no outputs. Ensure your data set has input values uniform in type, or use a Custom Web API skill to preprocess the input. If you're iterating the skill over an array, check that the skill context and input have `*` in the correct positions. Usually both the context and input source should end with `*` for arrays. |
| Skill input is missing | "Required skill input is missing. Name: text, Source: /document/merged_content" "Missing value /document/normalized_images/0/imageTags." "Unable to select 0 in array /document/pages of length 0." | If this warning occurs for all documents, there could be a typo in the input paths. Check the property name casing. Check for an extra or missing `*` in the path. Verify that the documents from the data source provide the required inputs. |
| Skill language code input is invalid | Skill input languageCode has the following language codes X,Y,Z, at least one of which is invalid. | See more details below. |
One or more of the values passed into the optional languageCode input of a downstream skill isn't supported. This can occur if you're passing the output of the LanguageDetectionSkill to subsequent skills, and the output consists of more languages than are supported in those downstream skills.

Note that you can also get a warning similar to this one if an invalid countryHint input gets passed to the LanguageDetectionSkill. If that happens, validate that the field you're using from your data source for that input contains valid ISO 3166-1 alpha-2 two-letter country codes. If some are valid and some are invalid, continue with the following guidance, but replace languageCode with countryHint and defaultLanguageCode with defaultCountryHint to match your use case.

If you know that your data set is all in one language, you should remove the LanguageDetectionSkill and the languageCode skill input and use the defaultLanguageCode skill parameter for that skill instead, assuming the language is supported for that skill.

If you know that your data set contains multiple languages and thus you need the LanguageDetectionSkill and languageCode input, consider adding a ConditionalSkill to filter out the text with languages that aren't supported before passing the text to the downstream skill. Here's an example of what this might look like for the EntityRecognitionSkill:

    {
      "@odata.type": "#Microsoft.Skills.Util.ConditionalSkill",
      "context": "/document",
      "inputs": [
        { "name": "condition", "source": "= $(/document/language) == 'de' || $(/document/language) == 'en' || $(/document/language) == 'es' || $(/document/language) == 'fr' || $(/document/language) == 'it'" },
        { "name": "whenTrue", "source": "/document/content" },
        { "name": "whenFalse", "source": "= null" }
      ],
      "outputs": [ { "name": "output", "targetName": "supportedByEntityRecognitionSkill" } ]
    }
Here are some references for the currently supported languages for each of the skills that can produce this error message:
EntityRecognitionSkill supported languages
EntityLinkingSkill supported languages
KeyPhraseExtractionSkill supported languages
LanguageDetectionSkill supported languages
PIIDetectionSkill supported languages
SentimentSkill supported languages
Translator supported languages
Text SplitSkill supported languages: da, de, en, es, fi, fr, it, ko, pt
Warning: Skill input was truncated
Cognitive skills limit the length of text that can be analyzed at one time. If the text input exceeds the limit, the text is truncated before it's enriched. The skill executes, but not over all of your data.

In the example LanguageDetectionSkill below, the 'text' input field might trigger this warning if the input is over the character limit. Input limits can be found in the skills reference documentation.

    {
      "@odata.type": "#Microsoft.Skills.Text.LanguageDetectionSkill",
      "inputs": [
        {
          "name": "text",
          "source": "/document/text"
        }
      ],
      "outputs": [ ... ]
    }
If you want to ensure that all text is analyzed, consider using the Split skill.
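As a minimal sketch, a Text Split skill like the following breaks long content into pages before it reaches a downstream skill (the 4,000-character page length and the /document/content source are assumptions; size the pages to the input limit of the downstream skill):

    {
      "@odata.type": "#Microsoft.Skills.Text.SplitSkill",
      "context": "/document",
      "textSplitMode": "pages",
      "maximumPageLength": 4000,
      "inputs": [
        { "name": "text", "source": "/document/content" }
      ],
      "outputs": [
        { "name": "textItems", "targetName": "pages" }
      ]
    }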
Warning: Web API skill response contains warnings
The indexer ran the skill in the skillset, but the response from the Web API request indicates there are warnings. Review the warnings to understand how your data is impacted and whether further action is required.
Warning: The current indexer configuration does not support incremental progress
This warning only occurs for Azure Cosmos DB data sources.
Incremental progress during indexing ensures that if indexer execution is interrupted by transient failures or execution time limit, the indexer can pick up where it left off next time it runs, instead of having to reindex the entire collection from scratch. This is especially important when indexing large collections.
The ability to resume an unfinished indexing job is predicated on having documents ordered by the _ts column. The indexer uses the timestamp to determine which document to pick up next. If the _ts column is missing, or if the indexer can't determine whether a custom query is ordered by it, the indexer starts at the beginning and you'll see this warning.

It's possible to override this behavior, enabling incremental progress and suppressing this warning, by using the assumeOrderByHighWaterMarkColumn configuration property.

For more information, see Incremental progress and custom queries.
Warning: Some data was lost during projection. Row 'X' in table 'Y' has string property 'Z' which was too long.
The Table Storage service has limits on how large entity properties can be. Strings can have 32,000 characters or less. If a row with a string property longer than 32,000 characters is being projected, only the first 32,000 characters are preserved. To work around this issue, avoid projecting rows with string properties longer than 32,000 characters.
Warning: Truncated extracted text to X characters
Indexers limit how much text can be extracted from any one document. This limit depends on the pricing tier: 32,000 characters for Free tier, 64,000 for Basic, 4 million for Standard, 8 million for Standard S2, and 16 million for Standard S3. Text that was truncated won't be indexed. To avoid this warning, try breaking apart documents with large amounts of text into multiple, smaller documents.

For more information, see Indexer limits.
Warning: Could not map output field 'X' to search index
Output field mappings that reference nonexistent or null data produce warnings for each document and result in an empty index field. To work around this issue, double-check your output field mapping source paths for possible typos, or set a default value using the Conditional skill. See Output field mapping for details.
| Reason | Details/Example | Resolution |
| --- | --- | --- |
| Can't iterate over non-array | "Cannot iterate over non-array /document/normalized_images/0/imageCelebrities/0/detail/celebrities." | This error occurs when the output isn't an array. If you think the output should be an array, check the indicated output source field path for errors. For example, you might have a missing or extra `*` in the source field name. It's also possible that the input to this skill is null, resulting in an empty array. Find similar details in the Skill input was invalid section. |
| Unable to select 0 in non-array | "Unable to select 0 in non-array /document/pages." | This could happen if the skill's output doesn't produce an array and the output source field name has an array index or `*` in its path. Double-check the paths provided in the output source field names and the field value for the indicated field name. Find similar details in the Skill input was invalid section. |
Warning: The data change detection policy is configured to use key column 'X'
Data change detection policies have specific requirements for the columns they use to detect change. One of these requirements is that the column is updated every time the source item is changed. Another requirement is that the new value for the column is greater than the previous value. Key columns don't fulfill this requirement because they don't change on every update. To work around this issue, select a different column for the change detection policy.
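For example, a high water mark policy on a column that increases with every update satisfies both requirements. Here's a minimal sketch of the data source fragment (the column name lastUpdated is an assumption; for Azure SQL, a rowversion column is a common choice):

    "dataChangeDetectionPolicy": {
      "@odata.type": "#Microsoft.Azure.Search.HighWaterMarkChangeDetectionPolicy",
      "highWaterMarkColumnName": "lastUpdated"
    }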
Warning: Document text appears to be UTF-16 encoded, but is missing a byte order mark
The indexer parsing modes need to know how text is encoded before parsing it. The two most common ways of encoding text are UTF-16 and UTF-8. UTF-8 is a variable-length encoding where each character is between 1 byte and 4 bytes long. UTF-16 encodes most characters in 2 bytes and comes in two variants, big endian and little endian. Text encoding is determined by a byte order mark, a series of bytes before the text.

| Encoding | Byte Order Mark |
| --- | --- |
| UTF-16 big endian | 0xFE 0xFF |
| UTF-16 little endian | 0xFF 0xFE |
| UTF-8 | 0xEF 0xBB 0xBF |

If no byte order mark is present, the text is assumed to be encoded as UTF-8.
To work around this warning, determine what the text encoding for this blob is and add the appropriate byte order mark.
Warning: Azure Cosmos DB collection 'X' has a Lazy indexing policy. Some data may be lost
Collections with Lazy indexing policies can't be queried consistently, resulting in your indexer missing data. To work around this warning, change your indexing policy to Consistent.
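Here's a minimal sketch of the Azure Cosmos DB container indexing policy this warning is asking for (set on the Cosmos DB container, not on the search indexer):

    "indexingPolicy": {
      "indexingMode": "consistent",
      "automatic": true
    }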
Warning: The document contains very long words (longer than 64 characters). These words may result in truncated and/or unreliable model predictions.
This warning is passed from the Language service of Azure AI services. In some cases, it's safe to ignore this warning, for example, if the long string is just a long URL. Be aware that when a word is longer than 64 characters, it's truncated to 64 characters, which can affect model predictions.
Indexers have document size limits. Make sure that the documents in your data source are smaller than the supported size limit, as documented for your service tier.