Why is it when creating an index, my indexer is stuck in "validating" for hours?

Question

I'm only trying to index 10 JSON documents from a blob container. Not even 1 MB in total, I believe.

Answer

I found at least one cause that reliably reproduces this problem for me:

Repro steps:

Source documents are JSON files in Azure Blob Storage.
During index field configuration, metadata_storage_path is suggested as the key field by default. Do not accept this suggestion.
Instead, expand the "content" field (so it shows all your own JSON fields).
Select e.g. "id" from your content fields as the key field.

You will then get stuck in "Validating...".
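For comparison, here is a minimal sketch of the configuration that keeps the default key, written as a Python dict shaped like an indexer definition for the REST API. The names `my-indexer`, `my-blob-datasource`, `my-index`, and the target field name `key` are hypothetical placeholders, and I have not verified that defining the indexer this way avoids the "Validating..." hang. The `base64Encode` mapping function is there because blob paths contain characters that are not valid in a document key.

```python
import json


def build_indexer_definition(name, data_source, index, key_field="key"):
    """Return a dict shaped like an Azure Cognitive Search indexer definition.

    The blob path is kept as the document key (the wizard's default
    suggestion) instead of a field taken from the JSON content.
    """
    return {
        "name": name,
        "dataSourceName": data_source,
        "targetIndexName": index,
        "fieldMappings": [
            {
                # Blob metadata field that the wizard suggests as the key.
                "sourceFieldName": "metadata_storage_path",
                # Hypothetical name of the key field in the target index.
                "targetFieldName": key_field,
                # Encode the path so it only contains key-safe characters.
                "mappingFunction": {"name": "base64Encode"},
            }
        ],
    }


# All three names below are placeholders for illustration.
definition = build_indexer_definition("my-indexer", "my-blob-datasource", "my-index")
print(json.dumps(definition, indent=2))
```

Your own "id" field can still be added to the index as a regular (non-key) field.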

Other problems with this configuration:

Cognitive Skills will not work, because you can only set them on the "metadata_*" fields and not on your content fields.

There are probably more problems when importing JSON. ~~You must also take great care to select "JSON" as the parsing mode in advanced settings on the last page of the wizard.~~ (Edit: keep the default parsing mode. JSON mode created other problems for me.) Also note that your documents may be too long for the Free or Basic SKU (32k and 64k characters, respectively).
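To check the length concern before running the wizard, something like the following could be used. It is only a sketch: it reads the 32k and 64k figures above as character counts of the document text (my assumption), and the function name `oversized_documents` and the sample file names are made up for illustration.

```python
import json

# Limits quoted above, interpreted here as characters per document (assumption).
SKU_CHAR_LIMITS = {"free": 32_000, "basic": 64_000}


def oversized_documents(documents, sku):
    """Return the names of documents whose text is longer than the SKU limit.

    `documents` maps a document name to its raw text content.
    """
    limit = SKU_CHAR_LIMITS[sku]
    return [name for name, text in documents.items() if len(text) > limit]


# Two made-up documents: one tiny, one with a 40,000-character body.
docs = {
    "small.json": json.dumps({"id": "1", "body": "x" * 100}),
    "large.json": json.dumps({"id": "2", "body": "x" * 40_000}),
}

print(oversized_documents(docs, "free"))   # ['large.json']
print(oversized_documents(docs, "basic"))  # []
```

With real data you would read each blob (or a local copy of it) into the `documents` dict instead of generating text.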

My gut feeling is that this simple indexer will mostly work for plain-text or PDF documents, and not so much for structured JSON.

Answer

Thanks for your reply! When indexing data into Azure Cognitive Search, the main categories of failure include:

Connecting to a data source or other resources
Document processing
Document ingestion to an index

Refer to this document for troubleshooting: https://video2.skills-academy.com/en-us/azure/search/search-indexer-troubleshooting
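To see which of these categories a failing run falls into, it can help to look at the indexer status rather than the portal spinner. The sketch below summarizes a status payload, assuming it is shaped like the indexer status response from the REST API (`status`, `lastResult.status`, `lastResult.itemsProcessed`, `lastResult.itemsFailed`, `lastResult.errors`). The sample payload is invented for illustration and is not real service output.

```python
def summarize_indexer_status(payload):
    """Return human-readable lines describing the last indexer run."""
    last = payload.get("lastResult") or {}
    lines = [
        f"indexer: {payload.get('status', 'unknown')}, "
        f"last run: {last.get('status', 'none')}",
        f"items processed: {last.get('itemsProcessed', 0)}, "
        f"failed: {last.get('itemsFailed', 0)}",
    ]
    # Each per-document error carries the document key and a message.
    for err in last.get("errors", []):
        lines.append(
            f"error for {err.get('key', '<no key>')}: {err.get('errorMessage', '')}"
        )
    return lines


# Invented sample payload, for illustration only.
sample = {
    "status": "running",
    "lastResult": {
        "status": "transientFailure",
        "itemsProcessed": 9,
        "itemsFailed": 1,
        "errors": [{"key": "doc-7", "errorMessage": "made-up message"}],
    },
}

for line in summarize_indexer_status(sample):
    print(line)
```

An indexer that has never completed a run may have no `lastResult` at all, which the function reports as "last run: none".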

Note that indexers have limited support for accessing data sources and other resources that are secured by Azure network security mechanisms. Currently, indexers can only access such data sources via the corresponding IP address range restrictions or, where applicable, NSG rules.

Let us know.
