Why is it when creating an index, my indexer is stuck in "validating" for hours?

Sebastian Hernandez 6 Reputation points
2021-06-12T19:41:17.083+00:00

I'm only trying to index 10 JSON documents from a blob. Not even 1 MB, I believe.

Azure AI Search
An Azure search service with built-in artificial intelligence capabilities that enrich information to help identify and explore relevant content at scale.

2 answers

  1. Peter Resele 46 Reputation points
    2024-02-07T16:10:04.16+00:00

    I found at least one reason (reliably causes this problem for me):

    Repro steps:

    • source documents are JSON in an Azure Blob Storage
    • during index field configuration, metadata_storage_path is suggested by default as the key field. If you don't accept this but instead
    • expand the "content" field (so it shows all your own JSON fields)
    • select e.g. "id" from your content fields as the key field

    you will get stuck in "Validating..."
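
For comparison, here is a minimal sketch of an indexer definition that keeps the default key choice described above instead of a content field such as "id". It only builds the request body as a Python dict; the names "blob-json-indexer", "blob-json-datasource" and "blob-json-index" are hypothetical, and the shape follows the Create Indexer REST request as I understand it, which is not necessarily what the portal wizard generates.

```python
import json


def build_indexer_definition(name, data_source, index, parsing_mode="default"):
    """Build a dict shaped like the body of a Create Indexer REST request.

    The blob path is kept as the document key. It is base64-encoded because
    raw storage paths contain characters that are not valid in a key.
    """
    return {
        "name": name,
        "dataSourceName": data_source,
        "targetIndexName": index,
        "parameters": {"configuration": {"parsingMode": parsing_mode}},
        "fieldMappings": [
            {
                "sourceFieldName": "metadata_storage_path",
                "targetFieldName": "metadata_storage_path",
                "mappingFunction": {"name": "base64Encode"},
            }
        ],
    }


if __name__ == "__main__":
    # Hypothetical resource names, for illustration only.
    definition = build_indexer_definition(
        "blob-json-indexer", "blob-json-datasource", "blob-json-index"
    )
    print(json.dumps(definition, indent=2))
```

Creating the indexer from a definition like this bypasses the wizard, so a failure should surface as an error message rather than an endless "Validating..." state.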

    Other problems with this configuration:

    • Cognitive Skills will not work, because you can only set them on the "metadata_*" fields and not on your content field

    There are probably more problems when importing JSON. You must also take great care with the parsing mode in advanced settings on the last page of the wizard. "JSON" looks like the one to select, but keep the default: JSON mode created other problems for me. So your documents may be too long for the Free or Basic SKU (32k and 64k, respectively).
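
A quick way to find out whether document length is the issue is to measure the source files before indexing. The sketch below assumes the 32k and 64k figures quoted above are character counts and that you have the raw text of each blob available locally; the function name and the file names in the usage example are hypothetical.

```python
# Limits quoted in the answer above. Treating them as character counts
# is an assumption; check the service limits for your SKU.
LIMITS = {"free": 32_000, "basic": 64_000}


def oversized_documents(docs, sku):
    """Return (name, length) for every document longer than the SKU limit.

    docs maps a document name to its raw text content.
    """
    limit = LIMITS[sku]
    return [
        (name, len(text))
        for name, text in docs.items()
        if len(text) > limit
    ]


if __name__ == "__main__":
    # Hypothetical documents, for illustration only.
    docs = {"small.json": "x" * 10, "big.json": "y" * 40_000}
    for name, length in oversized_documents(docs, "free"):
        print(f"{name}: {length} characters exceeds the Free limit")
```

If nothing is reported for your SKU, document length is probably not what is blocking the indexer.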

    My gut feeling is that this simple indexer will mostly work for plain-text or PDF documents, and not so much for structured JSON.


  2. SnehaAgrawal-MSFT 20,396 Reputation points
    2021-06-15T14:49:59.193+00:00

    Thanks for the reply! When indexing data into Azure Cognitive Search, the main categories of failure include:

    1. Connecting to a data source or other resources
    2. Document processing
    3. Document ingestion to an index

    Refer to this document for troubleshooting: https://video2.skills-academy.com/en-us/azure/search/search-indexer-troubleshooting

    Note that indexers have limited support for accessing data sources and other resources that are secured by Azure network security mechanisms. Currently, indexers can only access data sources via corresponding IP address range restriction mechanisms or NSG rules when applicable.
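
To see which of the failure categories above applies, it can help to read the indexer status directly rather than waiting on the portal. The following is a rough sketch, assuming the Get Indexer Status REST endpoint and an admin API key; the service name, indexer name and key are placeholders you would supply, and the helper names are made up for this example.

```python
import json
import urllib.request

API_VERSION = "2020-06-30"  # one stable REST API version; newer ones exist


def fetch_indexer_status(service, indexer, api_key):
    """Call the Get Indexer Status endpoint and return the parsed JSON body."""
    url = (
        f"https://{service}.search.windows.net/indexers/{indexer}/status"
        f"?api-version={API_VERSION}"
    )
    request = urllib.request.Request(url, headers={"api-key": api_key})
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def summarize_status(status):
    """Reduce a status payload to the fields most useful for troubleshooting."""
    last = status.get("lastResult") or {}
    return {
        "indexer_status": status.get("status"),
        "last_run_status": last.get("status"),
        "items_processed": last.get("itemsProcessed", 0),
        "items_failed": last.get("itemsFailed", 0),
        "errors": [e.get("errorMessage") for e in last.get("errors") or []],
    }


if __name__ == "__main__":
    # Hand-written sample payload, not real service output.
    sample = {
        "status": "running",
        "lastResult": {
            "status": "transientFailure",
            "itemsProcessed": 8,
            "itemsFailed": 2,
            "errors": [{"key": "doc-1", "errorMessage": "example error"}],
        },
    }
    print(summarize_status(sample))
```

With real credentials, `summarize_status(fetch_indexer_status(service, indexer, api_key))` would give the same summary for your own indexer; a connection failure here points at the first category, while per-document errors point at the second or third.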

    Let us know.