Asynchronous OCR processing
Is there a way to process OCR files (Vision 4.0 or Document Intelligence) asynchronously, using a webhook callback instead of polling for completion?
Azure Computer Vision
Azure AI Document Intelligence
-
navba-MSFT 24,890 Reputation points • Microsoft Employee
2024-09-26T05:51:09.9266667+00:00
@victor uceda uceda Welcome to the Microsoft Q&A Forum, and thank you for posting your query here!
You can use Azure Computer Vision for OCR with asynchronous processing, but native webhook callbacks are not directly supported. However, you can implement a webhook callback mechanism yourself to avoid continuous polling.
Steps to Achieve Asynchronous OCR Processing Using a Webhook
1. Submit the OCR request asynchronously
- Use the POST request for the OCR operation (either in Document Intelligence or Azure Computer Vision).
- Include a webhook URL in your request payload so that you can be notified when processing completes.
2. Set up your webhook endpoint
- Create an endpoint in your application that can receive HTTP POST requests.
- This endpoint handles the incoming POST request once the OCR processing is complete.
3. Handle the webhook callback
- When the OCR processing completes, a POST request is sent to your webhook URL with the results.
- Your webhook endpoint should process the incoming data and handle the OCR results as needed.
The example below uses a webhook callback to avoid continuous polling.
Step 1: Submit the OCR request

```csharp
var ocrRequestUrl = "https://<your-azure-endpoint>/vision/v4.0/read/analyze";

var documentPayload = new
{
    url = "<url-of-your-document>",
    webhookUrl = "https://your-webhook-url"
};

var content = new StringContent(
    JsonConvert.SerializeObject(documentPayload),
    Encoding.UTF8,
    "application/json");

var response = await client.PostAsync(ocrRequestUrl, content);
```
Step 2: Webhook endpoint to handle the callback

```csharp
[HttpPost]
public async Task<IActionResult> WebhookCallback([FromBody] OcrResult result)
{
    if (result.Status == "succeeded")
    {
        // Process the OCR results
        var ocrData = result.AnalyzeResult;
        // Do something with the OCR data
    }
    return Ok();
}
```
This approach eliminates the need for continuous polling and leverages webhooks to handle the OCR results asynchronously.
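Since the service itself does not invoke webhooks, one way to realize the pattern is a small server-side relay that polls the operation status once on your behalf and then POSTs the finished result to the consumer's webhook. Below is a minimal sketch of that idea; all names (relayResult, fetchStatus, postFn) are illustrative, not part of any Azure SDK.

```javascript
// Relay pattern: poll the OCR operation status server-side, then push the
// terminal result to a consumer-supplied webhook so clients never poll.
// `fetchStatus` stands in for a GET on the operation's status URL, and
// `postFn` stands in for an HTTP POST (e.g. via fetch or axios).
async function relayResult(fetchStatus, webhookUrl, postFn, delayMs = 1000) {
  while (true) {
    const op = await fetchStatus();
    if (op.status === 'succeeded' || op.status === 'failed') {
      // Notify the consumer exactly once, with the terminal result.
      await postFn(webhookUrl, op);
      return op.status;
    }
    // Not done yet: wait before checking again.
    await new Promise((resolve) => setTimeout(resolve, delayMs));
  }
}

module.exports = { relayResult };
```

Only this relay polls Azure; every downstream consumer is notified push-style through its webhook.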
Hope this helps. If you have any follow-up questions, please let me know. I would be happy to help.
-
victor uceda uceda 0 Reputation points
2024-09-26T15:18:55.8866667+00:00
Hi, thanks for your reply!
But... are you sure? Does this also work for the Document Intelligence models? I tried it without success, using this Node.js code:

```javascript
  `${endpoint}documentintelligence/documentModels/prebuilt-read:analyze?_overload=analyzeDocument&api-version=2024-07-31-preview`,
  {
    url: documentUrl,
    //webhookUrl: webhookUrl,
  },
  {
    headers: {
      'Ocp-Apim-Subscription-Key': subscriptionKey,
      'Content-Type': 'application/json',
    },
  },
);
```

With the webhookUrl parameter the response is a 400 (statusText: 'Bad Request', 'ms-azure-ai-errorcode': 'InvalidArgument'); without webhookUrl it completes fine.
Am I doing something wrong?
-
navba-MSFT 24,890 Reputation points • Microsoft Employee
2024-09-27T10:34:03.2133333+00:00
@victor uceda uceda I hadn't tested my suggestion above. Since you have tried it and it is not working, let's follow an alternative approach.
Approach 1:
You can modify the implementation you mentioned to use a loop that repeatedly checks whether the result is available, together with a mechanism to cancel the wait, either after a maximum waiting time or after a set number of retries.
Here is an example from our Computer Vision GitHub sample:
```javascript
// If the first REST API method completes successfully, the second
// REST API method retrieves the text written in the image.
//
// Note: The response may not be immediately available. Text
// recognition is an asynchronous operation that can take a variable
// amount of time depending on the length of the text.
// You may need to wait or retry this operation.

// Wait for the read operation to finish, use the operationId to get the result.
while (true) {
  const readOpResult = await computerVisionClient.getReadResult(operationIdUrl)
    .then((result) => { return result; });
  console.log('Read status: ' + readOpResult.status);
  if (readOpResult.status === STATUS_FAILED) {
    console.log('The Read File operation has failed.');
    break;
  }
  if (readOpResult.status === STATUS_SUCCEEDED) {
    console.log('The Read File operation was a success.');
    console.log();
    console.log('Read File URL image result:');
    // Print the text captured
    // Looping through: TextRecognitionResult[], then Line[]
    for (const textRecResult of readOpResult.analyzeResult.readResults) {
      for (const line of textRecResult.lines) {
        console.log(line.text);
      }
    }
    break;
  }
  await sleep(1000);
}
console.log();
```
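The sample above polls forever; here is a minimal sketch of the retry-capped variant described earlier. The helper and its names (pollWithRetries, checkStatus) are illustrative, not part of the SDK; checkStatus stands in for a call such as computerVisionClient.getReadResult(operationIdUrl).

```javascript
// Poll an async status function until the operation completes, giving up
// after maxRetries attempts so a stuck operation cannot hang the caller.
async function pollWithRetries(checkStatus, { maxRetries = 30, delayMs = 1000 } = {}) {
  for (let attempt = 0; attempt < maxRetries; attempt++) {
    const result = await checkStatus();
    if (result.status === 'succeeded') return result;
    if (result.status === 'failed') throw new Error('OCR operation failed');
    // Still running: wait before the next attempt.
    await new Promise((resolve) => setTimeout(resolve, delayMs));
  }
  throw new Error(`OCR result not ready after ${maxRetries} attempts`);
}

// Example with a fake status function that succeeds on the third call.
let calls = 0;
const fakeCheck = async () =>
  (++calls < 3 ? { status: 'running' } : { status: 'succeeded' });

pollWithRetries(fakeCheck, { maxRetries: 5, delayMs: 10 })
  .then((r) => console.log('Final status: ' + r.status)); // prints "Final status: succeeded"
```

A maximum waiting time can be enforced the same way by comparing Date.now() against a deadline instead of counting attempts.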
I also came across a similar discussion here.
Approach 2:
To integrate the Azure AI Vision API with an HTTP / webhook API, you can also refer to this article.
Hope this helps.