Asynchronous OCR processing
Is there a way to process OCR files (Vision 4.0 or Document Intelligence) asynchronously, using a webhook callback instead of polling for completion?
Azure Computer Vision
Azure AI Document Intelligence
-
navba-MSFT 24,890 Reputation points • Microsoft Employee
2024-09-26T05:51:09.9266667+00:00
@victor uceda uceda Welcome to the Microsoft Q&A Forum, and thank you for posting your query here!
You can use Azure Computer Vision for OCR with asynchronous processing, but native webhook callbacks are not directly supported. However, you can implement a webhook callback mechanism yourself to avoid continuous polling.
Steps to Achieve Asynchronous OCR Processing Using a Webhook
1. Submit the OCR request asynchronously
- Use the POST request for the OCR operation (either in Document Intelligence or Azure Computer Vision).
- Include a webhook URL in your request payload so that you can be notified when processing completes.
2. Set up your webhook endpoint
- Create an endpoint in your application that can receive HTTP POST requests.
- This endpoint handles the incoming POST request once the OCR processing is complete.
3. Handle the webhook callback
- When the OCR processing completes, a POST request is sent to your webhook URL with the results.
- Your webhook endpoint should process the incoming data and handle the OCR results as needed.
The example below uses a webhook callback to avoid continuous polling.
Step 1: Submit the OCR request

```csharp
var ocrRequestUrl = "https://<your-azure-endpoint>/vision/v4.0/read/analyze";

var documentPayload = new
{
    url = "<url-of-your-document>",
    webhookUrl = "https://your-webhook-url"
};

var content = new StringContent(
    JsonConvert.SerializeObject(documentPayload),
    Encoding.UTF8,
    "application/json");

var response = await client.PostAsync(ocrRequestUrl, content);
```
Step 2: Webhook endpoint to handle the callback

```csharp
[HttpPost]
public async Task<IActionResult> WebhookCallback([FromBody] OcrResult result)
{
    if (result.Status == "succeeded")
    {
        // Process the OCR results
        var ocrData = result.AnalyzeResult;
        // Do something with the OCR data
    }
    return Ok();
}
```
This approach eliminates the need for continuous polling and leverages webhooks to handle the OCR results asynchronously.
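Since the service itself does not invoke webhooks, one way to realize the pattern is a small server-side relay that polls the operation status once on your behalf and then POSTs the finished result to the consumer's webhook. Below is a minimal sketch of that idea; all names (relayResult, fetchStatus, postFn) are illustrative, not part of any Azure SDK.

```javascript
// Relay pattern: poll the OCR operation status server-side, then push the
// terminal result to a consumer-supplied webhook so clients never poll.
// `fetchStatus` stands in for a GET on the operation's status URL, and
// `postFn` stands in for an HTTP POST (e.g. via fetch or axios).
async function relayResult(fetchStatus, webhookUrl, postFn, delayMs = 1000) {
  while (true) {
    const op = await fetchStatus();
    if (op.status === 'succeeded' || op.status === 'failed') {
      // Notify the consumer exactly once, with the terminal result.
      await postFn(webhookUrl, op);
      return op.status;
    }
    // Not done yet: wait before checking again.
    await new Promise((resolve) => setTimeout(resolve, delayMs));
  }
}

module.exports = { relayResult };
```

Only this relay polls Azure; every downstream consumer is notified push-style through its webhook.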
Hope this helps. If you have any follow-up questions, please let me know. I would be happy to help.
-
victor uceda uceda 0 Reputation points
2024-09-26T15:18:55.8866667+00:00
Hi, thanks for your reply!
But... are you sure? Does this also work for the Document Intelligence models? I tried it without success, using this Node.js code:

```javascript
  `${endpoint}documentintelligence/documentModels/prebuilt-read:analyze?_overload=analyzeDocument&api-version=2024-07-31-preview`,
  {
    url: documentUrl,
    //webhookUrl: webhookUrl,
  },
  {
    headers: {
      'Ocp-Apim-Subscription-Key': subscriptionKey,
      'Content-Type': 'application/json',
    },
  },
);
```

With the webhookUrl parameter the response is a 400 (statusText: 'Bad Request', 'ms-azure-ai-errorcode': 'InvalidArgument'); without webhookUrl it completes fine.
Am I doing something wrong?
-
navba-MSFT 24,890 Reputation points • Microsoft Employee
2024-09-27T10:34:03.2133333+00:00
@victor uceda uceda I hadn't tested my suggestion above. Since you have tried it and it is not working, let's follow an alternative approach.
Approach 1:
You can modify the implementation you mentioned to use a loop that repeatedly checks whether the result is available, together with a mechanism to cancel the wait, either after a maximum waiting time or after a set number of retries.
Here is an example from our Computer Vision GitHub sample:
```javascript
// If the first REST API method completes successfully, the second
// REST API method retrieves the text written in the image.
//
// Note: The response may not be immediately available. Text
// recognition is an asynchronous operation that can take a variable
// amount of time depending on the length of the text.
// You may need to wait or retry this operation.

// Wait for the read operation to finish, use the operationId to get the result.
while (true) {
  const readOpResult = await computerVisionClient.getReadResult(operationIdUrl)
    .then((result) => { return result; });
  console.log('Read status: ' + readOpResult.status);
  if (readOpResult.status === STATUS_FAILED) {
    console.log('The Read File operation has failed.');
    break;
  }
  if (readOpResult.status === STATUS_SUCCEEDED) {
    console.log('The Read File operation was a success.');
    console.log();
    console.log('Read File URL image result:');
    // Print the text captured
    // Looping through: TextRecognitionResult[], then Line[]
    for (const textRecResult of readOpResult.analyzeResult.readResults) {
      for (const line of textRecResult.lines) {
        console.log(line.text);
      }
    }
    break;
  }
  await sleep(1000);
}
console.log();
```
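The sample above polls forever; here is a minimal sketch of the retry-capped variant described earlier. The helper and its names (pollWithRetries, checkStatus) are illustrative, not part of the SDK; checkStatus stands in for a call such as computerVisionClient.getReadResult(operationIdUrl).

```javascript
// Poll an async status function until the operation completes, giving up
// after maxRetries attempts so a stuck operation cannot hang the caller.
async function pollWithRetries(checkStatus, { maxRetries = 30, delayMs = 1000 } = {}) {
  for (let attempt = 0; attempt < maxRetries; attempt++) {
    const result = await checkStatus();
    if (result.status === 'succeeded') return result;
    if (result.status === 'failed') throw new Error('OCR operation failed');
    // Still running: wait before the next attempt.
    await new Promise((resolve) => setTimeout(resolve, delayMs));
  }
  throw new Error(`OCR result not ready after ${maxRetries} attempts`);
}

// Example with a fake status function that succeeds on the third call.
let calls = 0;
const fakeCheck = async () =>
  (++calls < 3 ? { status: 'running' } : { status: 'succeeded' });

pollWithRetries(fakeCheck, { maxRetries: 5, delayMs: 10 })
  .then((r) => console.log('Final status: ' + r.status)); // prints "Final status: succeeded"
```

A maximum waiting time can be enforced the same way by comparing Date.now() against a deadline instead of counting attempts.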
I also came across a similar discussion here.
Approach 2:
To integrate the Azure AI Vision API with an HTTP / webhook API, you can also refer to this article.
Hope this helps.