The OCR engine just doesn't recognize some digits

Hagara 0 Reputation points
2024-06-30T15:22:17.7833333+00:00

Tried the online version of your OCR engine (Vision Studio) with a scanned database printout.
Occasionally some digits (usually the last one of a number) simply get lost.
The source image is clearly readable.
Why can this be the case?
One would think that recognizing a number is the absolute minimum requirement of any OCR engine.

Azure AI Document Intelligence
An Azure service that turns documents into usable data. Previously known as Azure Form Recognizer.

1 answer

  1. hossein jalilian 4,690 Reputation points
    2024-06-30T23:11:14.2533333+00:00

    Thanks for posting your question in the Microsoft Q&A forum.

    The issue you're experiencing with occasional missing digits in OCR results, particularly the last digit of numbers, can occur due to several factors:

    • Even if the image appears readable to the human eye, the OCR engine may struggle with lower resolution or slightly blurred images. Ensure your scans are high-quality and have sufficient resolution (at least 300 DPI).
    • If characters are too close together or touching, the engine might incorrectly segment them, leading to missed digits.
    • Certain fonts or typefaces can be challenging for OCR engines, especially if they have unusual shapes for digits or if the characters are very thin.
    • Small specks, lines, or other artifacts near the digits can confuse the OCR engine, causing it to misinterpret or skip characters.
    • Poor contrast between the text and background can make it difficult for the OCR engine to distinguish characters accurately.
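
    As a rough illustration of the resolution and contrast points above, here is a minimal pre-flight sketch in plain Python. It is not part of Document Intelligence or Vision Studio. The function names, the nested-list grayscale representation, and the contrast threshold of 100 are all assumptions made for this example; only the 300 DPI figure comes from the advice above. In practice you would load the pixel values with an imaging library of your choice.

```python
from typing import List

Gray = List[List[int]]  # rows of grayscale values in the range 0-255


def effective_dpi(pixel_width: int, physical_width_inches: float) -> float:
    """Pixels per inch implied by the scan width and the paper width."""
    return pixel_width / physical_width_inches


def _percentiles(img: Gray, low_pct: float, high_pct: float):
    """Return the dark and bright percentile values of the image."""
    values = sorted(v for row in img for v in row)
    last = len(values) - 1
    return values[int(low_pct * last)], values[int(high_pct * last)]


def contrast_spread(img: Gray, low_pct: float = 0.02, high_pct: float = 0.98) -> int:
    """Gap between the bright and dark percentiles; small means low contrast."""
    lo, hi = _percentiles(img, low_pct, high_pct)
    return hi - lo


def stretch_contrast(img: Gray, low_pct: float = 0.02, high_pct: float = 0.98) -> Gray:
    """Linearly rescale so the dark percentile maps to 0 and the bright one to 255."""
    lo, hi = _percentiles(img, low_pct, high_pct)
    if hi == lo:  # flat image, nothing to stretch
        return [row[:] for row in img]
    scale = 255.0 / (hi - lo)
    return [[min(255, max(0, round((v - lo) * scale))) for v in row] for row in img]


def preflight(pixel_width: int, physical_width_inches: float, img: Gray,
              min_dpi: float = 300.0, min_spread: int = 100) -> List[str]:
    """Collect warnings before sending a scan to OCR.

    min_dpi follows the 300 DPI suggestion above; min_spread is an
    arbitrary illustrative threshold, not a documented requirement.
    """
    warnings = []
    dpi = effective_dpi(pixel_width, physical_width_inches)
    if dpi < min_dpi:
        warnings.append(f"resolution is about {dpi:.0f} DPI, below {min_dpi:.0f}")
    spread = contrast_spread(img)
    if spread < min_spread:
        warnings.append(f"contrast spread is {spread}, below {min_spread}")
    return warnings
```

    For example, `preflight(1700, 8.5, pixels)` on a letter-width page scanned 1700 pixels wide would flag the resolution, since 1700 / 8.5 is 200 DPI. If the contrast warning fires, running `stretch_contrast` on the image before resubmitting is one cheap thing to try; whether it recovers the missing digits depends on the scan.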

    Please don't forget to close out the thread by upvoting this answer and accepting it if it was helpful.
