HACKER Q&A
📣 pjerryhu

What's the best out-of-box Document OCR/Analyzing/recognition API?


I tried Google Document AI and AWS Textract. Both seem a bit hard to use, and AWS is especially pricey.


  👤 westurner Accepted Answer ✓
"Show HN: BetterOCR combines and corrects multiple OCR engines with an LLM" (2023) https://news.ycombinator.com/context?id=38056243 :

junhoyeo/BetterOCR: https://github.com/junhoyeo/BetterOCR :

> Better text detection by combining multiple OCR engines (EasyOCR, Tesseract, and Pororo)
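
To make the "combine multiple engines" idea concrete, here is a minimal sketch in Python. It is not BetterOCR's method (the project uses an LLM to merge and correct the outputs); it swaps in a much simpler consensus pick based on string similarity from the standard library. The function name `pick_consensus` and the sample strings are hypothetical, made up for illustration.

```python
from difflib import SequenceMatcher


def pick_consensus(candidates):
    """Return the OCR result that is, on average, most similar to the others.

    Assumption: `candidates` holds the raw text each engine produced for the
    same image (e.g. one string each from EasyOCR, Tesseract, and Pororo).
    """
    if not candidates:
        raise ValueError("need at least one OCR result")
    if len(candidates) == 1:
        return candidates[0]

    def avg_similarity(i):
        # Mean similarity ratio between candidate i and every other candidate.
        others = [c for j, c in enumerate(candidates) if j != i]
        total = sum(SequenceMatcher(None, candidates[i], o).ratio() for o in others)
        return total / len(others)

    # On ties, max() keeps the earliest candidate.
    best = max(range(len(candidates)), key=avg_similarity)
    return candidates[best]


if __name__ == "__main__":
    # Hypothetical outputs from three engines for the same scanned line.
    results = [
        "Invoice total: 120.00",
        "Invoice total: 120.00",
        "lnvoice tota1: I20.00",
    ]
    print(pick_consensus(results))
```

This only selects one whole output and never merges partial results, so it helps when engines disagree on a few characters but not when each engine gets a different part of the page right.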


👤 constantinum
Unstract is about a month away from the open-source launch of its tool for automating document-based workflows. They also have plug-and-play APIs: https://unstract.com/api-for-unstructured-data/

👤 tmaly
I was using a Python module called easyocr. It was working for a while, but stopped recently.

I switched over to pytesseract, a wrapper for tesseract. It was pretty simple to use and I got it working in 10 minutes.
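
For anyone wanting to try the same route, a minimal sketch of the pytesseract approach might look like the following. It assumes the Tesseract binary is installed and on your PATH, plus `pip install pytesseract pillow`. The helper name `extract_text`, the `ocr` parameter, and the file name `scan.png` are hypothetical; only `pytesseract.image_to_string` and `PIL.Image.open` come from the libraries themselves.

```python
def extract_text(image_path, lang="eng", ocr=None):
    """Run OCR on one image and return the text with whitespace tidied up.

    `ocr` is an optional callable taking a path and returning raw text. It is
    a hypothetical hook so another engine can be swapped in; by default the
    function uses pytesseract.
    """
    if ocr is None:
        # Imported lazily so the helper still loads when a custom engine is used.
        import pytesseract
        from PIL import Image

        def ocr(path):
            with Image.open(path) as img:
                return pytesseract.image_to_string(img, lang=lang)

    raw = ocr(image_path)
    # Collapse runs of whitespace inside each line and drop empty lines.
    lines = [" ".join(line.split()) for line in raw.splitlines()]
    return "\n".join(line for line in lines if line)


if __name__ == "__main__":
    # "scan.png" is a placeholder path; point it at a real image to try this.
    print(extract_text("scan.png"))
```

Passing `lang` through lets you pick a different Tesseract language pack, provided that pack is installed locally.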


👤 freddealmeida
Tesseract OCR is open source and very good. https://github.com/tesseract-ocr/tesseract

👤 symmetrist
Tesseract can run locally, and you can also use other open-source OCR models that are built on neural networks. I am pretty sure it is possible to fine-tune open-source LLMs on images and get reasonable results.