My current setup is:
- Scanner hooked up to a Raspberry Pi
- When I push the button on the scanner, the Pi scans the document and saves one .tif file per page to my NAS
- A script running in a k8s pod monitors that folder and performs some steps on each .tif file, like increasing the contrast, cropping the edges, detecting blank pages, etc.
- The same script then converts those .tif files to a PDF and runs Tesseract on it for OCR
- That PDF gets uploaded to a folder in my Nextcloud instance
It's not great, but I can either use my local file explorer to search through the OCR'd PDFs or (more slowly) search inside Nextcloud's web UI using the fulltextsearch plugin.
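For anyone wanting to build something similar, here is a rough sketch of the watch, filter, and OCR steps described above. It is not the actual script. The function names, the thresholds, and the `eng` language code are all assumptions made for illustration, and it simplifies things by having Tesseract write a searchable PDF straight from each .tif instead of converting to PDF first. Blank-page detection here works on a flat list of grayscale pixel values, which you would get from whatever image library you use.

```python
from pathlib import Path
from typing import List, Sequence, Set


def is_blank_page(pixels: Sequence[int], dark_threshold: int = 128,
                  max_dark_fraction: float = 0.005) -> bool:
    """Treat a page as blank when almost none of its pixels are dark.

    `pixels` is a flat sequence of 8-bit grayscale values (0 = black,
    255 = white). Both thresholds are illustrative guesses, not tuned values.
    """
    if not pixels:
        return True
    dark = sum(1 for p in pixels if p < dark_threshold)
    return dark / len(pixels) <= max_dark_fraction


def new_tif_files(folder: Path, seen: Set[Path]) -> List[Path]:
    """Return .tif files in `folder` that are not in `seen` yet, sorted by
    path, and remember them so the next poll skips them."""
    fresh = sorted(p for p in folder.glob("*.tif") if p not in seen)
    seen.update(fresh)
    return fresh


def tesseract_pdf_command(tif: Path, out_base: Path, lang: str = "eng") -> List[str]:
    """Build a Tesseract CLI call that writes a searchable PDF.

    Tesseract appends the extension itself, so the result is `<out_base>.pdf`.
    """
    return ["tesseract", str(tif), str(out_base), "-l", lang, "pdf"]
```

A polling loop would call `new_tif_files` every few seconds, skip pages where `is_blank_page` returns true, and pass each remaining command to `subprocess.run(cmd, check=True)`.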
And this slightly older FileBasedMiniDMS: https://github.com/stweiss/FileBasedMiniDMS
I tried both Mayan and Paperless (regular) myself to replace my Evernote Premium setup, but neither has convinced me yet.
I am currently trying out https://github.com/jbarlow83/OCRmyPDF (I had to fork it to add my own language to the Dockerfile). Afterwards I will either let my NAS index the results or maybe use Dropbox/Nextcloud. Apparently the files also get indexed very well locally by GNOME (Linux), Finder (Mac), or Explorer (Windows).
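In case it helps, this is roughly how the OCRmyPDF step could be scripted once the language pack is installed. It is a minimal sketch, not my actual setup: the function names are made up, and `deu` is only a placeholder for whatever Tesseract language code you need. `-l` selects the OCR language and `--skip-text` leaves pages that already have a text layer untouched.

```python
import subprocess
from pathlib import Path
from typing import List


def ocrmypdf_command(src: Path, dst: Path, lang: str = "deu") -> List[str]:
    """Build the ocrmypdf CLI call; `lang` is a Tesseract language code
    ("deu" is just a placeholder default)."""
    return ["ocrmypdf", "-l", lang, "--skip-text", str(src), str(dst)]


def run_ocr(src: Path, dst: Path, lang: str = "deu") -> None:
    """Run ocrmypdf and raise CalledProcessError if it exits non-zero."""
    subprocess.run(ocrmypdf_command(src, dst, lang), check=True)
```

Calling `run_ocr(Path("scan.pdf"), Path("scan_ocr.pdf"))` would then write a searchable copy next to the original, which the NAS or desktop indexer can pick up.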
I use Swiftscan Pro on iOS to take pics of receipts or single page documents. It OCRs and pushes them into Evernote.
I use PDFPen Pro on Mac to OCR longer documents scanned in using the office scanner/printer. This is triggered when I drop a file into a monitored folder: my AppleScript launches PDFPen, performs OCR, and then imports the result into Evernote.
I have another monitored folder to import PDFs that don't require OCR (they are just imported into Evernote).
And lastly if I get an email with a PDF attachment, I forward it to my special Evernote email address where it's automatically imported.
The main reason I haven't moved away from Evernote is that I want access to my files on all my devices at any time, and I want the service I use to outlive me. Evernote so far hasn't failed on either of these promises (though the latter does worry me somewhat).