Looking for a 24-7 Real-Time Voice Transcription Tool

Question

I'm on the hunt for a voice transcription tool that can operate continuously, in real-time, 24/7, to enhance my workflow. I need something that doesn't require constant starting and stopping.

I've looked into a few options, but none of them seem to provide the non-stop, real-time transcription I'm after:

- Siri/Google Assistant: They're great for short dictations, but they don't offer continuous transcription.
- Otter.ai: This tool provides real-time transcription for meetings and interviews, but using it 24/7 could get expensive.
- Rev Voice Recorder: It lacks real-time capabilities and needs to be manually activated.
- The NSA: I won't be able to pass their security clearance because of that "classified" recording of my ex in bed... snoring like a freight train all night.

Before I decide to build or modify a tool myself, I thought I'd ask here:

Does anyone know of a tool that can provide 24/7 transcription?

I've started sketching out some initial ideas here:

https://github.com/8ta4/say

But my main goal is to find out if such a tool already exists so I don't end up reinventing the wheel.

its-summertime · Accepted Answer

https://github.com/abb128/LiveCaptions comes to mind. Its libraries and models are readily available, so you can rework it to behave the way you want, and it can run 24/7 if you don't mind the CPU usage.

tikkun · Answer

I'm interested in the same thing and have spent quite a bit of time looking.
- Rewind.ai is OK (transcription accuracy is meh).
- Voice Memos.app is OK (though there's no native transcription, and it requires stopping and starting).
- Otter.ai is OK (though there's a 4-hour limit on recordings, and there's no paid plan that allows enough recording minutes to do 24/7).
My ideal solution would be that Otter comes out with a Pro 24/7 plan with 60,000 minutes per month and no max recording length, for $60-80/mo.
I would pay for this and have paid for alternatives, though I'd prefer to use an existing company that I've used for a while and that has lots of users, due to privacy/trust, or perhaps a small startup that publishes security reports and does everything on device.
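As a rough check on those numbers, here is a small Python sketch of the arithmetic. It assumes a 31-day month as the worst case, and the constant and function names are only illustrative.

```python
# Back-of-the-envelope check: how many minutes does 24/7 recording need?
MINUTES_PER_DAY = 24 * 60  # 1,440 minutes in a day


def minutes_in_month(days: int) -> int:
    """Minutes of audio produced by recording non-stop for `days` days."""
    return days * MINUTES_PER_DAY


proposed_quota = 60_000            # minutes per month in the hypothetical plan
worst_case = minutes_in_month(31)  # assumption: longest possible month
headroom = proposed_quota - worst_case

# With a 4-hour cap per recording, a full day needs this many back-to-back files.
recordings_per_day = 24 // 4

print(worst_case)          # 44640
print(headroom)            # 15360
print(recordings_per_day)  # 6
```

So a 60,000-minute quota would cover even a 31-day month with room to spare, while the 4-hour cap alone would mean restarting the recording six times a day.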
As an aside:
I use 24/7 voice transcription as a kind of "extended context window" (to use an LLM analogy). While I'm working, I talk out loud to myself about what I'm thinking through, which I find allows me to effectively increase my working memory size to be much larger than otherwise. It's quite helpful.

wyldfire · Answer

> Before I decide to build or modify a tool myself, I thought I'd ask here:

If you do decide to, start with ggml/whisper.

noman-land · Answer

You can do this via command line with whisper.cpp.
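A minimal sketch of what that could look like, assuming the microphone streaming example that ships with whisper.cpp. The binary path and model path below are assumptions that depend on your checkout and build (older builds name the binary `stream` rather than `whisper-stream`), and `build_stream_command` is a hypothetical helper, not part of whisper.cpp. The script only assembles and prints the command; it does not run it.

```python
import shlex


def build_stream_command(
    binary: str = "./build/bin/whisper-stream",  # assumption: path varies by build
    model: str = "./models/ggml-base.en.bin",    # assumption: any downloaded ggml model
    threads: int = 8,
    step_ms: int = 500,     # how often a new chunk of audio is processed
    length_ms: int = 5000,  # how much audio each transcription pass covers
) -> list[str]:
    """Assemble the argument list for whisper.cpp's streaming example."""
    return [
        binary,
        "-m", model,
        "-t", str(threads),
        "--step", str(step_ms),
        "--length", str(length_ms),
    ]


command = build_stream_command()
# Print a copy-pasteable shell command rather than executing anything.
print(shlex.join(command))
```

Running the printed command from a whisper.cpp checkout, with the streaming example built and a model downloaded, should keep transcribing from the microphone until you interrupt it. As far as I know, that example needs SDL2 for audio capture.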