HACKER Q&A
📣 8ta4

Looking for a 24/7 Real-Time Voice Transcription Tool


I'm on the hunt for a voice transcription tool that can run continuously, in real time, 24/7, to enhance my workflow. I need something that doesn't require constant starting and stopping.

I've looked into a few options, but none of them seem to provide the non-stop, real-time transcription I'm after:

- Siri/Google Assistant: They're great for short dictations, but they don't offer continuous transcription.
- Otter.ai: This tool provides real-time transcription for meetings and interviews, but using it 24/7 could get expensive.
- Rev Voice Recorder: It lacks real-time capabilities and needs to be manually activated.
- The NSA: I won't be able to pass their security clearance because of that "classified" recording of my ex in bed... snoring like a freight train all night.

Before I decide to build or modify a tool myself, I thought I'd ask here:

Does anyone know of a tool that can provide 24/7 transcription?

I've started sketching out some initial ideas here:

https://github.com/8ta4/say

But my main goal is to find out if such a tool already exists so I don't end up reinventing the wheel.


  👤 its-summertime Accepted Answer ✓
https://github.com/abb128/LiveCaptions comes to mind. The libraries and models it uses are readily available, so you can rework it to behave how you want, and it can run 24/7 if you don't mind the CPU usage.

👤 tikkun
I'm interested in the same thing and have spent quite a bit of time looking.

Rewind.ai is ok (transcription accuracy is meh)

Voice Memos.app is ok (though no native transcription, and requires stopping and starting)

Otter.ai is ok (though there's a 4-hour limit on recordings, and there's no paid plan that allows for enough recording minutes to do 24/7)

My ideal solution would be that Otter comes out with a Pro 24/7 plan with 60,000 minutes per month and no max recording length, for $60-80/mo.
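
As a rough sanity check on that number, here is the arithmetic for true round-the-clock recording. It's only a sketch; the inputs are calendar month lengths, and `minutes_per_month` is just a helper name made up for illustration:

```python
# Minutes of audio produced by recording non-stop for a whole month.
MINUTES_PER_DAY = 24 * 60  # 1,440


def minutes_per_month(days):
    """Total minutes in a month of the given length, recording 24/7."""
    return days * MINUTES_PER_DAY


for days in (30, 31):
    print(days, minutes_per_month(days))
# prints:
# 30 43200
# 31 44640
```

So a 60,000-minute plan would cover even a 31-day month of continuous recording, with some headroom.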

I would pay for this and have paid for alternatives, though I'd prefer to use an existing company that I've used for a while and that has lots of users, due to privacy/trust, or perhaps a small startup that publishes security reports and does everything on device.

As an aside:

I use 24/7 voice transcription as a kind of "extended context window" (to use an LLM analogy). While I'm working, I talk out loud to myself about what I'm thinking through, which I find allows me to effectively increase my working memory size to be much larger than otherwise. It's quite helpful.


👤 wyldfire
> Before I decide to build or modify a tool myself, I thought I'd ask here:

If you do decide to, start with ggml/whisper.


👤 noman-land
You can do this via command line with whisper.cpp.
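
For anyone who wants to keep that running around the clock, here is a minimal sketch of a wrapper, not a tested setup. It assumes you've built whisper.cpp's streaming example and downloaded a model; the binary path (`./stream` here, though the name and location depend on your whisper.cpp version and build), the model path, and the log file name are all placeholders. The `-m`, `-t`, `--step`, and `--length` flags are the ones shown in the stream example's README. The helper names are made up for illustration.

```python
import datetime
import subprocess
import time
from pathlib import Path


def build_stream_command(binary, model, step_ms=500, length_ms=5000, threads=4):
    """Build the argument list for whisper.cpp's streaming example.

    `binary` and `model` are placeholder paths; adjust them to your build.
    """
    return [
        str(binary),
        "-m", str(model),
        "-t", str(threads),
        "--step", str(step_ms),
        "--length", str(length_ms),
    ]


def stamp(line, now=None):
    """Prefix one transcript line with an ISO-8601 timestamp."""
    now = now or datetime.datetime.now()
    return f"{now.isoformat(timespec='seconds')} {line.strip()}"


def run_forever(command, log_path, restart_delay_s=2.0):
    """Run the transcriber, append its output to a log, restart if it exits."""
    log_path = Path(log_path)
    while True:
        with subprocess.Popen(command, stdout=subprocess.PIPE, text=True) as proc, \
                log_path.open("a", encoding="utf-8") as log:
            for line in proc.stdout:
                if line.strip():  # skip blank lines
                    log.write(stamp(line) + "\n")
                    log.flush()
        # The process ended (crash, device unplugged, etc.); wait, then retry.
        time.sleep(restart_delay_s)
```

Calling `run_forever(build_stream_command("./stream", "models/ggml-base.en.bin"), "transcript.log")` would start the loop. The raw output may include terminal control sequences from the tool redrawing partial results, which this sketch doesn't strip, so the log would likely need some cleanup.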