Is there demand for a local-first live transcription tool?

Question

As a hearing-impaired software engineer, I have built my own local-first solution that transcribes entirely locally, in real time, word by word. It's my daily driver for transcribing meetings, interviews, etc. Because it is local-first, I do not have to worry about privacy when transcribing meetings at work, as all data stays on my machine. It's about as fast as Otter.ai, although there's definitely room for improvement in UX and speed. The caveat is that it only works on MacBooks with Apple silicon and is fronted by a very simple TUI.

I am thinking of putting it up as an open source project on GitHub to garner interest. If interest is high, commercializing it as a product would be a next step. Before I go too deep in that direction, I'm curious about the market demand. What does the HN community think about the demand for a local-first live transcription tool?

talldayo · Accepted Answer

> If interest is high, commercializing it as a product would be a next step
Why? OpenAI's Whisper is a free local model with native GUIs on most operating systems. I wrote one myself in a weekend; it's one of hundreds of similar GUIs that exist.
What functionality would you be offering that people can't otherwise get for free? This is a pretty competitive space, and trying to sell a TUI is going to leave you gravely disappointed by the sales (unless it's extremely impressive).

yen223 · Answer

The concern I have for commercialising this product is that there's a very high chance Apple will roll this out as a feature in macOS 17 or 18. It seems like a natural evolution of Apple Intelligence.
Edit: Just noticed that Apple Notes in Sequoia already has "Live audio transcription"

sz429961 · Answer

I built one and use it daily myself. It uses Whisper for local transcription and Llama 3.2 for local AI conversation. It has good UX as well, and supports live recording or importing any audio/video files.
https://apps.apple.com/us/app/notes-ai/id6477414161

oulipo · Answer

There's the MacWhisper app which seems to be used a lot, so perhaps you could integrate, or do something similar? I guess if you open-source it the MacWhisper dev can also integrate it

satvikpendem · Answer

What model are you using? Whisper Turbo can do real time transcription and there are implementations that are pretty fast.

sfmz · Answer

Is it stronger/better than Real-time Whisper WebGPU?
https://huggingface.co/spaces/Xenova/realtime-whisper-webgpu

compressedgas · Answer

This reminded me of https://github.com/usefulsensors/moonshine

mkbkn · Answer

I would love a local transcription app that works on Android and Linux.

masterj · Answer

Yes, my partner is hearing-impaired and I've been looking for a tool like this for them
