Does somebody here know the technical reasons?
I find speech-to-text dictation a productive way to crank out a draft. So far, voice typing is the easiest approach I've found on Linux. But I'd prefer Firefox. Any other suggestions for easy STT on Linux? Punctuation recognition is preferred but optional. It's a Kubuntu system with only a CPU, no GPU.
IMO it's good if we don't make web platform features infeasible to implement locally, so I think Mozilla's stance makes sense. A new browser should not need the backing of a massive cloud provider willing to give away compute for free.
And just FYI, I'm working on turning my state-of-the-art speech recognition paper (TEVR, token entropy variance reduction) into a developer-friendly binary that will run offline on your GPU, and hence be fully private, and offer a scripting API on localhost. The test WER on LibriSpeech clean is 2.3%, so slightly worse than an offline wav2vec2 1B, but years ahead of Kaldi, Vosk, Coqui and the usual streaming cloud services.
I estimate it'll be 2 more months until I can post Linux binaries. Won't be open source, though.
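For anyone unfamiliar with the WER figure quoted above: word error rate is the word-level edit distance (substitutions + deletions + insertions) between the reference transcript and the recognizer output, divided by the number of reference words. Here is a minimal sketch of that calculation; the function name `word_error_rate` is my own for illustration and is not taken from TEVR or any of the toolkits mentioned. It also skips the text normalization (case, punctuation) that real benchmarks apply before scoring.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr

    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One substituted word out of four reference words.
    print(word_error_rate("a b c d", "a x c d"))
```

Read as a percentage, a WER of 2.3 would mean roughly 23 word errors per 1,000 reference words. Benchmark numbers are normally computed over the whole test set (total errors divided by total reference words), not by averaging per-utterance scores.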
https://github.com/ideasman42/nerd-dictation
Get speech models here:
https://github.com/alphacep/vosk-api
HN discussion:
Google systematically abuses its position to make Firefox less appealing to people.