HACKER Q&A
📣 bryanrasmussen

Looking for a Firefox compatible extension for voice controlled actions


Basically, I would like an extension where I could say "click publish" and it would find the button with the text "publish" and click it.

Is there anything like this for FF? I ask because the FF voice repo got shut down a couple of years ago and I haven't found a competitor.


  👤 SCdF Accepted Answer ✓
I control my browser (actually my whole computer) with my voice.

I'm not aware of anything specific where you could say "click publish" and it would intelligently find the publish button, though I haven't looked.

What I use is a combination of Vimium (a vim-style browser plugin; there are a few) and Talon (https://talonvoice.com/), either just relying on the vim shortcuts but with my voice, or, for specific domains, making Talon shortcuts.

I have also heard good things about Rango, which is supposed to be a Vimium-like thing but much more voice-focused: https://github.com/david-tejada/rango. I haven't tried it though.


👤 gebbie
I do all my computing by voice but just using normal keyboard-focused programs and context-free voice control.

I use the https://qutebrowser.org web browser, which is keyboard-focused, and https://numenvoice.org (which I also develop).


👤 tim1994
Assuming this does not yet exist: If anyone is interested in building something like this (that works offline) please let me know, I'd like to join. I have some experience building web extensions but unfortunately not the time to develop & maintain this on my own right now.

👤 invalid-access
Same here. Sadly, my dad is beginning to lose sight in both eyes now and has trouble operating essential websites (e.g. his bank's webpages). Does anyone know of browser extensions that enable a high-level voice command language? I'm thinking commands like:

* "Open Bank of America"

* "Login with saved password"

* "What's my savings bank balance?"

Yeah, I know it's fraught with security challenges and is likely acceptable only if using local models. Still, anything out there?


👤 DeTheBug
You may want to give https://serenade.ai a try. They have browser support (might be Chrome only? not sure) and it's good enough. It has the "click publish" feature, and if that doesn't work you can say something like "show inputs" and it will show numbers next to each input, so instead of saying "click publish" you say "click 14", etc.
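
To illustrate the numbered fallback described above, here is a rough TypeScript sketch. It is not Serenade's code; the names `numberHints` and `resolveNumberCommand` are made up for this example, and it assumes the recognizer returns digits ("click 14") rather than words ("click fourteen").

```typescript
// Assign 1-based hint numbers to a list of items (e.g. the page's inputs).
// Hypothetical helper, not part of any real extension.
function numberHints<T>(items: T[]): Map<number, T> {
  const hints = new Map<number, T>();
  items.forEach((item, i) => hints.set(i + 1, item));
  return hints;
}

// Resolve a transcript such as "click 14" to the item carrying that number.
// Returns null if the transcript is not "click <digits>" or the number is unused.
function resolveNumberCommand<T>(
  transcript: string,
  hints: Map<number, T>
): T | null {
  const match = transcript.trim().toLowerCase().match(/^click\s+(\d+)$/);
  if (!match) return null;
  return hints.get(Number(match[1])) ?? null;
}
```

In a real extension the items would be DOM elements and each number would also be drawn next to its element on the page.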

👤 BtM909
As an FYI: Firefox also supports most Chrome extensions, so it might help you to go through the Chrome store as well :)

👤 osrec
Building this in JS would not be too difficult using the speech to text API, if that's available to you.

I'm not aware of an extension that does this out of the box though.
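
For anyone who wants to try, here is a minimal TypeScript sketch of the matching half of the problem: turning a transcript like "click publish" into a click on the element whose label matches. It assumes the transcript comes from somewhere else (for example the Web Speech API's `SpeechRecognition`, whose availability varies by browser). The `Clickable` interface and all function names are hypothetical, invented for this example.

```typescript
// Hypothetical stand-in for a clickable page element. In a content script
// this would wrap a real DOM element and its visible text.
interface Clickable {
  label: string;
  click: () => void;
}

// Parse a transcript such as "click publish" into the target label.
// Returns null when the transcript is not a click command.
function parseClickCommand(transcript: string): string | null {
  const match = transcript.trim().toLowerCase().match(/^click\s+(.+)$/);
  return match ? match[1].trim() : null;
}

// Prefer an exact (case-insensitive) label match, then fall back to substring.
function findTarget(target: string, candidates: Clickable[]): Clickable | null {
  const norm = (s: string) => s.trim().toLowerCase();
  const exact = candidates.find((c) => norm(c.label) === target);
  if (exact) return exact;
  return candidates.find((c) => norm(c.label).includes(target)) ?? null;
}

// Returns true if the transcript was a click command and a target was clicked.
function handleTranscript(transcript: string, candidates: Clickable[]): boolean {
  const target = parseClickCommand(transcript);
  if (target === null) return false;
  const el = findTarget(target, candidates);
  if (el === null) return false;
  el.click();
  return true;
}
```

In a content script, the candidates could be built from something like `document.querySelectorAll("button, a, input[type=submit]")`, using each element's text as the label. Handling misheard words and ambiguous labels is where most of the real work would be.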


👤 devilsAdv0cate
You can still use Firefox Voice if you don't mind setting up your own STT server.

https://juliacambre.com/firefox-voice/