I currently use "Read Aloud: A Text to Speech Voice Reader" but the free version's voice is a bit robotic.
Can OpenAI Whisper be used as a plugin?
[edit]: An offline method generating audio for an input text is also fine (non-realtime TTS)
https://en.wikipedia.org/wiki/Pepper%27s_ghost
and think even a generic character with low-end motion capture and some out-of-the-box motion animations should be "good enough", but I'd still need a voice actor to get acceptable vocals.
The trouble I see there isn't just that the default voice is "robotic" but that a real voice actor can take direction. You'd hope a voice actor has a good intuition for how to make a character come alive but you can always ask for adjustments. For current A.I. voices you can at best talk to the hand.
Reading text on a page is less demanding, but the system still has to adjust the tone of voice, prosody and such to match the emotional tone of the content. Maybe this can be done without emotion or a simulation of emotion on an end-to-end basis, but it has to be done.
https://developer.mozilla.org/en-US/docs/Web/API/Web_Speech_...
At the moment I'm relying primarily on AWS. It has a couple of good neural voices that I enjoy listening to, and I sync the output to S3 with an optional SNS (Simple Notification Service) notification. Glad to see someone else has seen a need for it, but I've also been thinking about how to do it in a way that's agnostic of AWS.
Right now I can go into reader view, copy and paste the content into the AWS Polly interface, paste the bucket name, paste the SNS ARN, wait for it to finish, find the file in the bucket, and then open it.
I want all of that in 3 steps: start the conversion, find the result easily in a better user interface, hit play.
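For what it's worth, the "start the conversion" step can probably be scripted rather than clicked through. A rough sketch, assuming boto3 is installed and AWS credentials are configured; the function names, the default voice "Joanna" and the example bucket are placeholders of mine, not anything from the setup described above:

```python
from typing import Any, Dict, Tuple


def build_polly_task_request(text: str, bucket: str, sns_topic_arn: str = "",
                             voice_id: str = "Joanna") -> Dict[str, Any]:
    """Assemble keyword arguments for Polly's StartSpeechSynthesisTask call.

    The voice and the neural engine are assumptions; swap in whatever you use.
    """
    request: Dict[str, Any] = {
        "Engine": "neural",
        "OutputFormat": "mp3",
        "OutputS3BucketName": bucket,
        "Text": text,
        "VoiceId": voice_id,
    }
    # The SNS topic is optional, so only include it when one was given.
    if sns_topic_arn:
        request["SnsTopicArn"] = sns_topic_arn
    return request


def start_conversion(text: str, bucket: str, sns_topic_arn: str = "") -> Tuple[str, str]:
    """Start an asynchronous synthesis task; return (task id, output URI)."""
    import boto3  # third-party; imported here so the builder works without it

    polly = boto3.client("polly")
    response = polly.start_speech_synthesis_task(
        **build_polly_task_request(text, bucket, sns_topic_arn)
    )
    task = response["SynthesisTask"]
    return task["TaskId"], task["OutputUri"]
```

Calling `start_conversion(article_text, "my-tts-bucket")` would hand back the URI of the MP3 once the task finishes, which covers the "find it in the bucket" part; the nicer UI and the play button are still left as an exercise.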
And then from there, start implementing an SSML builder to modify the speed and prosody of different paragraphs and so on, but that's super far down the line.
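A first cut at such a builder doesn't have to be much. A minimal sketch using only the Python standard library, assuming each paragraph just gets a speaking rate; `build_ssml` and the (text, rate) pair format are made up for illustration:

```python
from typing import Iterable, Tuple
from xml.sax.saxutils import escape, quoteattr


def build_ssml(paragraphs: Iterable[Tuple[str, str]]) -> str:
    """Wrap (text, rate) pairs in <p> and <prosody> tags under one <speak> root."""
    parts = ["<speak>"]
    for text, rate in paragraphs:
        # escape() guards against &, < and > in the article text;
        # quoteattr() adds the quotes around the attribute value.
        parts.append(
            f"<p><prosody rate={quoteattr(rate)}>{escape(text)}</prosody></p>"
        )
    parts.append("</speak>")
    return "".join(parts)


if __name__ == "__main__":
    print(build_ssml([("Intro, read slowly.", "slow"), ("Q & A section.", "medium")]))
```

With Polly, the resulting string would go in as the text with `TextType` set to `ssml`. Which prosody attributes a given voice actually honours varies, so treat anything beyond `rate` as something to check against the docs.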
[1] https://www.freedomscientific.com/products/software/jaws/
Note: the text is chosen by the software to cover most sounds in the given language.
- https://speechelo-offer.com/
I think it uses the AWS API under the hood.
I know it is not Firefox, but Edge has it built-in and it works great! Highly recommended to try.
https://www.naturalreaders.com/
Google TTS
https://cloud.google.com/text-to-speech/
So far, the best TTS tools I have found.
I believe Whisper is speech-to-text only.