I've been researching voice-to-voice technologies and have made some attempts to train a model with Retrieval-Based Voice Conversion, but I couldn't complete the process because of software bugs and dependency problems I didn't have time to resolve.
I plan to make another attempt at training a model soon. Can anyone recommend practical advice or resources for this?
My personal use case is to replace the narrator's voice in a purchased audiobook with the author's voice, if possible. High-quality recordings of authors speaking are often available to train a model with. It can be very distracting and frustrating to listen to books read in accents and pronunciations that are unsuited to the material.
Thanks in advance
I can't answer the AI part, but:
The narrator creates what is called "Prosodie" in German: the prosody, the melody of the speech.
Extract it, and you have text and prosody as input for a synthetic voice of your own choice, including accents such as American or Australian...
What's needed is a text-to-speech engine that takes more than just text as input...
Christian, Dresden
Please excuse my tone.
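For anyone wanting to try the prosody idea from the reply, here is a rough sketch of what "extracting the melody" could look like. It only estimates the pitch (fundamental frequency) per frame with a plain autocorrelation, which is a stand-in technique chosen for brevity and not something the reply specifies. Pitch is also just one part of prosody; timing and loudness are ignored here. The function names, the 75-400 Hz search range and the 0.3 voicing threshold are all assumptions for illustration, and a dedicated pitch tracker would do much better on real narration.

```python
import math

def estimate_f0(frame, sample_rate, fmin=75.0, fmax=400.0):
    """Estimate the pitch of one frame in Hz, or None if it looks unvoiced.

    Hypothetical helper: picks the lag with the strongest autocorrelation
    inside the assumed speech range fmin..fmax.
    """
    n = len(frame)
    mean = sum(frame) / n
    x = [s - mean for s in frame]           # remove DC offset
    energy = sum(s * s for s in x)
    if energy == 0.0:
        return None                          # pure silence
    min_lag = int(sample_rate / fmax)
    max_lag = min(int(sample_rate / fmin), n - 1)
    best_lag, best_corr = 0, 0.0
    for lag in range(min_lag, max_lag + 1):
        corr = sum(x[i] * x[i + lag] for i in range(n - lag))
        if corr > best_corr:
            best_lag, best_corr = lag, corr
    # Assumed voicing threshold: weak periodicity is treated as unvoiced.
    if best_lag == 0 or best_corr / energy < 0.3:
        return None
    return sample_rate / best_lag

def pitch_contour(samples, sample_rate, frame_len=400, hop=200):
    """Slide over the signal and return one pitch estimate per frame."""
    contour = []
    for start in range(0, len(samples) - frame_len + 1, hop):
        contour.append(estimate_f0(samples[start:start + frame_len], sample_rate))
    return contour

# Usage with a synthetic 200 Hz tone instead of real speech.
sample_rate = 8000
tone = [math.sin(2 * math.pi * 200 * i / sample_rate) for i in range(1600)]
contour = pitch_contour(tone, sample_rate)
```

The resulting list of per-frame pitch values, paired with the text, is the kind of "text plus prosody" input the reply describes. Whether a given text-to-speech engine accepts such a contour depends on the engine.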