Since the AI voice is trained, shouldn't a reversing AI also be able to separate out the training data?
[0] https://github.com/RVC-Project/Retrieval-based-Voice-Conversion-WebUI

[1] https://elevenlabs.io/speech-synthesis
TTS is just text-to-speech, and there are many algorithms and tools for the reverse direction, speech-to-text (STT).
I think so. There's a whole field of voice biometrics working on this. I've experimented with such tools, and you have to work hard to copy someone's vocabulary, timing, and cadence. If you speak or sing in your normal voice and convert it, there are huge tells, somewhat akin to those used in stylometry to identify the owners of sockpuppet accounts (indeed, if someone actually used TTS, it mostly becomes a stylometry problem, unless services like ElevenLabs were to add inaudible watermarks or something).
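To make the stylometry analogy concrete, here's a toy sketch (my own illustration, not any real attribution tool) of one classic stylometric signal: the relative frequency of common function words. Real systems use far richer features, but the idea of comparing distance in a feature space is the same.

```python
# Toy stylometry sketch: compare texts by function-word frequencies.
# The word list and distance metric are illustrative choices, not a standard.
from collections import Counter

FUNCTION_WORDS = ["the", "of", "and", "to", "a", "in", "that", "is", "i", "it"]

def feature_vector(text):
    """Relative frequency of each function word in the text."""
    words = text.lower().split()
    counts = Counter(words)
    total = max(len(words), 1)
    return [counts[w] / total for w in FUNCTION_WORDS]

def style_distance(a, b):
    """L1 distance between feature vectors; lower means more similar style."""
    va, vb = feature_vector(a), feature_vector(b)
    return sum(abs(x - y) for x, y in zip(va, vb))

sample_a = "the cat sat on the mat and the dog watched it"
sample_b = "the cat is in the garden and it is quiet"
sample_c = "quantum flux capacitors invert tachyon streams rapidly"

print(style_distance(sample_a, sample_b))  # smaller: similar function-word use
print(style_distance(sample_a, sample_c))  # larger: no function words at all
```

With enough text, these distributions are surprisingly stable per author, which is exactly why converting your voice but keeping your own word choices and pacing still leaks identity.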