VALL-E X

[VALL-E X, which can synthesize Japanese, English, and Chinese speech in a voice that sounds just like the speaker from only three seconds of audio, is indeed a threat; I tried the OSS version of the technology that MS kept private and felt it firsthand (CloseBox) | TechnoEdge https://www.techno-edge.net/article/2023/08/28/1812.html]

Text to Speech (TTS) and voice cloning from audio given as a prompt (using VALL-E X, Python, and PyTorch, on Windows)

Try VALL-E-X with Orange Pi 5 | WASP Corporation


This page is auto-translated from [/nishio/VALL-E X](https://scrapbox.io/nishio/VALL-E%20X) using DeepL. If something looks interesting but the auto-translated English is not good enough to understand it, feel free to let me know at @nishio_en. I’m very happy to share my thoughts with non-Japanese readers.