RVC-Boss/GPT-SoVITS
One minute of voice data is enough to train a good TTS model (few-shot voice cloning).
Gradio WebUI for creators and developers, featuring key TTS (Edge-TTS, kokoro) and zero-shot Voice Cloning (E2 & F5-TTS, CosyVoice), with Whisper audio processing, YouTube download, Demucs vocal isolation, and multilingual translation.
A generative speech model for daily dialogue.
Real-time system audio translation for macOS — translate any audio (YouTube, podcasts, meetings) live on screen with OpenAI or Google Gemini. On-device speech recognition; optional low-latency realtime mode.
Clone a voice in 5 seconds to generate arbitrary speech in real time
The open-source AI voice studio. Clone, dictate, create.
Port of OpenAI's Whisper model in C/C++