Instant voice cloning model with tone color cloning, style control, and cross-lingual speech generation
- Stars: 36.7k
- Forks: 4.1k
- Open issues: 305
- License: MIT
- Python
- Jupyter Notebook

About OpenVoice
OpenVoice is an open-source instant voice cloning system that generates speech from a short reference audio clip. It clones the reference speaker's tone color, generates speech in multiple languages and accents, and supports zero-shot cross-lingual voice cloning, in which neither the language of the generated speech nor the language of the reference clip needs to appear in the training dataset.
The model provides voice style controls for emotion, accent, rhythm, pauses, and intonation. OpenVoice V2 improves audio quality through a different training strategy and adds native multilingual support for English, Spanish, French, Chinese, Japanese, and Korean.
OpenVoice has powered the instant voice cloning feature in myshell.ai since May 2023. Its main contributors come from MIT, Tsinghua University, and MyShell. OpenVoice V1 and V2 are released under the MIT License and are free for commercial and research use.
Key features
- Tone color cloning from short reference audio clips
- Speech generation in multiple languages and accents
- Emotion, accent, rhythm, pause, and intonation control
- Zero-shot cross-lingual voice cloning
- Native V2 support for English, Spanish, French, Chinese, Japanese, and Korean
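As a rough illustration of how the features above fit together, the sketch below models a cloning request and checks it against the languages and style controls listed in this description. `CloneRequest`, its fields, and its helper methods are hypothetical names chosen for illustration and are not part of the OpenVoice API; only the language and style lists come from the text above.

```python
from dataclasses import dataclass, field

# Languages this description lists as natively supported by OpenVoice V2.
NATIVE_V2_LANGUAGES = {"EN", "ES", "FR", "ZH", "JA", "KO"}

# Style attributes this description lists as controllable.
STYLE_CONTROLS = {"emotion", "accent", "rhythm", "pauses", "intonation"}


@dataclass
class CloneRequest:
    """Hypothetical request record; not an OpenVoice class."""
    reference_audio: str       # path to the short reference clip
    text: str                  # text to synthesize
    target_language: str       # language code of the generated speech
    reference_language: str    # language code spoken in the reference clip
    style: dict = field(default_factory=dict)  # e.g. {"emotion": "cheerful"}

    def is_native_target(self) -> bool:
        # True when the target language is in the native V2 list.
        return self.target_language.upper() in NATIVE_V2_LANGUAGES

    def is_cross_lingual(self) -> bool:
        # True when the generated language differs from the reference language.
        return self.target_language.upper() != self.reference_language.upper()

    def unknown_style_keys(self) -> list:
        # Style keys that are not among the controls named in the description.
        return sorted(set(self.style) - STYLE_CONTROLS)
```

A caller could build `CloneRequest("ref.wav", "Hola", "es", "en", {"emotion": "cheerful"})` and use the helpers to decide whether the request relies on native V2 language support or on cross-lingual cloning before handing it to whatever synthesis code is actually in use.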
Details
- First released: 2023
- License: MIT
- Commercial use: Free for commercial and research use
- Languages: EN · ES · FR · ZH · JA · KO
- Style control: Emotion · accent · rhythm · pauses
- Contributors: MIT · Tsinghua · MyShell
