OpenVoice

Instant voice cloning model with tone color cloning, style control, and cross-lingual speech generation

Repository activity
  • Stars: 36.7k
  • Forks: 4.1k
  • Open issues: 305
License

MIT

Languages
  • Python
  • Jupyter Notebook

About OpenVoice

OpenVoice is an open-source instant voice cloning system that generates speech from a short reference audio clip. It focuses on cloning the reference speaker's tone color, generating speech in multiple languages and accents, and supporting zero-shot cross-lingual voice cloning, in which neither the language of the generated speech nor the language of the reference clip needs to be present in the training dataset.

The model provides voice style controls for emotion, accent, rhythm, pauses, and intonation. OpenVoice V2 improves audio quality through a different training strategy and adds native multilingual support for English, Spanish, French, Chinese, Japanese, and Korean.

OpenVoice has powered the instant voice cloning feature in myshell.ai since May 2023. Its main contributors come from MIT, Tsinghua University, and MyShell. OpenVoice V1 and V2 are MIT licensed and free for commercial and research use.

Key features

  • Tone color cloning from short reference audio clips
  • Speech generation in multiple languages and accents
  • Emotion, accent, rhythm, pause, and intonation control
  • Zero-shot cross-lingual voice cloning
  • Native V2 support for English, Spanish, French, Chinese, Japanese, and Korean

Details

First released
2023
License
MIT
Commercial use
Free for commercial and research use
Languages
EN · ES · FR · ZH · JA · KO
Style control
Emotion · accent · rhythm · pauses · intonation
Contributors
MIT · Tsinghua · MyShell