Advanced text-to-speech library with pretrained models in 1100+ languages and voice cloning tools
MPL-2.0
- Python
- Jupyter Notebook
- HTML

About Coqui TTS
Coqui TTS is a library for advanced text-to-speech generation. It includes pretrained models for more than 1100 languages, plus tools for training new models and fine-tuning existing ones in any language. It also provides utilities for dataset analysis and curation.
It supports multi-speaker and multi-lingual synthesis and can be used through a Python API, a lower-level Synthesizer API, or a command line interface. Cloned voices can be cached under a custom speaker ID for reuse, and the voice conversion models include knnvc, OpenVoice v1, and OpenVoice v2. A TTS server can be run locally at localhost:5002, and Docker Compose is listed as a deployment option.
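As a rough illustration of talking to the local server, the sketch below builds a request URL and saves the returned audio. It assumes the server is already running on localhost:5002 and exposes an `/api/tts` endpoint that accepts `text`, `speaker_id`, and `language_id` query parameters; those names are assumptions to check against the project documentation, and the helper functions are hypothetical.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumption: a TTS server was started locally and listens on this address.
BASE_URL = "http://localhost:5002"


def build_tts_url(text, speaker_id="", language_id="", base_url=BASE_URL):
    """Build a GET URL for the (assumed) /api/tts endpoint."""
    params = {"text": text}
    # Only send the optional parameters when a value was given.
    if speaker_id:
        params["speaker_id"] = speaker_id
    if language_id:
        params["language_id"] = language_id
    return f"{base_url}/api/tts?{urlencode(params)}"


def synthesize_to_file(text, out_path, **options):
    """Request audio from the running server and write the raw bytes to disk."""
    with urlopen(build_tts_url(text, **options), timeout=60) as response:
        audio = response.read()
    with open(out_path, "wb") as handle:
        handle.write(audio)
    return out_path
```

With a server running, `synthesize_to_file("Hello from the server.", "server_output.wav")` would write the response to `server_output.wav`. Building the URL in a separate function keeps the request easy to inspect without a live server.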
Coqui TTS is a fork of the original TTS repository, which is no longer maintained, and is packaged as coqui-tts on PyPI. Prebuilt wheels are available for macOS, Windows, and Linux. The project also links to Docker images and to documentation covering GPU support and container setup.
Key features
- Pretrained text-to-speech models for 1100+ languages
- Train new models and fine-tune existing ones
- Multi-speaker and multi-lingual synthesis
- Voice cloning with cached cloned voices
- Voice conversion with knnvc and OpenVoice models
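The voice cloning and voice conversion features listed above can be sketched with the Python API. This is a minimal sketch, not the project's reference usage: it assumes the coqui-tts package is installed, that the model names below are available for download, and that a cloned voice cached under a speaker ID can be reused on later calls without the reference audio. The helper functions and file paths are hypothetical.

```python
# Assumed model names; check the project's model list before relying on them.
XTTS_MODEL = "tts_models/multilingual/multi-dataset/xtts_v2"
VC_MODEL = "voice_conversion_models/multilingual/multi-dataset/openvoice_v2"


def cloning_kwargs(text, out_path, speaker_id, reference_wavs=None, language="en"):
    """Collect keyword arguments for tts_to_file.

    reference_wavs is only passed when given, i.e. when the voice is first
    cloned; later calls are assumed to reuse the cached speaker_id.
    """
    kwargs = {
        "text": text,
        "speaker": speaker_id,
        "language": language,
        "file_path": out_path,
    }
    if reference_wavs:
        kwargs["speaker_wav"] = list(reference_wavs)
    return kwargs


def clone_voice(text, out_path, speaker_id, reference_wavs=None, language="en"):
    """Synthesize text in a cloned voice and return the output path."""
    from TTS.api import TTS  # imported lazily; requires the coqui-tts package

    tts = TTS(XTTS_MODEL)  # the first use downloads the model weights
    tts.tts_to_file(
        **cloning_kwargs(text, out_path, speaker_id, reference_wavs, language)
    )
    return out_path


def convert_voice(source_wav, target_wav, out_path, model_name=VC_MODEL):
    """Re-render source_wav in the voice heard in target_wav."""
    from TTS.api import TTS  # imported lazily; requires the coqui-tts package

    tts = TTS(model_name)
    tts.voice_conversion_to_file(
        source_wav=source_wav, target_wav=target_wav, file_path=out_path
    )
    return out_path
```

A first call such as `clone_voice("Hello world", "out.wav", "MySpeaker1", reference_wavs=["reference.wav"])` would clone and cache the voice; a later call could then omit `reference_wavs` and pass only the same speaker ID.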
Details
- First released: 2023
- Platforms: Windows · macOS · Linux
- Deployment: self-hostable · docker
- API: Python · CLI
- Model types: TTS · voice conversion
- Languages: 1100+
