Coqui TTS

Advanced text-to-speech library with pretrained models in 1100+ languages and voice cloning tools

Repository activity
  • Stars: 2.3k
  • Forks: 286
  • Open Issues: 16
License

MPL-2.0

Languages
  • Python
  • Jupyter Notebook
  • HTML

About Coqui TTS

Coqui TTS is a library for advanced text-to-speech generation. It includes pretrained models for more than 1100 languages, plus tools for training new models and fine-tuning existing ones in any language. It also provides utilities for dataset analysis and curation.

It supports multi-speaker and multilingual synthesis through a Python API, a lower-level Synthesizer API, and a command line interface. Voice cloning can cache a cloned voice under a custom speaker ID so it can be reused later, and the voice conversion models include knnvc, OpenVoice v1, and OpenVoice v2. A TTS server can be run locally and is reachable at localhost:5002, and Docker Compose is listed as a deployment option.
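As a rough illustration of the Python API and the cached-voice workflow described above, the following is a minimal sketch. It assumes the `coqui-tts` package is installed and uses the XTTS v2 model name as it appears in the project README. The helper `clone_kwargs`, the speaker ID `MySpeaker1`, and all file paths are hypothetical names chosen for this example.

```python
from typing import Dict, Optional, Sequence


def clone_kwargs(
    text: str,
    language: str,
    out_path: str,
    speaker_id: str,
    reference_wavs: Optional[Sequence[str]] = None,
) -> Dict[str, object]:
    """Build keyword arguments for TTS.tts_to_file (hypothetical helper).

    When reference_wavs is given, the voice is cloned from those clips and
    stored under speaker_id. When it is omitted, only the speaker ID is
    passed, so a previously cached voice is expected to be reused.
    """
    kwargs: Dict[str, object] = {
        "text": text,
        "language": language,
        "file_path": out_path,
        "speaker": speaker_id,
    }
    if reference_wavs:
        kwargs["speaker_wav"] = list(reference_wavs)
    return kwargs


def synthesize_with_cached_voice() -> None:
    """Clone a voice once, then reuse it by speaker ID (needs coqui-tts)."""
    # Imported lazily so the helper above works without the heavy dependencies.
    import torch
    from TTS.api import TTS

    device = "cuda" if torch.cuda.is_available() else "cpu"
    # Assumption: model name as listed in the project README.
    tts = TTS("tts_models/multilingual/multi-dataset/xtts_v2").to(device)

    # First call: clone from a reference clip and cache under a custom ID.
    tts.tts_to_file(
        **clone_kwargs("Hello world!", "en", "first.wav", "MySpeaker1",
                       ["my/cloning/audio.wav"])
    )
    # Later call: no reference audio, only the cached speaker ID.
    tts.tts_to_file(**clone_kwargs("Hello again!", "en", "second.wav", "MySpeaker1"))
```

Calling `synthesize_with_cached_voice()` downloads the model on first use, so it is left uncalled here. Separating the argument building from the model call keeps the cloning logic easy to inspect without a GPU or a model download.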

Coqui TTS is a fork of the original unmaintained TTS repository and is packaged as coqui-tts on PyPI. Prebuilt wheels are available for macOS, Windows, and Linux. The project also links to Docker images and documentation for GPU support and container setup.
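Because the PyPI distribution name (`coqui-tts`) differs from the name of the original project, it can be useful to check which distribution is actually installed. The sketch below uses only the Python standard library; the function name `installed_version` is a hypothetical helper for this example.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(dist_name: str = "coqui-tts") -> Optional[str]:
    """Return the installed version of a distribution, or None if absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_version()
    if found is None:
        print("coqui-tts is not installed")
    else:
        print(f"coqui-tts {found} is installed")
```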

Key features

  • Pretrained text-to-speech models for 1100+ languages
  • Train new models and fine-tune existing ones
  • Multi-speaker and multi-lingual synthesis
  • Voice cloning with cached cloned voices
  • Voice conversion with knnvc and OpenVoice models
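The voice conversion feature listed above can be sketched as follows. This assumes the `coqui-tts` package is installed and that the model identifiers in `VC_MODELS` match those in the project's model list; they are written from memory of the README and should be verified against the installed version. `convert_voice` and the file paths are hypothetical names for this example.

```python
from typing import Dict

# Assumption: identifiers as listed in the project README; verify before use.
VC_MODELS: Dict[str, str] = {
    "knnvc": "voice_conversion_models/multilingual/multi-dataset/knnvc",
    "openvoice_v1": "voice_conversion_models/multilingual/multi-dataset/openvoice_v1",
    "openvoice_v2": "voice_conversion_models/multilingual/multi-dataset/openvoice_v2",
}


def convert_voice(
    source_wav: str,
    target_wav: str,
    out_path: str,
    model: str = "openvoice_v2",
) -> str:
    """Render the speech in source_wav using the voice heard in target_wav."""
    if model not in VC_MODELS:
        raise ValueError(f"unknown model {model!r}; choose from {sorted(VC_MODELS)}")

    # Imported lazily so the lookup table can be used without the library.
    from TTS.api import TTS

    tts = TTS(VC_MODELS[model])
    tts.voice_conversion_to_file(
        source_wav=source_wav,
        target_wav=target_wav,
        file_path=out_path,
    )
    return out_path
```

A call such as `convert_voice("my/source.wav", "my/target.wav", "converted.wav")` would download the chosen model on first use.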

Details

First released: 2023
Platforms: Windows · macOS · Linux
Deployment: self-hostable · docker
API: Python · CLI
Model types: TTS · voice conversion
Languages: 1100+