Coqui TTS

Advanced text-to-speech library with pretrained models in 1100+ languages and voice cloning tools

Repository activity

Stars: 2.3k
Forks: 286
Open Issues: 16


License

MPL-2.0

Languages

Python
Jupyter Notebook
HTML

Get it: Website · GitHub · PyPI

About Coqui TTS

Coqui TTS is a library for advanced text-to-speech generation. It includes pretrained models for more than 1100 languages, plus tools for training new models and fine-tuning existing ones in any language. It also provides utilities for dataset analysis and curation.

It supports multi-speaker and multi-lingual synthesis through a Python API, a lower-level Synthesizer API, and a command line interface. Cloned voices can be cached under a custom speaker ID and reused in later calls. Voice conversion models include knnvc, OpenVoice v1, and OpenVoice v2. A TTS server can be run locally, where it is served at localhost:5002, and Docker Compose is listed as a deployment option.
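
As a rough illustration of talking to the local server, the sketch below builds a request URL and saves the returned audio using only the Python standard library. The /api/tts path and the text, speaker_id, and language_id parameter names are assumptions about the server's HTTP interface and should be checked against the project documentation. The helper names build_tts_url and synthesize_via_server are hypothetical.

```python
from urllib.parse import urlencode
from urllib.request import urlopen


def build_tts_url(text, host="localhost", port=5002, speaker_id=None, language_id=None):
    """Build a GET URL for a locally running TTS server.

    Assumption: the endpoint path "/api/tts" and the query parameter
    names are not confirmed by this page; verify them in the docs.
    """
    params = {"text": text}
    if speaker_id is not None:
        params["speaker_id"] = speaker_id
    if language_id is not None:
        params["language_id"] = language_id
    # urlencode escapes spaces and special characters in the text
    return f"http://{host}:{port}/api/tts?{urlencode(params)}"


def synthesize_via_server(text, out_path, **url_options):
    """Request speech from the server and write the response body to a file."""
    with urlopen(build_tts_url(text, **url_options)) as response:
        audio_bytes = response.read()
    with open(out_path, "wb") as audio_file:
        audio_file.write(audio_bytes)
    return out_path
```

With a server already running, synthesize_via_server("Hello world", "hello.wav") would be expected to write the synthesized audio to hello.wav. Keeping the URL construction separate makes it easy to inspect the request before sending it.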

Coqui TTS is a fork of the original TTS repository, which is no longer maintained, and is packaged as coqui-tts on PyPI. Prebuilt wheels are available for macOS, Windows, and Linux. The project also links to Docker images and to documentation for GPU support and container setup.

Key features

Pretrained text-to-speech models for 1100+ languages
Train new models and fine-tune existing ones
Multi-speaker and multi-lingual synthesis
Voice cloning with cached cloned voices
Voice conversion with knnvc and OpenVoice models
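
The voice cloning and voice conversion features listed above might be used through the Python API roughly as follows. This is a minimal sketch, not the project's reference usage: it assumes the TTS.api.TTS class with tts_to_file and voice_conversion_to_file methods, and that passing both speaker_wav and speaker caches the cloned voice under that ID. The wrapper function names are hypothetical.

```python
def load_model(model_name, device="cpu"):
    """Load a pretrained model by name (requires the coqui-tts package)."""
    # Imported lazily so the helpers below can be used without the package
    from TTS.api import TTS
    return TTS(model_name).to(device)


def clone_and_cache(tts, text, reference_wavs, speaker_id, language, out_path):
    """Clone a voice from reference audio and cache it under speaker_id.

    Assumption: giving both speaker_wav and speaker stores the cloned
    voice so later calls can refer to it by ID alone.
    """
    tts.tts_to_file(
        text=text,
        speaker_wav=reference_wavs,
        speaker=speaker_id,
        language=language,
        file_path=out_path,
    )
    return out_path


def speak_with_cached_voice(tts, text, speaker_id, language, out_path):
    """Reuse a previously cached voice; no reference audio is passed."""
    tts.tts_to_file(text=text, speaker=speaker_id, language=language, file_path=out_path)
    return out_path


def convert_voice(vc_model, source_wav, target_wav, out_path):
    """Re-render the speech in source_wav with the voice heard in target_wav."""
    vc_model.voice_conversion_to_file(
        source_wav=source_wav,
        target_wav=target_wav,
        file_path=out_path,
    )
    return out_path
```

A typical session would call load_model once with a cloning-capable model name (for example "tts_models/multilingual/multi-dataset/xtts_v2", an assumed identifier), then clone_and_cache with a speaker ID such as "MySpeaker1", and afterwards speak_with_cached_voice with the same ID. For conversion, load_model would instead receive a voice conversion model name, which should be taken from the project's model list.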

Details

First released: 2023
Platforms: Windows · macOS · Linux
Deployment: self-hostable · docker
API: Python · CLI
Model types: TTS · voice conversion
Languages: 1100+