Open-source PyTorch toolkit for speech and text processing across conversational AI tasks
Apache-2.0
- Python
- Perl
- MATLAB

About SpeechBrain
SpeechBrain is a PyTorch toolkit for building conversational AI systems such as speech assistants and chatbots, along with general speech and text processing workflows. It supports speech recognition, speaker recognition, speech enhancement, speech separation, and language modeling.
It ships over 200 training recipes on more than 40 datasets across 20 speech and text processing tasks. You can train from scratch or fine-tune pretrained models, and Hugging Face models can be plugged in for training and inference. The toolkit also provides an inference interface for transcribing audio from Python.
SpeechBrain is written in Python and can be installed with pip. Its recipes are ready to run for experiments and model training. The toolkit is licensed under Apache-2.0, and pretrained models are published on Hugging Face.
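As a rough illustration of the Python inference interface mentioned above, the sketch below transcribes one audio file with a pretrained model pulled from Hugging Face. It assumes SpeechBrain is installed (`pip install speechbrain`) and that network access is available on first use. The model id `speechbrain/asr-crdnn-rnnlm-librispeech` is just one published example, and the helper names `default_savedir` and `transcribe` are hypothetical, not part of SpeechBrain.

```python
from pathlib import Path


def default_savedir(source: str, root: str = "pretrained_models") -> str:
    """Derive a local cache directory from a model id.

    This naming scheme is our own convention for the example,
    not something SpeechBrain requires.
    """
    return (Path(root) / source.split("/")[-1]).as_posix()


def transcribe(
    wav_path: str,
    source: str = "speechbrain/asr-crdnn-rnnlm-librispeech",  # example model id
) -> str:
    """Transcribe a single audio file with a pretrained encoder-decoder ASR model."""
    # Imported lazily so this module still loads when SpeechBrain is absent.
    try:
        from speechbrain.inference.ASR import EncoderDecoderASR  # SpeechBrain 1.0+
    except ImportError:
        from speechbrain.pretrained import EncoderDecoderASR  # older releases

    # Downloads the model files on first call and reuses the local copy afterwards.
    asr = EncoderDecoderASR.from_hparams(
        source=source,
        savedir=default_savedir(source),
    )
    return asr.transcribe_file(wav_path)
```

Calling `transcribe("example.wav")` with a path to your own recording should return the recognized text as a string. Other pretrained models may need a different interface class, so check the model card before swapping the `source` value.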
Key features
- Over 200 training recipes across speech and text tasks
- Training from scratch or fine-tuning pretrained models
- Hugging Face models can be plugged in for training and inference
- Python inference interface for ASR transcription
- Speech recognition, enhancement, separation, and language modeling
Details
- First released: 2020
- License: Apache-2.0
- Training recipes: Over 200
- Datasets: More than 40
- Tasks: 20 speech and text tasks
- Language: Python
