SpeechBrain

Open-source PyTorch toolkit for speech and text processing across conversational AI tasks

Repository activity

Stars: 11.6k
Forks: 1.7k
Open Issues: 184

License

Apache-2.0

Languages

Python
Perl
MATLAB

Get it: Website, GitHub

About SpeechBrain

SpeechBrain is a PyTorch toolkit for building conversational AI systems. It covers speech assistants, chatbots, and speech and text processing workflows, with support for speech recognition, speaker recognition, speech enhancement, speech separation, and language modeling.

It ships over 200 training recipes on more than 40 datasets across 20 speech and text processing tasks. You can train from scratch or fine-tune pretrained models, and Hugging Face models can be plugged in for training and inference. The toolkit also provides an inference interface for transcribing audio from Python.

SpeechBrain is written in Python and installs with pip, with recipes ready to run for experiments and model training. It is licensed under Apache-2.0, and pretrained models are published on Hugging Face.
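
As a rough illustration of the inference interface mentioned above, here is a minimal sketch. It assumes SpeechBrain 1.0 or later has been installed with `pip install speechbrain` (earlier releases exposed the same class under `speechbrain.pretrained`). The model ID is one of the pretrained ASR models published on Hugging Face and is picked here only as an example; the `transcribe` helper, the `DEFAULT_MODEL` constant, and the `savedir` location are hypothetical names introduced for this sketch, not part of the toolkit.

```python
# Assumption: an example pretrained ASR model hosted on Hugging Face.
DEFAULT_MODEL = "speechbrain/asr-crdnn-rnnlm-librispeech"


def transcribe(wav_path, model_source=DEFAULT_MODEL, savedir="pretrained_models/asr"):
    """Transcribe one audio file with a pretrained SpeechBrain ASR model.

    The import happens inside the function so that this module can be
    loaded even on machines where SpeechBrain is not installed.
    """
    from speechbrain.inference.ASR import EncoderDecoderASR

    # Downloads the model files into `savedir` on first use, then reuses them.
    asr_model = EncoderDecoderASR.from_hparams(source=model_source, savedir=savedir)

    # transcribe_file loads the audio and returns the decoded text.
    return asr_model.transcribe_file(str(wav_path))
```

Calling `transcribe("example.wav")` on a local recording would download the model on first use and return the recognized text. Swapping `model_source` for another compatible pretrained ASR model ID should work the same way, though each model card lists its own expected sample rate and interface class.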

Key features

Over 200 training recipes across speech and text tasks
Training from scratch or fine-tuning pretrained models
Hugging Face model integration for training and inference
Python inference interface for ASR transcription
Speech recognition, speaker recognition, speech enhancement, speech separation, and language modeling

Details

First released: 2020
License: Apache-2.0
Training recipes: Over 200
Datasets: More than 40
Tasks: 20 speech and text tasks
Language: Python