SpeechBrain

Open-source PyTorch toolkit for speech and text processing across conversational AI tasks

Repository activity
  • Stars: 11.6k
  • Forks: 1.7k
  • Open issues: 184
License

Apache-2.0

Languages
  • Python
  • Perl
  • MATLAB

About SpeechBrain

SpeechBrain is a PyTorch toolkit for building conversational AI systems. It covers speech assistants, chatbots, and speech and text processing workflows, with support for speech recognition, speaker recognition, speech enhancement, speech separation, and language modeling.

It ships over 200 training recipes on more than 40 datasets across 20 speech and text processing tasks. You can train from scratch or fine-tune pretrained models, and Hugging Face models can be plugged in for training and inference. The toolkit also provides an inference interface for transcribing audio from Python.

SpeechBrain is written in Python and installs with pip. Its recipes are ready to run for experiments and model training. The toolkit is licensed under Apache-2.0, and pretrained models are published on Hugging Face.
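
As a rough illustration of the inference interface, the sketch below transcribes a single audio file with a pretrained model pulled from Hugging Face. It assumes SpeechBrain 1.0 or later (installed with `pip install speechbrain`), where `EncoderDecoderASR` is importable from `speechbrain.inference.ASR`. The model identifier, the `savedir` location, and the `transcribe` helper itself are illustrative choices, not something this page specifies.

```python
from pathlib import Path


def transcribe(
    wav_path,
    source="speechbrain/asr-crdnn-rnnlm-librispeech",  # assumed example model on Hugging Face
    savedir="pretrained_models/asr-crdnn-rnnlm-librispeech",  # assumed local cache directory
):
    """Transcribe one audio file with a pretrained SpeechBrain ASR model.

    `transcribe` is a hypothetical convenience wrapper, not part of SpeechBrain.
    """
    # Fail early on a bad path, before any model download is attempted.
    if not Path(wav_path).is_file():
        raise FileNotFoundError(wav_path)

    # Imported lazily so this module can be loaded without SpeechBrain installed.
    # Assumes SpeechBrain >= 1.0; older releases exposed the class under
    # speechbrain.pretrained instead.
    from speechbrain.inference.ASR import EncoderDecoderASR

    # Downloads the model files on first use and caches them in `savedir`.
    asr_model = EncoderDecoderASR.from_hparams(source=source, savedir=savedir)
    return asr_model.transcribe_file(str(wav_path))


# Example call (the file name is a placeholder):
# print(transcribe("example.wav"))
```

The first call downloads the model, so it needs network access; later calls reuse the files cached in `savedir`. Swapping `source` for another ASR model published by the project should work as long as that model is meant to be loaded with `EncoderDecoderASR`.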

Key features

  • Over 200 training recipes across speech and text tasks
  • Training from scratch or fine-tuning pretrained models
  • Hugging Face models can be plugged in for training and inference
  • Python inference interface for ASR transcription
  • Speech recognition, speaker recognition, speech enhancement, speech separation, and language modeling

Details

First released: 2020
License: Apache-2.0
Training recipes: Over 200
Datasets: More than 40
Tasks: 20 speech and text tasks
Language: Python