SpeechBrain - conversational AI toolkit

SpeechBrain is an all-in-one conversational AI toolkit based on PyTorch.

The goal is to create a single, flexible, and user-friendly toolkit that can be used to easily develop state-of-the-art speech technologies, including systems for speech recognition, speaker recognition, speech enhancement, speech separation, language identification, multi-microphone signal processing, and many others.

SpeechBrain supports both CPU and GPU computations. For most recipes, however, a GPU is necessary during training. Please note that CUDA must be properly installed to use GPUs.

This is free and open source software.

SpeechBrain supports state-of-the-art methods for end-to-end speech recognition:

Support for the wav2vec 2.0 pretrained model with fine-tuning.
Performance that is state-of-the-art, or comparable with other existing toolkits, on several ASR benchmarks.
Easily customizable neural language models, including RNNLM and TransformerLM. The project shares several pre-trained models that you can easily use. Hugging Face datasets are supported to simplify training on large text corpora.
Hybrid CTC/Attention end-to-end ASR:
- Many available encoders: CRDNN (VGG + {LSTM,GRU,LiGRU} + DNN), ResNet, SincNet, vanilla transformers, ContextNet-based transformers or conformers. Thanks to the flexibility of SpeechBrain, any fully customized encoder can be connected to the CTC/attention decoder and trained with a few hours of work. The decoder is fully customizable too: LSTM, GRU, LiGRU, transformer, or your own neural network!
- Optimised and fast beam search on both CPUs and GPUs.
Transducer end-to-end ASR with both a custom Numba loss and the torchaudio one. Any encoder or decoder can be plugged into the transducer, ranging from VGG+RNN+DNN to conformers.
Pre-trained ASR models for transcribing an audio file or extracting features for a downstream task.
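
The hybrid CTC/attention idea mentioned above can be illustrated with a small, toolkit-independent sketch. It assumes the common formulation in which each candidate transcription receives a log-probability from the attention decoder and another from the CTC branch, and the two are interpolated with a weight. The function names, the weight of 0.3, and the scores are hypothetical values chosen for illustration; this is not SpeechBrain's own code or API.

```python
def joint_score(att_logp, ctc_logp, ctc_weight=0.3):
    """Interpolate attention and CTC log-probabilities for one hypothesis.

    ctc_weight = 0 uses only the attention decoder,
    ctc_weight = 1 uses only the CTC branch.
    """
    if not 0.0 <= ctc_weight <= 1.0:
        raise ValueError("ctc_weight must lie in [0, 1]")
    return ctc_weight * ctc_logp + (1.0 - ctc_weight) * att_logp


def rescore(hypotheses, ctc_weight=0.3):
    """Sort (text, att_logp, ctc_logp) tuples best-first by joint score."""
    return sorted(
        hypotheses,
        key=lambda h: joint_score(h[1], h[2], ctc_weight),
        reverse=True,
    )


# Hypothetical scores: the attention decoder slightly prefers the wrong
# transcription, while the CTC branch penalises it more heavily.
candidates = [
    ("the cat sat", -2.0, -6.0),
    ("the cat sad", -1.5, -9.0),
]

best = rescore(candidates, ctc_weight=0.3)[0][0]
print(best)
```

With these made-up numbers the attention score alone would rank "the cat sad" first, whereas the interpolated score ranks "the cat sat" first. In practice the same weighting is usually applied inside beam search rather than as a final re-ranking step.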

Website: speechbrain.github.io
Support: github.com/speechbrain/speechbrain
Developer: Mirco Ravanelli, Titouan Parcollet and contributors
License: Apache License 2.0

SpeechBrain is written in Python.

Related Software

Speech Recognition Tools
Whisper	Automatic speech recognition (ASR) system trained on 680,000 hours of data
Flashlight	Fast, flexible machine learning library written entirely in C++.
Coqui STT	Deep-learning toolkit for training and deploying speech-to-text models
Kaldi	C++ toolkit designed for speech recognition researchers.
SpeechBrain	All-in-one conversational AI toolkit based on PyTorch
Handy	Offline speech-to-text application
ESPnet	End-to-End speech processing toolkit
deepspeech.pytorch	Implementation of DeepSpeech2 using Baidu Warp-CTC.
Whispering	Transcription application with global speech-to-text functionality
Julius	Two-pass large vocabulary continuous speech recognition engine
CMUSphinx	Speech recognition system for mobile and server applications
Simon	Flexible speech recognition software
hyprwhspr	Native speech-to-text designed for Arch / Omarchy
ostt	Open Speech-to-Text
DeepSpeech	TensorFlow implementation of Baidu's DeepSpeech architecture.
OpenSeq2Seq	TensorFlow-based toolkit for sequence-to-sequence models
Eesen	End-to-End Speech Recognition
