Voice Recognition

SpeechBrain – conversational AI toolkit

SpeechBrain is an all-in-one conversational AI toolkit based on PyTorch.

The goal is to create a single, flexible, and user-friendly toolkit that can be used to easily develop state-of-the-art speech technologies, including systems for speech recognition, speaker recognition, speech enhancement, speech separation, language identification, multi-microphone signal processing, and many others.

SpeechBrain supports both CPU and GPU computations. For most recipes, however, a GPU is necessary during training. Please note that CUDA must be properly installed to use GPUs.
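
As a minimal sketch of the CPU/GPU choice described above, the snippet below picks a device string using PyTorch's `torch.cuda.is_available()`. The helper name `pick_device` is hypothetical and not part of SpeechBrain; the lazy import is only there so the sketch also runs on a machine without PyTorch.

```python
def pick_device(prefer_gpu: bool = True) -> str:
    """Return "cuda" when PyTorch can see a CUDA GPU, otherwise "cpu".

    Hypothetical helper for illustration; not a SpeechBrain function.
    """
    if not prefer_gpu:
        return "cpu"
    try:
        # Imported lazily so the sketch still runs where PyTorch is absent.
        import torch
    except ImportError:
        return "cpu"
    # True only when a CUDA-capable GPU and a working CUDA setup are found.
    return "cuda" if torch.cuda.is_available() else "cpu"


device = pick_device()
print(f"Running on: {device}")
```

If this reports "cpu" on a machine with an NVIDIA GPU, the CUDA installation is the first thing to check.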

This is free and open source software.

SpeechBrain supports state-of-the-art methods for end-to-end speech recognition:

  • Support for pretrained wav2vec 2.0 models with fine-tuning.
  • Performance that is state-of-the-art, or comparable with other existing toolkits, on several ASR benchmarks.
  • Easily customizable neural language models, including RNNLM and TransformerLM. The project shares several pre-trained models that you can easily use. Hugging Face datasets are supported to facilitate training on large text datasets.
  • Hybrid CTC/Attention end-to-end ASR:
    • Many available encoders: CRDNN (VGG + {LSTM,GRU,LiGRU} + DNN), ResNet, SincNet, vanilla transformers, ContextNet-based transformers, or conformers. Thanks to the flexibility of SpeechBrain, any fully customized encoder can be connected to the CTC/attention decoder and trained with only a few hours of work. The decoder is also fully customizable: LSTM, GRU, LiGRU, transformer, or your own neural network.
    • Optimised and fast beam search on both CPUs and GPUs.
  • Transducer end-to-end ASR with either a custom Numba loss or the torchaudio one. Any encoder or decoder can be plugged into the transducer, from VGG+RNN+DNN to conformers.
  • Pre-trained ASR models for transcribing an audio file or extracting features for a downstream task.
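
To illustrate the hybrid CTC/attention idea from the list above, here is a minimal pure-Python sketch of how two scores for the same hypothesis are commonly combined during decoding: a weighted sum of the CTC and attention log-probabilities. The function names, the `ctc_weight` parameter, and the toy scores are assumptions made for illustration; this is not SpeechBrain's actual beam search.

```python
import math


def joint_score(ctc_logp: float, att_logp: float, ctc_weight: float) -> float:
    """Interpolate CTC and attention log-probabilities for one hypothesis."""
    if not 0.0 <= ctc_weight <= 1.0:
        raise ValueError("ctc_weight must lie in [0, 1]")
    return ctc_weight * ctc_logp + (1.0 - ctc_weight) * att_logp


def rescore(hypotheses, ctc_weight):
    """Sort (text, ctc_logp, att_logp) tuples by joint score, best first."""
    return sorted(
        hypotheses,
        key=lambda h: joint_score(h[1], h[2], ctc_weight),
        reverse=True,
    )


# Toy hypotheses with made-up probabilities (illustrative only).
hyps = [
    ("the cat sat", math.log(0.5), math.log(0.25)),
    ("the cat sad", math.log(0.125), math.log(0.5)),
]

# Attention only: the second hypothesis has the higher attention score.
print(rescore(hyps, ctc_weight=0.0)[0][0])  # the cat sad
# Equal weighting: the CTC score tips the ranking to the first hypothesis.
print(rescore(hyps, ctc_weight=0.5)[0][0])  # the cat sat
```

The weight trades off the two models: values near 0 trust the attention decoder, values near 1 trust CTC, and the toy example shows the ranking changing between the two settings.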

Website: speechbrain.github.io
Support: github.com/speechbrain/speechbrain
Developer: Mirco Ravanelli, Titouan Parcollet and contributors
License: Apache License 2.0

SpeechBrain is written in Python. Learn Python with our recommended free books and free tutorials.


Related Software

Speech Recognition Tools
Whisper – Automatic speech recognition (ASR) system trained on 680,000 hours of data
Flashlight – Fast, flexible machine learning library written entirely in C++
Coqui STT – Deep-learning toolkit for training and deploying speech-to-text models
Kaldi – C++ toolkit designed for speech recognition researchers
SpeechBrain – All-in-one conversational AI toolkit based on PyTorch
Handy – Offline speech-to-text application
ESPnet – End-to-End speech processing toolkit
deepspeech.pytorch – Implementation of DeepSpeech2 using Baidu Warp-CTC
Whispering – Transcription application with global speech-to-text functionality
Julius – Two-pass large vocabulary continuous speech recognition engine
CMUSphinx – Speech recognition system for mobile and server applications
Simon – Flexible speech recognition software
hyprwhspr – Native speech-to-text designed for Arch / Omarchy
ostt – Open Speech-to-Text
DeepSpeech – TensorFlow implementation of Baidu's DeepSpeech architecture
OpenSeq2Seq – TensorFlow-based toolkit for sequence-to-sequence models
Eesen – End-to-End Speech Recognition

Read our verdict in the software roundup.

