Press shortcut → speak → get text. Desktop transcription that cuts out the middleman.
The Linux Portal Site
sherpa-onnx is software for speech-to-text, text-to-speech, speaker diarization, and voice activity detection (VAD), using next-gen Kaldi with onnxruntime.
Simon is open source speech recognition software which aims to be flexible and highly customizable.
Eesen aims to simplify the existing complicated, expertise-intensive ASR pipeline into a straightforward sequence learning problem.
CMUSphinx (Sphinx) is a collective term to describe a group of speech recognition systems developed at Carnegie Mellon University.
Julius is a high-performance, two-pass large vocabulary continuous speech recognition (LVCSR) engine. It supports N-gram based dictation.
OpenSeq2Seq is a toolkit for distributed and mixed precision training of sequence-to-sequence models.
DeepSpeech is an open source Speech-To-Text engine, using a model trained by machine learning techniques.
deepspeech.pytorch is an implementation of DeepSpeech2 using Baidu Warp-CTC. It creates a network based on the DeepSpeech2 architecture.
ESPnet is an end-to-end speech processing toolkit that focuses mainly on end-to-end speech recognition and end-to-end text-to-speech.
Kaldi is a state-of-the-art speech recognition toolkit written in C++. It’s intended to be used mainly for acoustic modelling research.
SpeechBrain is an all-in-one conversational AI toolkit based on PyTorch. It is free and open source software written in Python.
Flashlight is a fast, flexible machine learning library written entirely in C++. It provides apps for research across multiple domains.