CMUSphinx - Open Source Speech Recognition System for Mobile and Server Applications

CMUSphinx (Sphinx) is the collective name for a group of speech recognition systems developed at Carnegie Mellon University.

CMUSphinx contains a number of packages for different tasks and applications:

Pocketsphinx — a lightweight speech recognition engine, specifically tuned for handheld and mobile devices, written in C.
Sphinxbase — contains the basic libraries shared by the CMU Sphinx trainer and all the Sphinx decoders (Sphinx-II, Sphinx-III, and PocketSphinx), as well as some common utilities for manipulating acoustic feature and audio files.
Sphinx4 — a state-of-the-art, speaker-independent, continuous speech recognition system written in the Java programming language. The design of Sphinx-4 is based on patterns that have emerged from the design of past systems as well as new requirements based on areas that researchers currently want to explore. To exercise this framework, and to provide researchers with a “research-ready” system, Sphinx-4 also includes several implementations of both simple and state-of-the-art techniques.
Sphinxtrain — Carnegie Mellon University’s open source acoustic model trainer.

Key Features

State-of-the-art algorithms for efficient speech recognition. CMUSphinx tools are designed specifically for low-resource platforms.
A flexible design.
Focuses on practical application development and not on research.
Wide range of tools for speech recognition tasks (keyword spotting, alignment, pronunciation evaluation).
Support for several languages including English, French, Mandarin, German, Dutch, Russian, and the ability to build models for other languages.

Website: cmusphinx.github.io
Support: FAQ, GitHub
Developer: Many contributors
License: BSD-like license

Sphinx4 is written in Java; Pocketsphinx is written in C.

Related Software

Speech Recognition Tools
Whisper	Automatic speech recognition (ASR) system trained on 680,000 hours of data
Flashlight	Fast, flexible machine learning library written entirely in C++.
Coqui STT	Deep-learning toolkit for training and deploying speech-to-text models
Kaldi	C++ toolkit designed for speech recognition researchers.
SpeechBrain	All-in-one conversational AI toolkit based on PyTorch
Handy	Offline speech-to-text application
ESPnet	End-to-End speech processing toolkit
deepspeech.pytorch	Implementation of DeepSpeech2 using Baidu Warp-CTC.
Whispering	Transcription application with global speech-to-text functionality
Julius	Two-pass large vocabulary continuous speech recognition engine
CMUSphinx	Speech recognition system for mobile and server applications
Simon	Flexible speech recognition software
hyprwhspr	Native speech-to-text designed for Arch / Omarchy
ostt	Open Speech-to-Text
DeepSpeech	TensorFlow implementation of Baidu's DeepSpeech architecture.
OpenSeq2Seq	TensorFlow-based toolkit for sequence-to-sequence models
Eesen	End-to-End Speech Recognition

