Voice Recognition

CMUSphinx – Open Source Speech Recognition System for Mobile and Server Applications

CMUSphinx (Sphinx) is the collective name for a group of speech recognition systems developed at Carnegie Mellon University.

CMUSphinx contains a number of packages for different tasks and applications:

  • Pocketsphinx — a lightweight speech recognition engine, specifically tuned for handheld and mobile devices, written in C.
  • Sphinxbase — contains the basic libraries shared by the CMU Sphinx trainer and all the Sphinx decoders (Sphinx-II, Sphinx-III, and PocketSphinx), as well as some common utilities for manipulating acoustic feature and audio files.
  • Sphinx4 — a state-of-the-art, speaker-independent, continuous speech recognition system written in the Java programming language. The design of Sphinx-4 is based on patterns that have emerged from the design of past systems as well as new requirements based on areas that researchers currently want to explore. To exercise this framework, and to provide researchers with a “research-ready” system, Sphinx-4 also includes several implementations of both simple and state-of-the-art techniques.
  • Sphinxtrain — Carnegie Mellon University’s open source acoustic model trainer.

Key Features

  • State-of-the-art algorithms for efficient speech recognition. CMUSphinx tools are designed specifically for low-resource platforms.
  • A flexible design.
  • Focuses on practical application development and not on research.
  • Wide range of tools for speech recognition tasks such as keyword spotting, alignment, and pronunciation evaluation.
  • Support for several languages including English, French, Mandarin, German, Dutch, Russian, and the ability to build models for other languages.

Website: cmusphinx.github.io
Support: FAQ, GitHub
Developer: Many contributors
License: BSD-like license

Sphinx4 is written in Java; Pocketsphinx is written in C. Learn Java with our recommended free books and free tutorials.


Related Software

Speech Recognition Tools
  • Whisper: Automatic speech recognition system trained on 680,000 hours of data
  • Flashlight: Fast, flexible machine learning library written entirely in C++
  • Coqui STT: Deep-learning toolkit for training and deploying speech-to-text models
  • Kaldi: C++ toolkit designed for speech recognition researchers
  • SpeechBrain: All-in-one conversational AI toolkit based on PyTorch
  • Handy: Offline speech-to-text application
  • ESPnet: End-to-end speech processing toolkit
  • deepspeech.pytorch: Implementation of DeepSpeech2 using Baidu Warp-CTC
  • Whispering: Transcription application with global speech-to-text functionality
  • Julius: Two-pass large vocabulary continuous speech recognition engine
  • CMUSphinx: Speech recognition system for mobile and server applications
  • Simon: Flexible speech recognition software
  • hyprwhspr: Native speech-to-text designed for Arch / Omarchy
  • ostt: Open Speech-to-Text
  • DeepSpeech: TensorFlow implementation of Baidu's DeepSpeech architecture
  • OpenSeq2Seq: TensorFlow-based toolkit for sequence-to-sequence models
  • Eesen: End-to-end speech recognition

Read our verdict in the software roundup.


1 Comment
Alan Devery
8 years ago

PocketSphinx looks very interesting, particularly its fixed-point arithmetic and efficient algorithms for GMM computation.