Handy is a cross-platform desktop speech-to-text application that lets you dictate directly into any text field using configurable keyboard shortcuts.
It’s designed for privacy-focused local transcription, runs entirely on your own computer rather than sending audio to the cloud, and supports a range of speech recognition models so you can balance speed, language coverage, and accuracy to suit your system.
This is free and open source software.
Key Features
- Performs speech transcription entirely offline, keeping audio processing on your own machine.
- Lets you start and stop recording with configurable keyboard shortcuts, with both push-to-talk and toggle modes available.
- Supports multiple local recognition models, including Whisper, Parakeet, Moonshine, Canary, SenseVoice, and GigaAM, along with support for custom Whisper-compatible models.
- Stores transcription history with timestamps, audio playback, copy and delete actions, starring, and automatic cleanup options for recordings.
- Includes advanced output controls such as auto-submit after insertion, clipboard handling, trailing spaces, and custom word correction for commonly misheard terms.
- Offers an experimental post-processing feature that can refine grammar, reformat text, or translate output using local or external AI providers.
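To give a feel for what custom word correction involves, here is a minimal sketch in Python. It is an illustration only, not Handy's actual implementation: the function name `apply_corrections` and the sample correction table are assumptions made for this example. It replaces whole-word, case-insensitive matches of commonly misheard terms with the preferred spelling.

```python
import re


def apply_corrections(text: str, corrections: dict[str, str]) -> str:
    """Replace misheard terms in a transcript (hypothetical helper, not Handy's API)."""
    if not corrections:
        return text
    # Try longer phrases first so multi-word entries win over shorter ones.
    keys = sorted(corrections, key=len, reverse=True)
    pattern = re.compile(
        r"\b(?:" + "|".join(re.escape(k) for k in keys) + r")\b",
        re.IGNORECASE,
    )
    # Look up replacements by lowercased match so casing in the audio does not matter.
    lookup = {k.lower(): v for k, v in corrections.items()}
    return pattern.sub(lambda m: lookup[m.group(0).lower()], text)


# Sample table of misheard terms (made up for this example).
corrections = {"pie torch": "PyTorch", "cube control": "kubectl"}
print(apply_corrections("Install pie torch, then run Cube Control.", corrections))
```

The word-boundary anchors keep short entries from rewriting the middle of longer words, so an entry for "pie" would leave "piecemeal" untouched. Entries that begin or end with punctuation would need different handling.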
Website: github.com/cjpais/handy
Support:
Developer: CJ Pais
License: MIT License
Handy is written in Rust and TypeScript. Learn Rust and TypeScript with our recommended free books and free tutorials.
Related Software
| Speech Recognition Tools | |
|---|---|
| Whisper | Automatic speech recognition (ASR) system trained on 680,000 hours of data |
| Flashlight | Fast, flexible machine learning library written entirely in C++. |
| Coqui STT | Deep-learning toolkit for training and deploying speech-to-text models |
| Kaldi | C++ toolkit designed for speech recognition researchers. |
| SpeechBrain | All-in-one conversational AI toolkit based on PyTorch |
| ESPnet | End-to-End speech processing toolkit |
| deepspeech.pytorch | Implementation of DeepSpeech2 using Baidu Warp-CTC. |
| DeepSpeech | TensorFlow implementation of Baidu's DeepSpeech architecture. |
| Julius | Two-pass large vocabulary continuous speech recognition engine |
| OpenSeq2Seq | TensorFlow-based toolkit for sequence-to-sequence models |
| CMUSphinx | Speech recognition system for mobile and server applications |
| Eesen | End-to-End Speech Recognition |
| Simon | Flexible speech recognition software |
Read our verdict in the software roundup.

