Machine Learning in Linux: Speech Note

Our Machine Learning in Linux series focuses on apps that make it easy to experiment with machine learning. All the apps covered in the series can be self-hosted.

Speech Note lets you take, read and translate notes in multiple languages. It combines the power of Speech to Text, Text to Speech and Machine Translation. Text and voice processing takes place entirely offline, locally on your computer, without using a network connection. Enhanced privacy is always a big advantage with self-hosted software.

Speech Note is a GUI frontend for various processing engines. For Speech to Text it uses Coqui STT, Vosk, and Whisper. Whisper is our highest rated speech recognition tool and features in our award-winning Top 100 CLI apps study. It’s that good. Coqui STT is also highly recommended, although it’s no longer actively maintained.

For Text to Speech, Speech Note uses espeak-ng, MBROLA, Piper, RHVoice, and Coqui TTS. Machine translation is handled by Bergamot Translator.

This is free and open source software written in C++.

Installation

Speech Note is available as a Flatpak via Flathub.

To install the software, issue the command:

$ flatpak install flathub net.mkiol.SpeechNote

Once the installation is complete, we can run Speech Note from Activities in GNOME.
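
If the Flathub remote isn't configured on your system yet, or you'd prefer to launch Speech Note from a terminal rather than from Activities, the steps can be sketched as a short shell script. This is a minimal sketch, not an official installer: the `run` helper and the `RUN` variable are our own illustrative names, and the repository URL is the one Flathub publishes for its remote at the time of writing. By default the script only prints the commands so you can review them first; set `RUN=1` to actually execute them.

```shell
#!/bin/sh
# Sketch: add the Flathub remote, install Speech Note, and launch it.
# APP_ID comes from the install command above; FLATHUB_URL is assumed
# to be the current Flathub repository file.
APP_ID="net.mkiol.SpeechNote"
FLATHUB_URL="https://dl.flathub.org/repo/flathub.flatpakrepo"

# Hypothetical helper: echo the command unless RUN=1 is set,
# in which case the command is executed as given.
run() {
    if [ "${RUN:-0}" = "1" ]; then
        "$@"
    else
        echo "$@"
    fi
}

# Add the Flathub remote only if it is not already configured.
run flatpak remote-add --if-not-exists flathub "$FLATHUB_URL"

# Install Speech Note without interactive confirmation prompts.
run flatpak install -y flathub "$APP_ID"

# Start the application from the command line.
run flatpak run "$APP_ID"
```

Save it as, say, speechnote-install.sh and run it with `RUN=1 sh speechnote-install.sh` once you're happy with the printed commands.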

Speech Note has many build-time and run-time dependencies, so we wouldn’t recommend trying to build it from source unless you’ve got time on your hands.

