Machine Learning in Linux: Speech Note

Our Machine Learning in Linux series focuses on apps that make it easy to experiment with machine learning. All the apps covered in the series can be self-hosted.

Speech Note lets you take, read and translate notes in multiple languages. It combines the power of Speech to Text, Text to Speech and Machine Translation. All text and voice processing takes place offline, locally on your computer, with no network connection needed. Enhanced privacy is always a big advantage with self-hosted software.

Speech Note is a GUI frontend for various processing engines. For Speech to Text it uses Coqui STT, Vosk, and Whisper. Whisper is our highest rated speech recognition tool and features in our award-winning Top 100 CLI apps study. It’s that good. Coqui STT is also highly recommended although it’s no longer actively maintained.

For Text to Speech, Speech Note uses espeak-ng, MBROLA, Piper, RHVoice, and Coqui TTS. Machine translation is handled by Bergamot Translator.

This is free and open source software written in C++.


Speech Note is available as a Flatpak via Flathub.

To install the software, issue the command:

$ flatpak install flathub net.mkiol.SpeechNote


Once the installation is complete, we can run Speech Note from Activities in GNOME.
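If you'd rather do everything from a terminal, the steps can be wrapped up roughly as follows. This is only a minimal sketch: the `run` and `setup_speech_note` helpers and the `DRY_RUN` variable are our own illustrative names (they're not part of Flatpak), and we're assuming the Flathub remote may not be configured yet on your system. The application ID is the one used in the install command above.

```shell
#!/bin/sh
# Application ID, taken from the install command in this article.
APP_ID="net.mkiol.SpeechNote"

# Hypothetical helper: with DRY_RUN=1 it only prints the command,
# otherwise it executes it. DRY_RUN is our own variable, not a Flatpak option.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Hypothetical helper that chains the three steps together.
setup_speech_note() {
    # Add the Flathub remote, skipping the step if it already exists.
    run flatpak remote-add --if-not-exists flathub https://dl.flathub.org/repo/flathub.flatpakrepo
    # Install Speech Note from Flathub without interactive prompts.
    run flatpak install -y flathub "$APP_ID"
    # Launch the app from the terminal instead of from Activities.
    run flatpak run "$APP_ID"
}
```

Calling `setup_speech_note` with `DRY_RUN=1` set prints the three commands so you can check them first; calling it without `DRY_RUN` actually runs them.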

Speech Note has many build-time and run-time dependencies, so I wouldn’t recommend trying to build the source unless you’ve got time on your hands.

