eSpeak NG – text-to-speech software

In Operation

eSpeak NG supports more than 100 languages and accents. Let’s hear the default voice:

Default voice
$ espeak-ng "Welcome to LinuxLinks.com, the home of great software. We hope you enjoy our reviews" -w espeak-NG-output.wav

The sound generated by eSpeak NG’s default voice comes across as very robotic and is uncomfortable to listen to for any length of time. We’re using the -w flag to write the speech output to a WAV file. If we omit the flag, the sound is played directly.

Let’s read some longer text:

$ espeak-ng -f linux-intro -w espeak-NG-linux-intro.wav

Compare that to the sample generated by Tortoise.

The sample created with Tortoise uses different voices.

eSpeak NG offers different voices too. The voices are stored at /usr/lib/x86_64-linux-gnu/espeak-ng-data/voices/. However, the ones in the !v subdirectory just generate a segmentation fault on our Ubuntu test machine. They didn’t work under Manjaro either.
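
As a rough sketch of how voices are usually selected: the --voices option lists the installed voices, and -v picks one by its language code. The files in the !v subdirectory are voice variants, which eSpeak NG's documentation describes applying to a base voice with a + suffix rather than using on their own. The voice en-gb, the variant f3, and the output filename below are assumptions for illustration; they may not be present on every install, and we make no claim that this avoids the crash described above.

```shell
#!/bin/sh
# Sketch only: voice, variant, and filename are illustrative assumptions.
TEXT="Welcome to LinuxLinks.com, the home of great software."
VOICE="en-gb"   # assumption: a language code reported by --voices
VARIANT="f3"    # assumption: a variant file name from the !v subdirectory
VOICE_SPEC="${VOICE}+${VARIANT}"   # base voice plus variant

if command -v espeak-ng >/dev/null 2>&1; then
    # List the installed voices for English
    espeak-ng --voices=en
    # Speak with the base voice modified by the variant, writing a WAV file
    espeak-ng -v "$VOICE_SPEC" "$TEXT" -w espeak-NG-variant.wav
else
    echo "espeak-ng not found; would run: espeak-ng -v $VOICE_SPEC ..."
fi
```

Dropping the + suffix (for example, -v en-gb) selects the base voice without a variant.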

What else does eSpeak NG offer?

  • Alter the amplitude, word gap, pitch, and speed.
  • Supports SSML (Speech Synthesis Markup Language), though not completely, as well as HTML.
  • Compact size. The program and its data, including many languages, total only a few megabytes.
  • Can be used as a front-end to MBROLA diphone voices. eSpeak NG converts text to phonemes with pitch and length information.
  • Can translate text into phoneme codes, so it could be adapted as a front end for another speech synthesis engine.
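
The options in the list above map onto command-line flags. Here is a minimal sketch, assuming the standard espeak-ng flags -a (amplitude), -g (word gap), -p (pitch), -s (speed), -x (print phoneme mnemonics), -q (no audio), and -m (interpret markup). The specific values and output filenames are our own illustrative choices, not recommendations.

```shell
#!/bin/sh
# Sketch only: values and filenames are illustrative assumptions.
TEXT="Welcome to LinuxLinks.com, the home of great software."
AMPLITUDE=120   # -a: 0 to 200, default 100
WORD_GAP=5      # -g: pause between words, in units of 10 ms at default speed
PITCH=40        # -p: 0 to 99, default 50
SPEED=150       # -s: words per minute, default 175
OPTS="-a $AMPLITUDE -g $WORD_GAP -p $PITCH -s $SPEED"

if command -v espeak-ng >/dev/null 2>&1; then
    # $OPTS is deliberately unquoted so it splits into separate arguments
    espeak-ng $OPTS "$TEXT" -w espeak-NG-tuned.wav
    # Print phoneme mnemonics instead of speaking (-q suppresses audio)
    espeak-ng -q -x "$TEXT"
    # Interpret SSML markup in the input text
    espeak-ng -m '<speak>Hello <break time="500ms"/> world</speak>' -w espeak-NG-ssml.wav
else
    echo "espeak-ng not found; would run: espeak-ng $OPTS ..."
fi
```

The phoneme output from -x is the kind of intermediate form that could be fed to another synthesis engine.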

Summary

eSpeak NG uses a “formant synthesis” method. This allows many languages to be provided in a small size. The speech is clear, and can be used at high speeds, but is not as natural or smooth as larger synthesizers that are based on human speech recordings. It also supports Klatt formant synthesis, and can use MBROLA as a backend speech synthesizer.

It’s difficult to recommend eSpeak NG when taking into account the deep learning alternatives such as Tortoise and Coqui TTS.

Website: github.com/espeak-ng/espeak-ng
Support:
Developer: Reece H. Dunn, Valdis Vitolins, Juho Hiltunen, and other contributors
License: GNU General Public License v3.0

eSpeak NG is written in C. Learn C with our recommended free books and free tutorials.

