We can activate our newly created virtual environment with the command:
$ source .venv/bin/activate
Let’s now launch the Mimic 3 server with the command:
(.venv) $ mimic3-server
You’ll see output like this:
INFO:mimic3_http.__main__:Starting web server
[2022-11-14 08:43:48 +0000] [INFO] Running on http://0.0.0.0:59125 (CTRL + C to quit)
INFO:hypercorn.error:Running on http://0.0.0.0:59125 (CTRL + C to quit)
The “Running on” lines show where to point our web browser: port 59125 on the machine running the server (for example, http://localhost:59125 when browsing from that same machine).
The image above shows that the 22-second clip took only 1.728 seconds to generate. You can accelerate the processing if you have a GPU that supports CUDA.
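To put those timing figures in perspective, here is a small worked calculation of the real-time factor (processing time per second of audio). It uses only the two numbers quoted above; the clip length is presumably rounded, so treat the result as approximate.

```python
# Figures quoted in the text above.
clip_seconds = 22.0        # length of the generated audio
synthesis_seconds = 1.728  # time taken to generate it

# Real-time factor: seconds of processing per second of audio (lower is faster).
real_time_factor = synthesis_seconds / clip_seconds

# The inverse tells us how many times faster than playback the synthesis ran.
speedup = clip_seconds / synthesis_seconds

print(f"real-time factor: {real_time_factor:.3f}")
print(f"speed-up over real time: {speedup:.1f}x")
```

In other words, the clip was generated roughly 12.7 times faster than it takes to play back.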
We can listen to the output or download it as a WAV file. Over 25 languages are available, including English (US and UK), German, Spanish, Italian, Dutch, and Chinese.
Here are a couple of WAV examples (US and UK).
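WAV files like these can also be fetched without the browser. The following is a minimal Python sketch, assuming the server started earlier is still running and that it accepts text posted to an /api/tts endpoint with an optional voice query parameter, as we understand the Mimic 3 documentation to describe. The helper names and the voice key en_UK/apope_low are illustrative choices of ours, so check them against your own installation.

```python
import urllib.parse
import urllib.request

# Address printed by mimic3-server at startup; localhost assumes the same machine.
BASE_URL = "http://localhost:59125"


def build_tts_request(text, voice=None, base_url=BASE_URL):
    """Build a POST request for the server's /api/tts endpoint (assumed path)."""
    url = base_url + "/api/tts"
    if voice:
        # The voice key is passed as a query parameter (assumed name: voice).
        url += "?" + urllib.parse.urlencode({"voice": voice})
    return urllib.request.Request(url, data=text.encode("utf-8"), method="POST")


def save_speech(text, path, voice=None, base_url=BASE_URL):
    """Send text to a running server and write the returned audio bytes to path."""
    request = build_tts_request(text, voice, base_url)
    with urllib.request.urlopen(request) as response:
        audio = response.read()
    with open(path, "wb") as wav_file:
        wav_file.write(audio)
    return len(audio)  # number of bytes written
```

With the server running, a call such as save_speech("Hello from Mimic 3.", "hello.wav", voice="en_UK/apope_low") should leave a playable WAV file in the current directory.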
The software supports Speech Synthesis Markup Language (SSML), an XML-based markup language for assisting the generation of synthetic speech in Web and other applications. This lets you insert pauses and change the volume, speaking rate, and voice.
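As a rough illustration of what such markup looks like, the sketch below builds a small SSML document with Python's standard library. The element names (speak, s, break, prosody) come from the W3C SSML specification; the build_ssml helper is our own, and exactly which elements and attribute values Mimic 3 honors is an assumption to verify against its documentation.

```python
import xml.etree.ElementTree as ET


def build_ssml(first, second, pause="1s", rate="slow", volume="soft"):
    """Return an SSML string: one sentence, a pause, then a restyled sentence."""
    speak = ET.Element("speak")

    # First sentence, spoken with the default settings.
    sentence = ET.SubElement(speak, "s")
    sentence.text = first

    # Insert a pause between the two sentences.
    ET.SubElement(speak, "break", {"time": pause})

    # Second sentence, with a different speaking rate and volume.
    prosody = ET.SubElement(speak, "prosody", {"rate": rate, "volume": volume})
    styled = ET.SubElement(prosody, "s")
    styled.text = second

    # ElementTree escapes characters such as & and < in the text for us.
    return ET.tostring(speak, encoding="unicode")


print(build_ssml("Welcome to Mimic 3.", "This part is slower and quieter."))
```

Building the document through an XML library rather than by string concatenation keeps it well-formed even when the spoken text contains characters that need escaping.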
What else does the software offer?
- Custom word pronunciations.
- Over 100 pre-trained voices.
- Multi-speaker models.
If you want a text-to-speech engine that works entirely offline on inexpensive hardware (such as the Raspberry Pi), Mimic 3 gets our recommendation.
We’ve only shown the software running as a web server. It’s also possible to use Mimic 3 from the command line or in a screen reader.