
Machine Learning in Linux: TTS – deep learning toolkit for Text-to-Speech

Our Machine Learning in Linux series focuses on apps that make it easy to experiment with machine learning.

Coqui TTS (TTS) is a library for advanced Text-to-Speech generation. It offers pretrained models in more than 1,100 languages, together with tools for training new models and improving existing ones. There are also utilities for dataset analysis.

Installation

We’re testing TTS using Ubuntu 23.10 on a test machine with an NVIDIA GeForce RTX 3060 Ti dedicated graphics card and CUDA 12.3. CUDA is a parallel computing platform and programming model developed by NVIDIA for general computing on graphics processing units (GPUs).
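If you want to check whether NVIDIA's command-line tools are visible on your own machine, a minimal sketch along these lines may help. The helper name find_cuda_tools is ours, purely for illustration. It only looks for the nvidia-smi and nvcc executables on the PATH, so a hit suggests the driver or toolkit is installed but does not prove that TTS will use the GPU.

```python
import shutil


def find_cuda_tools(names=("nvidia-smi", "nvcc")):
    """Map each tool name to its full path, or None if it is not on PATH."""
    # find_cuda_tools is a hypothetical helper, not part of TTS or CUDA.
    return {name: shutil.which(name) for name in names}


if __name__ == "__main__":
    for tool, path in find_cuda_tools().items():
        print(f"{tool}: {path or 'not found on PATH'}")
```

A missing nvcc is not necessarily a problem, as the CUDA compiler is only one part of the toolkit.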

As we’ve explained in previous articles in this series, we don’t recommend using pip to install software unless it’s within a virtual environment. A good solution is to use a conda environment, as it helps manage dependencies, isolates projects, and is language agnostic.

We’ll therefore use conda to install TTS. If your system is missing conda, install either Anaconda or Miniconda first. Once conda is installed, create the environment with the command:

$ conda create --name coqui-tts python=3.9

Activate that environment with the command:

$ conda activate coqui-tts

Install TTS with pip within this conda environment to avoid polluting our system.

$ pip install TTS
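To confirm that the package landed in the conda environment rather than in the system Python, a quick check like the following sketch can be run with the environment still active. The function name describe_install is our own. The sketch assumes the pip distribution and the importable module are both named TTS, and it relies on the CONDA_DEFAULT_ENV variable that conda sets on activation.

```python
import importlib.metadata
import importlib.util
import os


def describe_install(package="TTS"):
    """Report the active conda environment and whether `package` is installed."""
    # describe_install is a hypothetical helper, not part of TTS.
    info = {
        # conda sets CONDA_DEFAULT_ENV when an environment is activated.
        "conda_env": os.environ.get("CONDA_DEFAULT_ENV"),
        "importable": importlib.util.find_spec(package) is not None,
        "version": None,
    }
    try:
        info["version"] = importlib.metadata.version(package)
    except importlib.metadata.PackageNotFoundError:
        pass  # no installed distribution of that name
    return info


if __name__ == "__main__":
    print(describe_install())
```

If conda_env is not coqui-tts, or importable is False, the pip command was probably run outside the activated environment.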

There’s also a Docker image available.


3 Comments
Donna M
2 months ago

This looks interesting, but I wish developers would provide distro-specific packages.

Anon E Mouse
2 months ago

This project has been shut down.