Machine Learning in Linux: Whisper – automatic speech recognition system

Last Updated on December 19, 2023

Whisper is an automatic speech recognition (ASR) system trained on 680,000 hours of multilingual and multitask supervised data collected from the web. It is a deep learning system built on PyTorch.

The software offers transcription in multiple languages, as well as translation from those languages into English.

This is free and open source software.

Installation

This section has been updated.

We originally tested Whisper on Ubuntu 22.04 LTS (after running into issues with Ubuntu 22.10), and more recently on Ubuntu 23.10.

To avoid polluting your system, we recommend installing Whisper with Anaconda, or with Miniconda if you only want conda.

Download the Anaconda installer using wget.

$ wget https://repo.anaconda.com/archive/Anaconda3-2023.09-0-Linux-x86_64.sh
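
Before running the installer, it can be worth checking that the download completed. The sketch below assumes sha256sum from GNU coreutils is available; the function name print_installer_checksum is our own, for illustration only. It prints the SHA-256 hash of the file so it can be compared against the hash Anaconda lists for this installer, if one is available.

```shell
# Print the SHA-256 checksum of a downloaded installer.
# Returns non-zero if the file is missing or empty.
print_installer_checksum() {
    installer="$1"
    if [ ! -s "$installer" ]; then
        echo "Installer not found or empty: $installer" >&2
        return 1
    fi
    sha256sum "$installer"
}

# Usage (file name taken from the wget command above):
#   print_installer_checksum Anaconda3-2023.09-0-Linux-x86_64.sh
```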

Run the shell script:

$ bash Anaconda3-2023.09-0-Linux-x86_64.sh

You’ll be asked to accept Anaconda’s license and whether to initialize Anaconda3 by running conda init. For changes to take effect, close and re-open your current shell.

Create a conda environment, and activate it.

$ conda create --name whisper
$ conda activate whisper
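
To confirm the environment is active before installing anything, a quick check along these lines can help. It relies on the CONDA_DEFAULT_ENV variable, which conda sets when an environment is activated; the helper name active_conda_env is hypothetical.

```shell
# Print the name of the active conda environment, or "none" if no
# environment is active. conda exports CONDA_DEFAULT_ENV on activation.
active_conda_env() {
    echo "${CONDA_DEFAULT_ENV:-none}"
}

# After "conda activate whisper" this is expected to print "whisper".
active_conda_env
```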

Now we’re ready to install Whisper using pipx.

$ pipx install openai-whisper
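
Once the install finishes, a short check like the following can confirm that the tools are reachable from the shell. It is only a sketch: require_cmd is a helper name of our own, and the check for ffmpeg is based on the Whisper README listing it as a requirement for reading audio files.

```shell
# Report whether a command is available on PATH.
require_cmd() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "found: $1"
    else
        echo "missing: $1" >&2
        return 1
    fi
}

# whisper is the CLI installed by pipx; ffmpeg is assumed to be needed
# for decoding audio. "|| true" keeps the loop going if one is missing.
for cmd in whisper ffmpeg; do
    require_cmd "$cmd" || true
done
```

If whisper is reported missing, pipx's bin directory may not be on your PATH; running pipx ensurepath and re-opening the shell is the usual remedy. If both are found, whisper --help lists the available options.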

11 Comments
Tom J.
11 months ago

Whisper is amazing! I haven’t tried the C++ API yet, but hopefully there’s finally hope for Linux speech recognition!

Gaf the Horse
7 months ago

This is a really useful tutorial on installing and setting up Whisper. So many times tutorials have errors that lead to frustration, but this one guided me through without a hitch. Many thanks.

Dick D.
7 months ago

No matter the input file size, I got:
untyped_storage = torch.UntypedStorage(
torch.cuda.OutOfMemoryError: CUDA out of memory…
Very frustrating.

Mali
5 months ago

It didn’t work. How do I uninstall all those massive packages?

triple5
4 months ago
Reply to Mali

Possibly it doesn’t work with Ubuntu 23.04? That is what I see in my case, and the article does mention at the top that they ran into issues using Ubuntu 22.10.
It worked on my laptop with Ubuntu 22.04.
On my laptop I used Anaconda3 https://repo.anaconda.com/archive/Anaconda3-2023.09-0-Linux-x86_64.sh instead of the older version above.

If you want to uninstall, just remove the directory:
$ rm -rf "$HOME/anaconda3"

Aaron
5 months ago

Whisper is awesome.

Dave
2 months ago

I got a long message saying it didn’t work, which suggested using pipx. After installing pipx I tried the command, but it didn’t like the -U flag. It currently says it is installing without that flag. It seems to go on forever, with no change in the size of the Anaconda folder. Xubuntu 23.10.

Wilbur Ince
2 months ago

This worked great! I was able to get this running on Ubuntu 22.04, but had to remove pipx, and reinstall. I did this:
rm -rf ~/.local/pipx
Then just reinstall pipx, and it worked.

Is there a way to clean up the text output of Whisper?

Wilbur Ince
2 months ago

I figured it out! I am using this command to output to a txt file.
whisper intro.wav --model medium --language English --output_format txt