Machine Learning in Linux: Whisper – automatic speech recognition system

March 6, 2023 Steve Emms CLI, Reviews, Scientific, Software

Last Updated on December 19, 2023

Whisper is an automatic speech recognition (ASR) system trained on 680,000 hours of multilingual and multitask supervised data collected from the web. Powered by deep learning and neural networks, Whisper is a natural language processing system that’s built on PyTorch.

The software offers transcription in multiple languages, as well as translation from those languages into English.
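
As a rough sketch of what those two modes look like on the command line, the helper below wraps the whisper command, which is only available once the installation steps in this article are complete. The --task and --language options belong to Whisper's command-line interface; the function name whisper_task and the file name japanese.wav are placeholders of ours.

```shell
# Sketch only: whisper_task is our own helper name, not part of Whisper.
# $1: audio file, $2: spoken language, $3: "transcribe" (default) or "translate"
whisper_task() {
    whisper "$1" --language "$2" --task "${3:-transcribe}"
}

# Examples (japanese.wav is a placeholder file name):
#   whisper_task japanese.wav Japanese              # text in the spoken language
#   whisper_task japanese.wav Japanese translate    # text translated into English
```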

This is free and open source software.

Installation

I’ve updated this section for Ubuntu 23.10.

We originally tested Whisper with Ubuntu 22.04 LTS (we ran into issues using Ubuntu 22.10), and more recently with Ubuntu 23.10.

To avoid polluting your system, we recommend installing Whisper with Anaconda or Miniconda (if you only want conda).

Download and install Anaconda using wget.

$ wget https://repo.anaconda.com/archive/Anaconda3-2023.09-0-Linux-x86_64.sh

Run the shell script:

$ bash Anaconda3-2023.09-0-Linux-x86_64.sh

You’ll be asked to accept Anaconda’s license and whether to initialize Anaconda3 by running conda init. For changes to take effect, close and re-open your current shell.

Create a conda environment, and activate it.

$ conda create --name whisper
$ conda activate whisper

Now we’re ready to install Whisper using pipx. If pipx isn’t already on your system, install it first (on Ubuntu, the package is named pipx).

$ pipx install openai-whisper
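
If the install appears to finish but the whisper command isn't recognised, a quick check along these lines may help narrow things down. It's a minimal sketch: check_tools is a name we've made up, and ffmpeg is included because Whisper's README lists it as a requirement, even though this article doesn't cover installing it (on Ubuntu it's available as the ffmpeg package).

```shell
# Report whether each tool used in this section can be found on PATH.
# check_tools is our own helper name, not part of conda, pipx or Whisper.
check_tools() {
    for tool in conda pipx ffmpeg whisper; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "$tool: $(command -v "$tool")"
        else
            echo "$tool: not found"
        fi
    done
}

check_tools
```

Any line ending in "not found" points at the step to revisit.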


11 Comments


Tom J.

2 years ago

Whisper is Amazing! I haven’t tried the API for C++ yet but hopefully there’s finally hope for Linux speech recognition!

Gaf the Horse

1 year ago

This is a really useful tutorial on installing and setting up Whisper, so many times tutorials have errors leading to frustration but this one guided me through without a hitch….Many Thanks

Dick D.

1 year ago

No matter the input file size, I got:
untyped_storage = torch.UntypedStorage(
torch.cuda.OutOfMemoryError: CUDA out of memory….
Very frustrating.

Author

Steve Emms

1 year ago

Reply to Dick D.

What graphics card are you using? Try using the tiny model to start with.
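
As a hedged sketch of that suggestion: the --model option picks the model size, and --device chooses between the GPU (cuda) and the CPU. Both are options of the whisper command; the helper name transcribe_tiny and the file name intro.wav are placeholders. Running on the CPU sidesteps GPU memory limits altogether, at the cost of speed.

```shell
# Sketch only: transcribe_tiny is our own helper name, not part of Whisper.
# $1: audio file, $2: device, "cpu" (default here) or "cuda"
transcribe_tiny() {
    whisper "$1" --model tiny --device "${2:-cpu}"
}

# Examples (intro.wav is a placeholder file name):
#   transcribe_tiny intro.wav         # smallest model, CPU only
#   transcribe_tiny intro.wav cuda    # smallest model, on the GPU
```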

Mali

1 year ago

It didn’t work. How do I uninstall all those massive packages?

triple5

1 year ago

Reply to Mali

Possibly it doesn’t work with Ubuntu 23.04? That’s what I found in my case, and the article mentions at the top that there were issues with Ubuntu 22.10.
It worked on my laptop with Ubuntu 22.04,
where I used Anaconda3 https://repo.anaconda.com/archive/Anaconda3-2023.09-0-Linux-x86_64.sh instead of the older version above.

If you want to uninstall, just remove the directory (rm needs -r to delete a directory):
$ rm -rf $HOME/anaconda3
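
A slightly fuller clean-up might look like the sketch below. It assumes Whisper was installed with pipx as in the article (pipx normally keeps its packages in its own directory rather than under anaconda3), that the conda environment is named whisper, and that Anaconda sits in the default $HOME/anaconda3 location. The function name cleanup_whisper is made up for illustration. By default it only prints the commands so you can review them first.

```shell
# Sketch only: cleanup_whisper is our own helper name.
# With no argument it prints the commands; pass "run" to execute them.
cleanup_whisper() {
    if [ "${1:-}" = "run" ]; then
        run=""
    else
        run="echo"
    fi
    $run pipx uninstall openai-whisper       # remove the pipx-installed package
    $run conda env remove --name whisper     # remove the conda environment
    $run rm -rf "$HOME/anaconda3"            # assumes the default install path
}

cleanup_whisper    # dry run: prints the three commands
```

If you let the installer run conda init, the block it added to your shell start-up file (such as ~/.bashrc) is left behind and can be deleted by hand.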

Aaron

1 year ago

Whisper is awesome.

Dave

1 year ago

I got a long message saying it didn’t work, which suggested using pipx. After installing pipx I tried the command, but it didn’t like the -U flag. Without that flag it currently says it is installing. It seems to go on forever, with no change in the size of the Anaconda folder. Xubuntu 23.10.

Author

Steve Emms

1 year ago

Reply to Dave

I’ve updated the installation section for Ubuntu 23.10. The old method of installing via pip -U was deprecated after I published my review.

Wilbur Ince

1 year ago

This worked great! I was able to get this running on Ubuntu 22.04, but had to remove pipx, and reinstall. I did this:
rm -rf ~/.local/pipx
Then just reinstall pipx, and it worked.

Is there a way to clean up the text output of Whisper?

Wilbur Ince

1 year ago

I figured it out! I am using this command to output to a txt file.
whisper intro.wav --model medium --language English --output_format txt
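
Building on that command, here is a minimal sketch that also sends the transcript to a separate directory using Whisper's --output_dir option. As far as we're aware, leaving out --output_format makes Whisper write every format it supports (txt, vtt, srt, tsv and json), which is why the output can look cluttered. The helper name transcript_txt and the directory name transcripts are placeholders.

```shell
# Sketch only: transcript_txt is our own helper name, not part of Whisper.
# $1: audio file, $2: directory for the transcript (defaults to the current one)
transcript_txt() {
    whisper "$1" --model medium --language English \
        --output_format txt --output_dir "${2:-.}"
}

# Example ("transcripts" is a placeholder directory name):
#   transcript_txt intro.wav transcripts    # plain-text transcript goes into transcripts/
```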
