
Machine Learning in Linux: Spleeter – source separation library

In Operation

The models available are:

  • Vocals (singing voice) / accompaniment separation (2 stems).
  • Vocals / drums / bass / other separation (4 stems).
  • Vocals / drums / bass / piano / other separation (5 stems).

Spleeter is a fairly complex engine, yet it’s easy to use: the actual separation needs just a single command.

Usage: spleeter [OPTIONS] COMMAND [ARGS]...

Options:
  --version  Return Spleeter version
  --help     Show this message and exit.

Commands:
  evaluate  Evaluate a model on the musDB test dataset
  separate  Separate audio file(s)
  train     Train a source separation model

Here are a few examples:

By default, spleeter creates 2 stems. Perfect for karaoke!

$ spleeter separate test-music-file.flac -o /output/path

This command creates a folder called test-music-file containing 2 stems: vocals.wav and accompaniment.wav.

Let’s say we want 4 stems (vocals, drums, bass and other). Issue the command

$ spleeter separate test-music-file.flac -p spleeter:4stems -o /output/path

Let’s say we want 5 stems (vocals, drums, bass, piano and other). Issue the command

$ spleeter separate test-music-file.flac -p spleeter:5stems -o /output/path

The first time a model is used, the software will automatically download it before performing the separation.

The software can write wav, mp3, ogg, m4a, wma, and flac output (choose the format with the -c flag). For the STFT it supports both a tensorflow and a librosa backend: librosa is faster than tensorflow on CPU and uses less memory, and it is used by default when GPU acceleration is not available.
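
For instance, to get the stems as MP3 files rather than the default wav, something along the following lines should work; -c sets the output codec, and the file name is the same placeholder used above.

$ spleeter separate test-music-file.flac -c mp3 -o /output/path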

The released models were trained on spectrograms up to 11kHz, but there are ways of performing the separation up to 16kHz or even 22kHz. For example:

$ spleeter separate test-music-file.flac -p spleeter:4stems-16kHz -o /output/path

When you use the CLI, each run of the spleeter command loads the model again, which adds a noticeable overhead. To avoid this, it’s best to process all your files in a single call to the CLI utility (the separate command accepts multiple audio files), or to drive Spleeter from Python.
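
If you’re scripting many separations, Spleeter’s documented Python API lets you load a model once and reuse it across files. Here’s a minimal sketch; the track names are placeholders.

from spleeter.separator import Separator

# Load the default 2-stems (vocals/accompaniment) model once.
separator = Separator('spleeter:2stems')

# Reuse the same separator for every file so the model isn't reloaded.
for track in ('first-song.flac', 'second-song.flac'):
    separator.separate_to_file(track, '/output/path')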

Summary

Spleeter is designed to help the research community in Music Information Retrieval (MIR) leverage the power of a state-of-the-art source separation algorithm.

Spleeter makes it easy to train source separation models using a dataset of isolated sources. The project also supplies pretrained, state-of-the-art models for performing various types of separation.
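
Training is driven from the same CLI. The sketch below follows the project’s documentation, where -p points at a training configuration file and -d at the dataset directory; the paths are placeholders and you need to obtain a dataset such as musDB yourself.

$ spleeter train -p configs/musdb_config.json -d /path/to/musdb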

Try as we might, we couldn’t coax Spleeter into using our GPU under Ubuntu 22.10 or 23.04. According to the project, you need a fully working CUDA installation. Other machine learning projects we’ve evaluated had no issues whatsoever with our CUDA installation, so it’s not clear what’s wrong. We even tried a fresh installation of Ubuntu 22.04 and used our best endeavours to ensure our CUDA installation was flawless, but again there was no GPU usage. However, this didn’t stop us testing the software, albeit more slowly as processing was bound to the CPU.

Website: research.deezer.com
Support: GitHub Code Repository
Developer: Deezer SA.
License: MIT License

Spleeter is written in Python. Learn Python with our recommended free books and free tutorials.

For other useful open source apps that use machine learning/deep learning, we’ve compiled this roundup.

Pages in this article:
Page 1 – Introduction and Installation
Page 2 – In Operation and Summary
