Data compression is the process of storing data in a format that uses less space than the original representation. It is particularly useful in communications, because it enables devices to transmit or store the same data in fewer bits. Besides reducing transmission bandwidth, compression increases the amount of information that can be stored on a hard disk drive or other storage device.
There are two main types of compression. Lossy compression is a data encoding method which reduces a file's size by discarding certain information. When the file is uncompressed, not all of the original information is recovered. Lossy compression is typically used to compress video, audio and images, as well as internet telephony. The loss of information is often unnoticeable to most users. Lossy compression techniques are used in all DVDs, Blu-ray discs, and most multimedia available on the internet.
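To make "discarding certain information" concrete, here's a toy sketch of one lossy step: quantization, where each value is snapped to a coarse grid. It is not how any particular audio or video codec works; the function names, the sample values, and the step size of 16 are all made up for illustration.

```python
def quantize(samples, step):
    """Lossy step: keep only which multiple of `step` each sample is nearest to."""
    return [round(s / step) for s in samples]


def dequantize(codes, step):
    """Best-effort reconstruction: the detail inside each step is gone for good."""
    return [c * step for c in codes]


# Hypothetical sample values, chosen only for the demonstration.
samples = [3, 14, 15, 92, 65, 35, 89, 79]
codes = quantize(samples, 16)      # small integers, cheaper to store
restored = dequantize(codes, 16)   # close to the input, but not identical

print(codes)
print(restored)
```

The restored values stay within half a step of the originals, which is why the loss can go unnoticed, but the exact input can never be rebuilt from the codes.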
However, lossy compression is unsuitable where the original and the decompressed data must be identical. In this situation, the user will need to use lossless compression. This type of compression is employed in compressing software applications, files, and text articles. Lossless compression is also popular in archiving music. This article focuses on lossless compression tools.
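The "must be identical" property is easy to check for yourself. The sketch below uses Python's standard library modules for the gzip, bzip2, and xz formats (not the command-line tools themselves) to compress a sample, decompress it, and compare the result with the input. The `round_trip` helper and the sample text are our own illustrative choices.

```python
import bz2
import gzip
import lzma


def round_trip(data: bytes) -> dict:
    """Compress and decompress `data` with three standard library codecs.

    Returns {codec name: (compressed size, True if restored == data)}.
    """
    results = {}
    for name, codec in (("gzip", gzip), ("bzip2", bz2), ("xz", lzma)):
        packed = codec.compress(data)
        restored = codec.decompress(packed)
        results[name] = (len(packed), restored == data)
    return results


# Highly repetitive sample text, so every codec shrinks it considerably.
sample = b"multi-core compression tools\n" * 1000
for name, (size, identical) in round_trip(sample).items():
    print(f"{name}: {len(sample)} -> {size} bytes, identical={identical}")
```

Each codec should report a much smaller size for this repetitive input while restoring every byte of it.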
Popular lossless compression tools include gzip, bzip2, and xz. When compressing and decompressing files, these tools use a single core. But these days, most people run machines with multi-core processors, and the traditional tools won’t show you the speed advantage those processors offer. Step forward modern compression tools that use all the cores present on your system when compressing files, offering massive speed advantages.
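How can a compressor keep several cores busy? One common idea, and roughly the approach we understand pbzip2 to take, is to split the input into blocks, compress each block independently, and write the results back to back. The sketch below illustrates that idea with Python's `bz2` module and a thread pool. It is not the code of any tool covered here; `compress_parallel`, the 900,000-byte default chunk size, and the use of threads are our own assumptions (threads only help if the `bz2` module releases the interpreter lock while compressing, which we believe CPython does).

```python
import bz2
import os
from concurrent.futures import ThreadPoolExecutor


def compress_parallel(data: bytes, chunk_size: int = 900_000, workers=None) -> bytes:
    """Compress `data` as a series of independent bzip2 streams.

    Each chunk becomes a complete bzip2 stream, so the chunks can be
    compressed at the same time and simply concatenated afterwards.
    """
    # Slice the input into fixed-size chunks (one empty chunk for empty input).
    chunks = [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)] or [b""]
    with ThreadPoolExecutor(max_workers=workers or os.cpu_count()) as pool:
        # pool.map returns results in input order, so the output order is stable.
        streams = list(pool.map(bz2.compress, chunks))
    return b"".join(streams)


payload = bytes(range(256)) * 4000               # about 1 MB of sample data
packed = compress_parallel(payload, chunk_size=100_000)
assert bz2.decompress(packed) == payload         # multi-stream input decodes as one file
```

Because every chunk is a self-contained stream, the same boundaries could in principle be used to decompress chunks in parallel. The trade-off is a slightly larger output than compressing everything as one stream.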
Some of the tools covered in this article don’t provide significant acceleration when decompressing files. The ones that do use multiple cores to speed up decompression significantly are pbzip2, lbzip2, plzip, and lrzip.
Let’s check out the multi-core compression tools. See our time and size charts on the following pages. At the end of each page, there’s a table linking to a dedicated page for each of the multi-core tools, setting out its features in detail.
Pages in this article:
Page 1 – Introduction
Page 2 – Charts with Default Compression
Page 3 – Charts with Fastest Compression
Page 4 – Charts with Best Compression
Page 5 – lrzip with Different Compression Methods
Learn more about the features offered by the multi-core compression tools. We’ve compiled a dedicated page for each tool explaining, in detail, the features it offers.
Multi-Core Compression Tools
|Tool|Description|
|---|---|
|pigz|Parallel implementation of gzip. It's a fully functional replacement for gzip|
|pbzip2|Parallel implementation of the bzip2 block-sorting file compressor|
|pxz|Runs LZMA compression on multiple cores and processors|
|lbzip2|Parallel bzip2 compression utility, suited for serial and parallel processing|
|plzip|Massively parallel (multi-threaded) lossless data compressor based on lzlib|
|lrzip|Compression utility that excels at compressing large files|
|pixz|Parallel indexing XZ compression, fully compatible with XZ. LZMA and LZMA2|
Thank you so much! I’m going to try out some of these.
Some comparison between them would be very useful.
I did a similar study a few years ago and ended up using pbzip2 as my go-to compression utility.
The main reason is that it can do multi-core de-compression as well, unlike pigz.
The compression algorithm is fairly slow, so it works best when you have 30+ cores to throw at it.
Keep in mind that to use pbzip2 to de-compress with multiple cores, you need to compress with pbzip2 first. It adds some hints to the file to let the decompression know how to split up the work properly.