
Compression: gzip vs bzip2 vs 7-zip

A trade-off between time and space

Today I had a look at the different options for compressing files (in this case for backup purposes) on an Ubuntu system. The most common compression tools are gzip and bzip2. Both have been around for a long time, are available on most systems by default, and are nicely integrated with other utilities such as GNU tar (through its -z and -j options).
7-zip and the algorithm it uses (LZMA) are less common on UNIX-like operating systems. 7-zip is well known as a free alternative to WinZip on Windows systems and was started back in 1998. For Ubuntu, p7zip, a port of 7-zip to POSIX, is available in universe (sudo apt-get install p7zip).
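
To illustrate the tar integration mentioned above, here is a minimal sketch that uses Python's tarfile module as a stand-in for GNU tar. Its "w:gz" and "w:bz2" modes play the role of tar's -z and -j options. The function name make_archives, the archive names and the sample dump.sql file are made up for this example.

```python
import os
import tarfile
import tempfile


def make_archives(source_path, out_dir):
    """Pack source_path into a .tar.gz and a .tar.bz2 archive in out_dir."""
    # "w:gz" corresponds to tar's -z option, "w:bz2" to its -j option.
    modes = {"backup.tar.gz": "w:gz", "backup.tar.bz2": "w:bz2"}
    created = []
    for name, mode in modes.items():
        archive_path = os.path.join(out_dir, name)
        with tarfile.open(archive_path, mode) as archive:
            # Store the file under its base name, without the directory part.
            archive.add(source_path, arcname=os.path.basename(source_path))
        created.append(archive_path)
    return created


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        # Hypothetical stand-in for a real database dump.
        sample = os.path.join(tmp, "dump.sql")
        with open(sample, "w") as f:
            f.write("INSERT INTO t VALUES (1, 'example');\n" * 1000)
        for path in make_archives(sample, tmp):
            print(path, os.path.getsize(path), "bytes")
```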

My test file was a 163 MB MySQL dump that contains mostly text. I was interested in the compressed file size and in the time it takes to compress and decompress the file.

Here are the results:

Compressor   Compressed size   Ratio   Compression time   Decompression time
gzip         89 MB             54 %    0m 13s             0m 05s
bzip2        81 MB             49 %    1m 30s             0m 20s
7-zip        61 MB             37 %    1m 48s             0m 11s

For the test I ran all tools with their default settings, i.e. without providing any special options.
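
A comparison like this can be approximated without the command-line tools. The sketch below is not the procedure I used for the numbers above: it relies on Python's gzip, bz2 and lzma modules, whose default settings are not necessarily the same as those of the gzip, bzip2 and 7-zip programs, and the lzma module writes the .xz format rather than a .7z archive. The sample data is a made-up stand-in for the MySQL dump, so expect different absolute numbers.

```python
import bz2
import gzip
import lzma
import time


def benchmark(data):
    """Return {name: (compressed_size, compress_seconds, decompress_seconds)}."""
    codecs = {
        "gzip": (gzip.compress, gzip.decompress),
        "bzip2": (bz2.compress, bz2.decompress),
        "lzma": (lzma.compress, lzma.decompress),
    }
    results = {}
    for name, (compress, decompress) in codecs.items():
        # Time compression with the module's default settings.
        start = time.perf_counter()
        packed = compress(data)
        compress_time = time.perf_counter() - start

        # Time decompression of the result.
        start = time.perf_counter()
        unpacked = decompress(packed)
        decompress_time = time.perf_counter() - start

        assert unpacked == data  # the round trip must be lossless
        results[name] = (len(packed), compress_time, decompress_time)
    return results


if __name__ == "__main__":
    # Hypothetical stand-in for the MySQL dump: repetitive SQL-like text.
    sample = b"".join(
        b"INSERT INTO orders VALUES (%d, 'customer %d');\n" % (i, i % 97)
        for i in range(200_000)
    )
    for name, (size, c_time, d_time) in benchmark(sample).items():
        ratio = 100 * size / len(sample)
        print(f"{name:6} {size:>10} bytes {ratio:5.1f} % {c_time:6.2f}s {d_time:6.2f}s")
```

To try it on a real file, read the file in binary mode and pass its contents to benchmark(). Keep in mind that this holds the whole file in memory.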

Gzip is still a great tool and provides good compression without consuming a lot of computing power. Bzip2 is much slower (about seven times slower at compressing in this test) and provides only slightly better compression. 7-zip uses somewhat more CPU time than bzip2 but produces far smaller files, and it decompresses almost twice as fast as bzip2.
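
To make the trade-off easier to compare, the figures in the results table can be turned into ratios and throughput. The short sketch below only does arithmetic on the numbers from the table; the names ORIGINAL_MB, RESULTS and summarize are chosen for this example. Throughput is counted in megabytes of uncompressed data per second.

```python
ORIGINAL_MB = 163

# (compressed size in MB, compression seconds, decompression seconds),
# copied from the results table.
RESULTS = {
    "gzip": (89, 13, 5),
    "bzip2": (81, 90, 20),
    "7-zip": (61, 108, 11),
}


def summarize(original_mb, results):
    """Return {name: (ratio_percent, compress_mb_per_s, decompress_mb_per_s)}."""
    summary = {}
    for name, (size_mb, c_secs, d_secs) in results.items():
        ratio = 100 * size_mb / original_mb
        summary[name] = (ratio, original_mb / c_secs, original_mb / d_secs)
    return summary


if __name__ == "__main__":
    for name, (ratio, c_speed, d_speed) in summarize(ORIGINAL_MB, RESULTS).items():
        print(f"{name:6} {ratio:5.1f} %  {c_speed:5.1f} MB/s in  {d_speed:5.1f} MB/s out")
```

In these terms gzip compresses at roughly 12.5 MB/s, bzip2 at about 1.8 MB/s and 7-zip at about 1.5 MB/s. The ratio column in the table matches these values with the fractional part dropped.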

So if time is important (think of on-the-fly compression), gzip is the tool of choice. If you don't care too much about processing speed and need very good compression, have a look at 7-zip. The only advantage bzip2 has over 7-zip is that it is part of most default installations and is more widely used. Let's hope this will change in the future; integration with GNU tar in particular would be great.
