Compression: gzip vs bzip2 vs 7-zip
A trade-off between time and space
Today I had a look at the different options to compress files (in this case for backup purposes) on a Ubuntu system. The most common tools to compress files are gzip and bzip2. They have both been around for a long time, are available on most systems by default and are nicely integrated with other utilities like GNU tar (using its -z and -j options).
7-zip and the algorithm it uses (LZMA) is not that common on UNIX-like operating systems. It is well-known as a free alternative for WinZip on Windows systems and was started back in 1998. For Ubuntu p7zip – a port of 7-zip to POSIX – is available in universe (sudo apt-get install p7zip).
My test file was a MySQL dump with a size of 163 MB that contains mostly text. I was interested in the compressed file size and in the time it takes to compress and uncompress the file.
Here are the results:
| Compressor | Size | Ratio | Compression | Decompression |
|---|---|---|---|---|
| gzip | 89 MB | 54 % | 0m 13s | 0m 05s |
| bzip2 | 81 MB | 49 % | 1m 30s | 0m 20s |
| 7-zip | 61 MB | 37 % | 1m 48s | 0m 11s |
For the test I ran all tools with their default settings, i.e. without providing any special options.
Gzip is still a great tool and provides good compression without consuming a lot of computation power. Bzip2 is much slower and only provides slightly better compression. 7-zip consumes a bit more cycles than bzip2 but results in far smaller compressed files. Speed for decompression is even better for 7-zip than for bzip2.
So if time is important (think of on-the-fly compression) gzip is the tool of choice. If you don't care too much about processing speed and need very good compression have a look at 7-zip. The only advantage bzip2 has over 7-zip is that bzip2 is part of most default installations and is more common. Let's hope this will change in the future, especially integration with GNU tar would be great.
References