A quite efficient compression algorithm: Blosc
I recently learned about a new compression algorithm, Blosc (https://www.blosc.org/trac). It is extremely fast, but by design it achieves a lower compression ratio.
I ran a performance comparison between Blosc and the Deflate compression algorithm built into .NET.
Here is my PC environment:
- CPU: Intel Xeon CPU E5-1620 3.6 GHz
- RAM: 16.0 GB
For Blosc, I use the following settings:
Number of threads: 1
- Set to a single thread for a fair comparison. Using multiple threads improves throughput considerably.
Compression level: 9
- It is the desired compression level and must be a number between 0 (no compression) and 9 (maximum compression).
Shuffle: No
Type size: 8
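For readers unfamiliar with the shuffle and type size settings: a byte shuffle treats the buffer as an array of fixed-size elements (the type size, 8 bytes here) and regroups the bytes so that the first byte of every element comes first, then the second byte of every element, and so on. Below is a minimal pure-Python sketch of that idea. It is not Blosc's actual implementation; the names `shuffle` and `unshuffle` are made up for illustration, and Python's zlib stands in for the compressor.

```python
import struct
import zlib


def shuffle(data: bytes, typesize: int) -> bytes:
    """Group byte 0 of every element, then byte 1 of every element, and so on."""
    if len(data) % typesize != 0:
        raise ValueError("buffer length must be a multiple of typesize")
    # data[j::typesize] picks byte j of each element.
    return b"".join(data[j::typesize] for j in range(typesize))


def unshuffle(data: bytes, typesize: int) -> bytes:
    """Inverse of shuffle(): restore the original element-by-element layout."""
    if len(data) % typesize != 0:
        raise ValueError("buffer length must be a multiple of typesize")
    count = len(data) // typesize
    out = bytearray(len(data))
    for j in range(typesize):
        # Byte plane j goes back to position j of every element.
        out[j::typesize] = data[j * count:(j + 1) * count]
    return bytes(out)


# 10,000 consecutive 64-bit integers: the high bytes are almost all zero.
raw = struct.pack("<10000q", *range(10000))
shuffled = shuffle(raw, 8)

plain_size = len(zlib.compress(raw, 9))
shuffled_size = len(zlib.compress(shuffled, 9))
print(f"deflate without shuffle: {plain_size} bytes")
print(f"deflate with shuffle:    {shuffled_size} bytes")
```

On numeric data like this, shuffling exposes long runs of identical bytes to the compressor. The benchmark in this post was run with shuffle disabled.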
| DataSet | Blosc Compression Throughput (MB/s) | Blosc Compression Ratio | Deflate Compression Throughput (MB/s) | Deflate Compression Ratio |
|---|---|---|---|---|
| advwks_cust.dat | 220.3 | 2.8 | 33.7 | 4.4 |
| advwks_fact.dat | 451.1 | 4.9 | 58.4 | 14.5 |
| experian_fact.dat | 285.0 | 3.0 | 26.7 | 6.2 |
| jitb_fact.dat | 290.7 | 2.2 | 16.9 | 3.9 |
| mssales_fact.dat | 501.9 | 6.0 | 41.7 | 10.4 |
| mssales_prod.dat | 530.0 | 5.8 | 50.5 | 10.1 |
| skype_fact.dat | 213.0 | 2.1 | 15.9 | 4.0 |
| synthetic_int.dat | 359.4 | 3.9 | 15.7 | 5.7 |
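The post does not spell out how the throughput and ratio figures above were computed. A common convention, assumed here, is ratio = uncompressed size / compressed size and throughput = uncompressed megabytes / seconds spent compressing. The sketch below measures both with Python's zlib (a Deflate implementation, not the .NET one benchmarked above). The `measure` helper and the synthetic input are hypothetical, so its numbers are not comparable with the table.

```python
import time
import zlib


def measure(data: bytes, level: int = 9, repeats: int = 3):
    """Return (throughput in MB/s, compression ratio) for zlib at the given level."""
    best = float("inf")
    compressed = b""
    for _ in range(repeats):
        start = time.perf_counter()
        compressed = zlib.compress(data, level)
        # Keep the fastest run to reduce noise from other processes.
        best = min(best, time.perf_counter() - start)
    throughput = len(data) / (1024 * 1024) / best  # input MB per second
    ratio = len(data) / len(compressed)            # original size / compressed size
    return throughput, ratio


# Synthetic stand-in for a data file; the post's .dat files are not available.
sample = b"".join(i.to_bytes(8, "little") for i in range(200_000))
mb_per_s, ratio = measure(sample)
print(f"throughput: {mb_per_s:.1f} MB/s, ratio: {ratio:.1f}")
```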
Comments
Anonymous
June 05, 2014
An interesting property of Blosc is that it supports built-in multithreaded compression. However, its exposed compression API is not thread-safe, which is quite different from other existing compression algorithms (they have no built-in multithreading, but their APIs are thread-safe). This may make Blosc harder to integrate with an existing framework that already compresses from multiple threads.
Anonymous
December 11, 2014
Adding multithreading to compression libraries is a trivial task. Blosc is no more than Shuffle+Compression. You can add shuffle+Multithreading to any compression library.
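As a rough illustration of the point in the comment above, multithreading can be layered on top of an ordinary compressor by splitting the input into independent chunks and compressing them in a thread pool. This is only a sketch of the general idea, not how Blosc is implemented. It uses Python's zlib, and `compress_parallel`, `decompress_parallel`, and `CHUNK_SIZE` are hypothetical names chosen for the example. A byte-shuffle pass could be applied to each chunk before compression; it is left out here to keep the sketch short.

```python
import zlib
from concurrent.futures import ThreadPoolExecutor
from typing import List

CHUNK_SIZE = 256 * 1024  # hypothetical chunk size, chosen arbitrarily


def compress_parallel(data: bytes, threads: int = 4, level: int = 9) -> List[bytes]:
    """Compress fixed-size chunks independently; returns one blob per chunk."""
    chunks = [data[i:i + CHUNK_SIZE] for i in range(0, len(data), CHUNK_SIZE)]
    with ThreadPoolExecutor(max_workers=threads) as pool:
        # map() preserves chunk order, so the blobs can be decoded in sequence.
        return list(pool.map(lambda c: zlib.compress(c, level), chunks))


def decompress_parallel(blobs: List[bytes], threads: int = 4) -> bytes:
    """Inverse of compress_parallel(): decompress each blob and concatenate."""
    with ThreadPoolExecutor(max_workers=threads) as pool:
        return b"".join(pool.map(zlib.decompress, blobs))


payload = bytes(range(256)) * 8192  # 2 MiB of repetitive sample data
blobs = compress_parallel(payload)
restored = decompress_parallel(blobs)
print(len(blobs), "chunks,", sum(map(len, blobs)), "compressed bytes")
```

Because each chunk is compressed on its own, matches cannot span chunk boundaries, so the overall ratio may be somewhat lower than compressing the whole buffer as one stream.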