I was asked to look at compression on the log servers and to see
whether switching to xz would save us some space. My test is not
comprehensive, but it shows what we might expect.
Basic summary: xz may save us up to 2% over what we are currently
saving, but its real advantage over bzip2 is decompression speed, which
was about five times faster in this test. [Compression may also be
faster for some files.]

File         | Size (KiB) | gzip (KiB) | gzip % | bzip2 (KiB) | bzip2 % | xz (KiB) | xz %
messages.log |     644568 |      10992 |   98.3 |        4856 |    99.3 |     5940 | 99.1
mail.log     |     610816 |      65060 |   89.3 |       40836 |    93.3 |    35536 | 94.2
TOTAL        |    1255384 |      76052 |   93.9 |       45692 |    96.4 |    41476 | 96.7
(Sizes are from du -s, in KiB; percentages are space saved.)
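
For anyone who wants to check the percentages, here is a minimal Python sketch of the arithmetic, using only the du sizes from the raw data. The names SIZES_KIB, percent_saved and total_saved are mine, made up for illustration. The total is computed from the summed sizes, not by averaging the per-file percentages.

```python
# Sizes in 1 KiB blocks, copied from the du output in the raw data.
SIZES_KIB = {
    "messages.log": {"orig": 644568, "gzip": 10992, "bzip2": 4856, "xz": 5940},
    "mail.log": {"orig": 610816, "gzip": 65060, "bzip2": 40836, "xz": 35536},
}
TOOLS = ("gzip", "bzip2", "xz")


def percent_saved(original, compressed):
    """Space saved, as a percentage of the original size."""
    return 100.0 * (1.0 - compressed / original)


def total_saved(sizes, tool):
    """Percent saved across all files, weighted by file size."""
    original = sum(entry["orig"] for entry in sizes.values())
    compressed = sum(entry[tool] for entry in sizes.values())
    return percent_saved(original, compressed)


if __name__ == "__main__":
    for name, entry in SIZES_KIB.items():
        cells = [f"{tool} {percent_saved(entry['orig'], entry[tool]):.1f}%" for tool in TOOLS]
        print(name, "|", " | ".join(cells))
    totals = [f"{tool} {total_saved(SIZES_KIB, tool):.1f}%" for tool in TOOLS]
    print("TOTAL", "|", " | ".join(totals))
```

For example, percent_saved(644568, 10992) comes out at roughly 98.3, which matches what gzip itself reported for messages.log.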

Program | Compression time | Decompression time
gzip    | 00m43.416s       | 00m10.033s
bzip2   | 10m42.296s       | 01m02.525s
xz      | 10m15.937s       | 00m12.565s
(Wall-clock "real" times for both files together, compressed at -9.)
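
To compare the timings as ratios rather than by eye, the "real" times can be converted to seconds. This is a rough Python sketch; to_seconds, TIMES and decompress_speedup are hypothetical helper names, and the parser only handles the NNmSS.SSSs form shown in the table.

```python
import re


def to_seconds(stamp):
    """Convert a time-style duration such as '10m42.296s' to seconds."""
    match = re.fullmatch(r"(\d+)m(\d+(?:\.\d+)?)s", stamp)
    if match is None:
        raise ValueError(f"unrecognized duration: {stamp!r}")
    minutes, seconds = match.groups()
    return int(minutes) * 60 + float(seconds)


# (compression, decompression) "real" times from the table above.
TIMES = {
    "gzip": ("00m43.416s", "00m10.033s"),
    "bzip2": ("10m42.296s", "01m02.525s"),
    "xz": ("10m15.937s", "00m12.565s"),
}


def decompress_speedup(slow, fast, times=TIMES):
    """How many times faster `fast` decompressed than `slow`."""
    return to_seconds(times[slow][1]) / to_seconds(times[fast][1])


if __name__ == "__main__":
    ratio = decompress_speedup("bzip2", "xz")
    print(f"xz decompressed {ratio:.1f}x faster than bzip2")
```

This is a single run on one machine, so the ratio is an indication and not a benchmark result.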
Raw data below
[root@log01 smooge-b]# du -s messages.log mail.log
644568 messages.log
610816 mail.log
[root@log01 smooge-b]# time gzip -v -9 messages.log mail.log
messages.log: 98.3% -- replaced with messages.log.gz
mail.log: 89.3% -- replaced with mail.log.gz
real 0m43.416s
user 0m41.335s
sys 0m1.736s
[root@log01 smooge-b]# du -s messages.log.gz mail.log.gz
10992 messages.log.gz
65060 mail.log.gz
[root@log01 smooge-b]# time gunzip -v messages.log.gz mail.log.gz
messages.log.gz: 98.3% -- replaced with messages.log
mail.log.gz: 89.3% -- replaced with mail.log
real 0m10.033s
user 0m6.948s
sys 0m3.004s
[root@log01 smooge-b]# time bzip2 -v -9 messages.log mail.log
messages.log: 133.043:1, 0.060 bits/byte, 99.25% saved, 659381328 in, 4956148 out.
mail.log: 14.961:1, 0.535 bits/byte, 93.32% saved, 624854215 in, 41766136 out.
real 10m42.296s
user 10m36.948s
sys 0m1.608s
[root@log01 smooge-b]# du -sc messages.log.bz2 mail.log.bz2
4856 messages.log.bz2
40836 mail.log.bz2
45692 total
[root@log01 smooge-b]# time bunzip2 -v messages.log.bz2 mail.log.bz2
messages.log.bz2: done
mail.log.bz2: done
real 1m2.525s
user 0m44.779s
sys 0m4.956s
[root@log01 smooge-b]# time xz -v -9 messages.log mail.log
messages.log (1/2)
100.0 % 5,923.6 KiB / 628.8 MiB = 0.009 3.1 MiB/s 3:21
mail.log (2/2)
100.0 % 34.7 MiB / 595.9 MiB = 0.058 1.4 MiB/s 6:53
real 10m15.937s
user 10m8.550s
sys 0m3.552s
[root@log01 smooge-b]# du -s messages.log.xz mail.log.xz
5940 messages.log.xz
35536 mail.log.xz
[root@log01 smooge-b]# time unxz -v messages.log.xz mail.log.xz
messages.log.xz (1/2)
100.0 % 5,923.6 KiB / 628.8 MiB = 0.009 140 MiB/s 0:04
mail.log.xz (2/2)
100.0 % 34.7 MiB / 595.9 MiB = 0.058 74 MiB/s 0:08
real 0m12.565s
user 0m8.709s
sys 0m3.636s
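
If someone wants to repeat a comparison like this without touching real logs, here is a rough Python sketch. Big assumption: it uses Python's standard gzip, bz2 and lzma modules in place of the command-line tools I ran above, so the timings and sizes will not match the numbers in this mail. The benchmark and sample_log functions are hypothetical, and sample_log only generates synthetic syslog-looking lines.

```python
import bz2
import gzip
import lzma
import time


def make_codecs(level=9):
    """Map codec name to (compress, decompress) at the given level."""
    return {
        "gzip": (lambda d: gzip.compress(d, compresslevel=level), gzip.decompress),
        "bzip2": (lambda d: bz2.compress(d, compresslevel=level), bz2.decompress),
        "xz": (lambda d: lzma.compress(d, preset=level), lzma.decompress),
    }


def benchmark(data, level=9):
    """Compress and decompress `data` with each codec, timing both steps."""
    results = {}
    for name, (compress, decompress) in make_codecs(level).items():
        start = time.perf_counter()
        packed = compress(data)
        middle = time.perf_counter()
        restored = decompress(packed)
        end = time.perf_counter()
        if restored != data:
            raise RuntimeError(f"{name} round trip failed")
        results[name] = {
            "bytes": len(packed),
            "saved_pct": 100.0 * (1.0 - len(packed) / len(data)),
            "compress_s": middle - start,
            "decompress_s": end - middle,
        }
    return results


def sample_log(lines=20000):
    """Synthetic syslog-like data, made up purely for trying the sketch."""
    return "".join(
        f"Jan  1 00:{i % 60:02d}:00 log01 sshd[{1000 + i}]: session opened for user u{i % 7}\n"
        for i in range(lines)
    ).encode()


if __name__ == "__main__":
    for name, r in benchmark(sample_log()).items():
        print(
            f"{name:5s} {r['bytes']:8d} bytes  {r['saved_pct']:5.1f}% saved  "
            f"compress {r['compress_s']:.3f}s  decompress {r['decompress_s']:.3f}s"
        )
```

Note that level 9 for xz needs a lot of memory on the compressing side, so a lower level may be a better starting point on a small machine.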
--
Stephen J Smoogen.
“The core skill of innovators is error recovery, not failure avoidance.”
Randy Nelson, President of Pixar University.
"We have a strategic plan. It's called doing things."
— Herb Kelleher, founder Southwest Airlines