On Sat, Sep 11, 2021 at 7:14 AM Zbigniew Jędrzejewski-Szmek
<zbyszek(a)in.waw.pl> wrote:
On Tue, May 04, 2021 at 12:46:33PM +0100, Richard W.M. Jones wrote:
> On Mon, May 03, 2021 at 08:51:15PM -0000, Reon Beon via devel wrote:
> > LINUX KERNEL --
> > Adding to the variety of places where the Linux kernel supports making use of
> > Zstd compression, kernel modules moving forward can now enjoy size reductions with Zstd.
>
> Is this email a proposed Fedora change? Might be best to follow the
> process:
>
> https://docs.fedoraproject.org/en-US/program_management/changes_policy/
>
> This change, while probably welcome, isn't entirely confined to the
> kernel package. Various other packages consume/create kernel modules
> and so will be affected. (In my case, supermin will require small
> changes to cope.) So it should be submitted as a system-wide change
> IMHO.
I decided to take a stab at this. Draft PR:
https://gitlab.com/cki-project/kernel-ark/-/merge_requests/1379
I'll paste the commit message, since it has a bunch of measurements:
rpmspec: switch to zstd compression for modules
The main advantage is that zstd decompression is a lot faster and requires less
memory than lzma. This means that we need less CPU when loading various modules
and boot is a bit snappier.
All tests done with zstd-1.5.0-1.fc33.x86_64.
kmod >= 28 supports zstd, and 29 is present in all Fedora releases, so no
dependency needs to be specified.
The compression ratio is a bit worse, so we pay some price in terms of disk space:
$ du -c **/*.ko|tail -n1
290148 total
$ du -c **/*.ko.xz|tail -n1
64324 total
$ du -c **/*.ko.zst|tail -n1
82560 total # default compression
$ du -c **/*.ko.zst|tail -n1
71152 total # with -19
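To put the four totals above in proportion, here is a minimal sketch that expresses each compressed size as a share of the uncompressed tree and as overhead relative to xz. The numbers are copied from the du output; the `report` helper and the variable names are made up for illustration and are not part of the commit.

```shell
#!/bin/sh
# Totals as printed by `du -c ... | tail -n1` above (du's block units).
raw=290148
xz=64324
zst_default=82560
zst_19=71152

# report NAME SIZE: share of the uncompressed size, and overhead versus xz.
# (Hypothetical helper, used only for this illustration.)
report() {
    awk -v name="$1" -v size="$2" -v raw="$raw" -v xz="$xz" 'BEGIN {
        printf "%-10s %5.1f%% of uncompressed, %+5.1f%% vs xz\n",
               name, 100 * size / raw, 100 * (size - xz) / xz
    }'
}

report "xz"       "$xz"
report "zstd"     "$zst_default"
report "zstd -19" "$zst_19"
```

With these figures, default zstd comes out roughly 28% larger than xz, and zstd -19 roughly 11% larger.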
Decompression of a bunch of drivers (serial, 133 files, from cache, no disk I/O):
$ time xzcat drivers/net/wireless/**/*.xz >/dev/null
xzcat drivers/net/wireless/**/*.xz > /dev/null 8.46s user 0.07s system 99% cpu 8.562 total
$ time zstdcat drivers/net/wireless/**/*.zst >/dev/null
zstdcat drivers/net/wireless/**/*.zst > /dev/null 1.33s user 0.05s system 99% cpu 1.396 total
(Max RSS is about 2× higher with xz).
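For a sense of scale per module, the two wall-clock totals above can be divided by the 133 files. This is only a rough sketch: it assumes the cost is spread evenly across files and it covers decompression alone, not the rest of module loading. The helper names are hypothetical.

```shell
#!/bin/sh
files=133          # number of modules in the test above
xz_wall=8.562      # seconds, "total" from the xzcat run
zstd_wall=1.396    # seconds, "total" from the zstdcat run

# Average decompression time per file, in milliseconds.
per_file_ms() {
    awk -v t="$1" -v n="$files" 'BEGIN { printf "%.1f\n", 1000 * t / n }'
}

# Ratio of two timings.
speedup() {
    awk -v a="$1" -v b="$2" 'BEGIN { printf "%.1f\n", a / b }'
}

echo "xz:   $(per_file_ms "$xz_wall") ms per module"
echo "zstd: $(per_file_ms "$zstd_wall") ms per module"
echo "zstd is about $(speedup "$xz_wall" "$zstd_wall")x faster"
```

That works out to roughly 64 ms versus 10 ms per module on this machine, about a 6x difference.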
So I'd say that we can expect some speedup when booting a machine, but not
something that would blow people away.
Benchmarking of compression with a subset of files (drivers/net/wireless only):
$ for i in {1..20}; do rm **/*.ko.zst; time parallel zstd -q -$i ::: drivers/net/wireless/**/*.ko; du -c **/*.ko.zst|tail -n1; done
parallel zstd -q -$i ::: drivers/net/wireless/**/*.ko 3.80s user 1.21s system 710% cpu 0.705 total
147392 total
parallel zstd -q -$i ::: drivers/net/wireless/**/*.ko 4.04s user 1.28s system 724% cpu 0.735 total
133768 total
parallel zstd -q -$i ::: drivers/net/wireless/**/*.ko 6.30s user 1.45s system 838% cpu 0.925 total
129656 total
parallel zstd -q -$i ::: drivers/net/wireless/**/*.ko 9.67s user 1.76s system 914% cpu 1.249 total
128636 total
parallel zstd -q -$i ::: drivers/net/wireless/**/*.ko 13.58s user 2.10s system 957% cpu 1.639 total
126944 total
parallel zstd -q -$i ::: drivers/net/wireless/**/*.ko 13.44s user 2.00s system 929% cpu 1.661 total
126056 total
parallel zstd -q -$i ::: drivers/net/wireless/**/*.ko 15.59s user 1.92s system 920% cpu 1.901 total
118472 total
parallel zstd -q -$i ::: drivers/net/wireless/**/*.ko 17.79s user 1.84s system 955% cpu 2.054 total
115300 total
parallel zstd -q -$i ::: drivers/net/wireless/**/*.ko 22.84s user 2.24s system 960% cpu 2.611 total
114384 total
parallel zstd -q -$i ::: drivers/net/wireless/**/*.ko 25.50s user 2.99s system 988% cpu 2.881 total
114060 total
parallel zstd -q -$i ::: drivers/net/wireless/**/*.ko 27.75s user 3.86s system 993% cpu 3.182 total
114016 total
parallel zstd -q -$i ::: drivers/net/wireless/**/*.ko 35.74s user 3.93s system 961% cpu 4.127 total
113828 total
parallel zstd -q -$i ::: drivers/net/wireless/**/*.ko 62.49s user 3.05s system 1009% cpu 6.489 total
114272 total
parallel zstd -q -$i ::: drivers/net/wireless/**/*.ko 64.67s user 4.07s system 1009% cpu 6.809 total
114160 total
parallel zstd -q -$i ::: drivers/net/wireless/**/*.ko 81.46s user 4.88s system 1004% cpu 8.594 total
114284 total
parallel zstd -q -$i ::: drivers/net/wireless/**/*.ko 141.48s user 3.71s system 1014% cpu 14.316 total
111920 total
parallel zstd -q -$i ::: drivers/net/wireless/**/*.ko 155.15s user 4.33s system 1017% cpu 15.681 total
110744 total
parallel zstd -q -$i ::: drivers/net/wireless/**/*.ko 197.08s user 4.52s system 1014% cpu 19.880 total
101576 total
parallel zstd -q -$i ::: drivers/net/wireless/**/*.ko 223.10s user 5.79s system 1018% cpu 22.475 total
101616 total
parallel zstd -q -$i ::: drivers/net/wireless/**/*.ko 223.40s user 5.84s system 1018% cpu 22.498 total
101616 total
(-20 is the same as -19, unless --ultra is used, which it wasn't here.)
$ for i in {0..9}; do rm **/*.ko.xz; time parallel xz -q -k -$i ::: drivers/net/wireless/**/*.ko; du -c **/*.ko.xz|tail -n1; done
parallel xz -q -k -$i ::: drivers/net/wireless/**/*.ko 34.50s user 1.05s system 992% cpu 3.583 total
105124 total
parallel xz -q -k -$i ::: drivers/net/wireless/**/*.ko 46.20s user 1.58s system 1013% cpu 4.715 total
96704 total
parallel xz -q -k -$i ::: drivers/net/wireless/**/*.ko 58.17s user 2.05s system 1027% cpu 5.859 total
95236 total
parallel xz -q -k -$i ::: drivers/net/wireless/**/*.ko 71.78s user 2.88s system 1029% cpu 7.254 total
94872 total
parallel xz -q -k -$i ::: drivers/net/wireless/**/*.ko 205.66s user 3.77s system 1031% cpu 20.302 total
90012 total
parallel xz -q -k -$i ::: drivers/net/wireless/**/*.ko 248.38s user 5.69s system 1010% cpu 25.142 total
86724 total
parallel xz -q -k -$i ::: drivers/net/wireless/**/*.ko 262.77s user 5.44s system 1022% cpu 26.243 total
86416 total
parallel xz -q -k -$i ::: drivers/net/wireless/**/*.ko 265.86s user 7.62s system 1019% cpu 26.835 total
86400 total
parallel xz -q -k -$i ::: drivers/net/wireless/**/*.ko 268.05s user 10.74s system 1023% cpu 27.229 total
86400 total
parallel xz -q -k -$i ::: drivers/net/wireless/**/*.ko 267.98s user 10.28s system 1020% cpu 27.266 total
86400 total
Based on this, I selected -18 as the compression level.
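The commit message does not spell out the selection criterion. One possible reading, sketched below, is "smallest output, with ties going to the lower level". The rows are copied from the zstd loop above, assuming the Nth result corresponds to level N; the `best_level` helper is hypothetical.

```shell
#!/bin/sh
# level, wall-clock seconds, size: one row per iteration of the zstd loop above.
# Assumption: results appear in loop order, so row N is level N.
zstd_results='1 0.705 147392
2 0.735 133768
3 0.925 129656
4 1.249 128636
5 1.639 126944
6 1.661 126056
7 1.901 118472
8 2.054 115300
9 2.611 114384
10 2.881 114060
11 3.182 114016
12 4.127 113828
13 6.489 114272
14 6.809 114160
15 8.594 114284
16 14.316 111920
17 15.681 110744
18 19.880 101576
19 22.475 101616
20 22.498 101616'

# Print the level with the smallest output; on a tie the first (lower) level wins.
best_level() {
    printf '%s\n' "$1" | awk '
        NR == 1 || $3 + 0 < min { min = $3 + 0; lvl = $1; secs = $2 }
        END { printf "-%d: %d in %.3fs\n", lvl, min, secs }'
}

best_level "$zstd_results"
```

On this data the sketch picks -18, which is both slightly smaller and slightly faster than -19, in line with the level chosen above.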
I think it's still worth pursuing: we should see a speedup on
slow-CPU machines and/or machines with hardware that requires a bunch
of modules.
Koji scratch build:
https://koji.fedoraproject.org/koji/taskinfo?taskID=75506561
I plan to submit an F36 change for this, as requested. If you'd like
to be a co-owner, please let me know. In particular, testing and benchmarking
would be great contributions. So far I have only tested this by booting
a VM, but the change seems rather straightforward, so if it works there,
it should work everywhere.
(In the PR, there's also a second commit to use zstd for kernel rpm
compression, or in other words, to stop overriding the default.)
As discussed previously, there is more to this than just changing the
kernel. There are bugs open for kdump, and there were a couple of
other pieces which need to be addressed. The lack of zstd modules so
far is not about being unwilling to do the work to make it happen; it
is simply that we are not quite ready for it. It is not falling off
the radar.
Justin