Hi folks! Today I woke up and found
https://bugzilla.redhat.com/show_bug.cgi?id=2151495 , which diverted me
down a bit of an "installer environment size" rabbit hole.
As of today, with that new dep in webkitgtk, Rawhide's network install
images are 703M in size. Here's a potted history of network install
image sizes:
Fedora Core 8: 103.2M (boot.iso 9.2M + stage2.img 94M)
Fedora 13: 208M
Fedora 17: 162M (last "old UI")
Fedora 18: 294M (first "new UI")
Fedora 23: 415M
Fedora 28: 583M
Fedora 33: 686M
Fedora 37: 665M
Fedora Rawhide: 703M
The installer does not really do much more in Rawhide than it did in
FC8. Even after the UI rewrite in F18, we were only at 294M. Now the
image is about 2.4x as big as that and does...basically the same.
Why does this matter? Well, the images being large is moderately
annoying in itself just in terms of transfer times and so on. But more
importantly, AIUI at least, the entire installer environment is loaded
into RAM at startup - it kinda has to be, we don't have anywhere else
to put it. The bigger it is, the more RAM you need to install Fedora.
The size of the installer environment (for which the size of the
network install image is more or less a perfect proxy) is one of the
two key factors in this, the other being how much RAM DNF uses during
installation.
So, I did a bit of poking about into *what* is taking up all that
space. There's a variety of answers, but there are two major culprits:
1. firmware packages
2. yelp (which pulls in webkitgtk and its deps)
I've been using du and baobab (the GNOME visual disk usage analyzer,
which is great) to examine the filesystems, but I ran a couple of test
builds to confirm these suspects, especially after the impact of
compression (it's hard to check the *compressed* size of things in the
installer environment directly).
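For anyone who wants to repeat the du-style poking without baobab, here is a minimal sketch of the same idea in Python. It is not what I actually ran; the function name top_level_usage and the idea of pointing it at an unpacked installer root are assumptions for illustration. It sums apparent file sizes (not disk blocks) and, like du, says nothing about compressed size.

```python
import os
import sys
from collections import Counter


def top_level_usage(root):
    """Map each top-level entry under root to the total size of files below it.

    Files that sit directly in root are counted under ".".
    Sizes are apparent sizes in bytes, uncompressed.
    """
    usage = Counter()
    for dirpath, _dirnames, filenames in os.walk(root):
        rel = os.path.relpath(dirpath, root)
        top = "." if rel == "." else rel.split(os.sep)[0]
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                # lstat: count a symlink as the link itself, not its target
                usage[top] += os.lstat(path).st_size
            except OSError:
                # file vanished or is unreadable; skip it
                pass
    return usage


if __name__ == "__main__" and len(sys.argv) > 1:
    # Print the biggest top-level entries first, in MiB
    for entry, size in top_level_usage(sys.argv[1]).most_common(10):
        print(f"{size / 2**20:8.1f}M  {entry}")
```

Run it against a mounted or unpacked copy of the installer root filesystem (the path is whatever you unpacked it to) and drill down by re-running on the largest subdirectory.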
I did a scratch build of lorax which does not pull in firmware
packages, and had openQA build a netinst using that lorax. It came out
at 489M - 214M smaller than current netinsts, a size we last managed in
Fedora 26. I did a scratch build of anaconda with its requirement of
yelp dropped (which would break help pages), and built a netinst with
that; it came out at 662M - 41M smaller than current images. I haven't
run a combined test yet, but it ought to come out around 448M, around
the size of Fedora 24.
Even then we'd still be about 50% larger than the Fedora 18 image, for
not really any added functionality.
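The arithmetic behind those estimates, as a short sketch. The one real assumption is that the two savings are independent and simply add up, which the combined test build would need to confirm; combined_estimate is just an illustrative helper name.

```python
def combined_estimate(current, trial_sizes):
    """Estimate the image size if every trial change were applied at once.

    current     -- size of the unmodified image, in M
    trial_sizes -- sizes of test images that each drop one thing
    Assumes the savings do not overlap, so they can be summed.
    """
    savings = [current - size for size in trial_sizes]
    return current - sum(savings), savings


RAWHIDE = 703      # current netinst
NO_FIRMWARE = 489  # lorax scratch build without firmware packages
NO_YELP = 662      # anaconda scratch build without yelp
F18 = 294          # first "new UI" release

combined, savings = combined_estimate(RAWHIDE, [NO_FIRMWARE, NO_YELP])
print("savings:", savings)                     # [214, 41]
print("combined estimate:", combined)          # 448
print("vs F18: %.2fx" % (combined / F18))      # 1.52x
```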
I've moaned about the sheer amount and size of firmware blobs in other
forums before, but 214M compressed is *really* obnoxious. We must be
able to do something to clean this up (further than it's already
cleaned up - this is *after* we dropped low-hanging fruit like
enterprise switch 'firmwares' and garbage like that; most of the
remaining size seems to be huge amounts of probably-very-similar
firmware files for AMD graphics adapters and Intel wireless adapters).
I know some folks were trying to work on this (there was talk that we
could drop quite a lot of files that would only be loaded by older
kernels no longer in Fedora); any news on how far along that effort is?
Other obvious things that take up a lot of space:
1. /usr/lib/locale/locale-archive , from glibc-all-langpacks - this is
224M uncompressed. A quick test just compressing the file with xz on my
system shows it compresses to around 11M, though, so that's probably
all it adds up to after compression (the image is xz-compressed).
2. /usr/lib64/libLLVM-15.so, which is 114M on its own, compresses to
23M. We are, I think, basically stuck with this for mesa-dri-drivers ,
but does it have to be so *big*?
3. libicudata.so.71.1 - 30.4M, compresses to 7M. This is in the
webkitgtk dep chain but seems to still be pulled in without it, not
sure what else is requiring it.
4. /usr/share/locale - 112M in total (uncompressed, not sure how much
compressed) of translated strings from a ton of packages. No idea how
many of these are really *needed* in the installer environment. We can
maybe come up with a way to have lorax strip some, if we can come up
with a viable way to figure out which. Obviously-fairly-large ones are
from gnupg2 and libgweather4. I do recall we have some logic somewhere
to decide which languages have a certain level of translation in
anaconda; perhaps we could only include the strings for these
languages?
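The "compress it with xz and see" test from the list above can be scripted, which also gives a rough answer for whole directories like /usr/share/locale. This is a sketch using Python's standard lzma module, not the exact command I ran; xz_size is a made-up helper name, and preset 6 is just the xz default. Compressing things standalone is only an approximation of what they cost inside the compressed image, so treat the numbers as ballpark figures.

```python
import lzma
import os
import sys


def xz_size(path, preset=6, chunk=1 << 20):
    """Return (raw_bytes, xz_bytes) for a file, or for all files under a directory.

    Directory contents are fed through a single xz stream in sorted order.
    Symlinks and non-regular files are skipped.
    """
    comp = lzma.LZMACompressor(format=lzma.FORMAT_XZ, preset=preset)
    if os.path.isdir(path):
        files = sorted(
            os.path.join(dirpath, name)
            for dirpath, _dirs, names in os.walk(path)
            for name in names
        )
    else:
        files = [path]
    raw = packed = 0
    for name in files:
        if os.path.islink(name) or not os.path.isfile(name):
            continue
        with open(name, "rb") as fh:
            while True:
                block = fh.read(chunk)
                if not block:
                    break
                raw += len(block)
                packed += len(comp.compress(block))
    packed += len(comp.flush())  # stream footer and any buffered data
    return raw, packed


if __name__ == "__main__":
    for target in sys.argv[1:]:
        raw, packed = xz_size(target)
        print(f"{target}: {raw / 2**20:.1f}M -> {packed / 2**20:.1f}M")
```

For example, passing /usr/lib/locale/locale-archive and /usr/share/locale as arguments prints the uncompressed and xz-compressed size of each.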
IRC: adamw | Twitter: adamw_ha