Hi folks! Today I woke up and found https://bugzilla.redhat.com/show_bug.cgi?id=2151495 , which diverted me down a bit of an "installer environment size" rabbit hole.
As of today, with that new dep in webkitgtk, Rawhide's network install images are 703M in size. Here's a potted history of network install image sizes:
Fedora Core 8: 103.2M (boot.iso 9.2M + stage2.img 94M)
Fedora 13: 208M
Fedora 17: 162M (last "old UI")
Fedora 18: 294M (first "new UI")
Fedora 23: 415M
Fedora 28: 583M
Fedora 33: 686M
Fedora 37: 665M
Fedora Rawhide: 703M
The installer does not really do much more in Rawhide than it did in FC8. Even after the UI rewrite in F18, we were only at 294M. Now the image is well over 2x as big and does...basically the same.
Why does this matter? Well, the images being large is moderately annoying in itself just in terms of transfer times and so on. But more importantly, AIUI at least, the entire installer environment is loaded into RAM at startup - it kinda has to be, we don't have anywhere else to put it. The bigger it is, the more RAM you need to install Fedora. The size of the installer environment (for which the size of the network install image is more or less a perfect proxy) is one of the two key factors in this, the other being how much RAM DNF uses during package install.
So, I did a bit of poking about into *what* is taking up all that space. There's a variety of answers, but there's two major culprits:
1. firmware
2. yelp (which pulls in webkitgtk and its deps)
I've been using du and baobab (the GNOME visual disk usage analyzer, which is great) to examine the filesystems, but I ran a couple of test builds to confirm these suspects, especially after the impact of compression (it's hard to check the *compressed* size of things in the installer environment directly).
I did a scratch build of lorax which does not pull in firmware packages, and had openQA build a netinst using that lorax. It came out at 489M - 214M smaller than current netinsts, a size we last managed in Fedora 26. I did a scratch build of anaconda with its requirement of yelp dropped (which would break help pages), and built a netinst with that; it came out at 662M - 41M smaller than current images. I haven't run a combined test yet, but it ought to come out around 448M, around the size of Fedora 24.
Even then we'd still be about 50% larger than the Fedora 18 image, for not really any added functionality.
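The combined estimate above is plain subtraction, and can be reproduced with a short sketch. It assumes the two savings are independent and simply add up, which compression may not honour exactly; the variable and function names are made up for illustration.

```python
# Sizes in M, taken from the figures quoted in this message.
RAWHIDE_M = 703      # current Rawhide netinst
NO_FIRMWARE_M = 489  # test build without firmware packages
NO_YELP_M = 662      # test build without the yelp requirement
F18_M = 294          # Fedora 18 netinst


def estimate_combined(current, *variants):
    """Subtract each variant's individual saving from the current size.

    Assumes the savings are independent (additive), which is only an
    approximation for a compressed image.
    """
    return current - sum(current - v for v in variants)


combined = estimate_combined(RAWHIDE_M, NO_FIRMWARE_M, NO_YELP_M)
print(f"estimated combined size: {combined}M")
print(f"ratio to Fedora 18: {combined / F18_M:.2f}x")
```

Only a real combined build would confirm the number.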
I've moaned about the sheer amount and size of firmware blobs in other forums before, but 214M compressed is *really* obnoxious. We must be able to do something to clean this up (further than it's already cleaned up - this is *after* we dropped low-hanging fruit like enterprise switch 'firmwares' and garbage like that; most of the remaining size seems to be huge amounts of probably-very-similar firmware files for AMD graphics adapters and Intel wireless adapters). I know some folks were trying to work on this (there was talk that we could drop quite a lot of files that would only be loaded by older kernels no longer in Fedora); any news on how far along that effort is?
Other obvious things that take up a lot of space:
1. /usr/lib/locale/locale-archive , from glibc-all-langpacks - this is 224M uncompressed. A quick test just compressing the file with xz on my system shows it compresses to around 11M, though, so that's probably all it adds up to after compression (the image is an xz-compressed squashfs)
2. /usr/lib64/libLLVM-15.so, which is 114M on its own, compresses to 23M. We are, I think, basically stuck with this for mesa-dri-drivers , but does it have to be so *big*?
3. libicudata.so.71.1 - 30.4M, compresses to 7M. This is in the webkitgtk dep chain but seems to still be pulled in without it, not sure what else is requiring it.
4. /usr/share/locale - 112M in total (uncompressed, not sure how much compressed) of translated strings from a ton of packages. No idea how many of these are really *needed* in the installer environment. We can maybe come up with a way to have lorax strip some, if we can come up with a viable way to figure out which. Obviously-fairly-large ones are from gnupg2 and libgweather4. I do recall we have some logic somewhere to decide which languages have a certain level of translation in anaconda; perhaps we could only include the strings for these languages?
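To get a rough idea of which languages and which packages account for the /usr/share/locale total, something like the following sketch could be run against an unpacked installer root. It sums apparent file sizes (not disk blocks, so the totals can differ a little from du), and it treats the .mo file name as a stand-in for the owning package, which is an assumption; `locale_usage` is a hypothetical helper, not part of lorax.

```python
import os
import sys
from collections import Counter


def locale_usage(root="/usr/share/locale"):
    """Return (bytes per language directory, bytes per .mo domain name)."""
    by_lang, by_domain = Counter(), Counter()
    if not os.path.isdir(root):
        return by_lang, by_domain
    for dirpath, _dirs, files in os.walk(root):
        rel = os.path.relpath(dirpath, root)
        if rel == ".":
            continue  # files directly under root belong to no language
        lang = rel.split(os.sep)[0]
        for name in files:
            path = os.path.join(dirpath, name)
            if os.path.islink(path):
                continue  # do not count symlinks
            size = os.lstat(path).st_size
            by_lang[lang] += size
            if name.endswith(".mo"):
                # Assumption: the .mo file name roughly identifies the package.
                by_domain[name[:-3]] += size
    return by_lang, by_domain


if __name__ == "__main__":
    root = sys.argv[1] if len(sys.argv) > 1 else "/usr/share/locale"
    langs, domains = locale_usage(root)
    for label, counter in (("language", langs), ("domain", domains)):
        print(f"largest by {label}:")
        for key, size in counter.most_common(10):
            print(f"  {size / 2**20:8.1f}M  {key}")
```

The per-language table would be the input for any "only ship languages anaconda is well translated into" rule; the per-domain table shows which packages dominate.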
* Adam Williamson:
- /usr/lib/locale/locale-archive , from glibc-all-langpacks - this is 224M uncompressed. A quick test just compressing the file with xz on my system shows it compresses to around 11M, though, so that's probably all it adds up to after compression (the image is an xz-compressed squashfs)
Isn't the compression block-based? I think it would be interesting to measure the image size with the file removed.
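The difference between the two measurements can be approximated without building an image. The sketch below compresses a file once as a whole and once in independent fixed-size blocks, using Python's lzma module with a raw LZMA2 stream. The 128 KiB block size is the mksquashfs default; whether the installer image uses that value is an assumption, and the real squashfs xz settings (dictionary size, filters) may differ, so this is only indicative.

```python
import lzma
import sys

# Raw LZMA2 stream, so container overhead does not distort small blocks.
FILTERS = [{"id": lzma.FILTER_LZMA2, "preset": 6}]


def whole_size(data: bytes) -> int:
    """Compressed size when the whole input is one stream (like `xz file`)."""
    return len(lzma.compress(data, format=lzma.FORMAT_RAW, filters=FILTERS))


def blockwise_size(data: bytes, block_size: int = 128 * 1024) -> int:
    """Compressed size when each block is compressed independently."""
    total = 0
    for offset in range(0, len(data), block_size):
        block = data[offset:offset + block_size]
        total += len(lzma.compress(block, format=lzma.FORMAT_RAW, filters=FILTERS))
    return total


if __name__ == "__main__" and len(sys.argv) == 2:
    with open(sys.argv[1], "rb") as f:
        payload = f.read()
    print(f"whole file: {whole_size(payload) / 2**20:.1f}M")
    print(f"128K blocks: {blockwise_size(payload) / 2**20:.1f}M")
```

If the block-wise figure for locale-archive comes out much larger than the whole-file figure, the 11M estimate is too optimistic; removing the file from a test image remains the definitive check.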
For the non-live installer, we can *significantly* cut down its size, without degrading localization of the installer itself.
- /usr/lib64/libLLVM-15.so, which is 114M on its own, compresses to 23M. We are, I think, basically stuck with this for mesa-dri-drivers , but does it have to be so *big*?
It has all the targets in it. As it's for JIT, we'd only need one target.
Thanks, Florian
On Thu, 2022-12-08 at 07:57 +0100, Florian Weimer wrote:
- Adam Williamson:
- /usr/lib/locale/locale-archive , from glibc-all-langpacks - this is 224M uncompressed. A quick test just compressing the file with xz on my system shows it compresses to around 11M, though, so that's probably all it adds up to after compression (the image is an xz-compressed squashfs)
Isn't the compression block-based? I think it would be interesting to measure the image size with the file removed.
I'll try it tomorrow, it's not too hard.
For the non-live installer, we can *significantly* cut down its size, without degrading localization of the installer itself.
- /usr/lib64/libLLVM-15.so, which is 114M on its own, compresses to 23M. We are, I think, basically stuck with this for mesa-dri-drivers , but does it have to be so *big*?
It has all the targets in it. As it's for JIT, we'd only need one target.
That sounds interesting, though of course the details of how to implement it could be a bit tricky, I guess...
Thanks for the ideas!
* Adam Williamson:
On Thu, 2022-12-08 at 07:57 +0100, Florian Weimer wrote:
- Adam Williamson:
- /usr/lib/locale/locale-archive , from glibc-all-langpacks - this is 224M uncompressed. A quick test just compressing the file with xz on my system shows it compresses to around 11M, though, so that's probably all it adds up to after compression (the image is an xz-compressed squashfs)
Isn't the compression block-based? I think it would be interesting to measure the image size with the file removed.
I'll try it tomorrow, it's not too hard.
Have you posted the outcome of the experiment somewhere?
Thanks, Florian
On Thu, Dec 8, 2022 at 12:42 AM Adam Williamson adamwill@fedoraproject.org wrote:
I've moaned about the sheer amount and size of firmware blobs in other forums before, but 214M compressed is *really* obnoxious. We must be able to do something to clean this up (further than it's already cleaned up - this is *after* we dropped low-hanging fruit like enterprise switch 'firmwares' and garbage like that; most of the remaining size seems to be huge amounts of probably-very-similar firmware files for AMD graphics adapters and Intel wireless adapters). I know some folks were trying to work on this (there was talk that we could drop quite a lot of files that would only be loaded by older kernels no longer in Fedora); any news on how far along that effort is?
I've done a few passes, dropping a bunch of older firmware upstream that is no longer supported in any stable kernel release, plus a bunch of de-duplication and linking of files rather than shipping multiple copies of the same firmware. It's improved things a bit; unfortunately, a lot of the dead firmware was tiny compared to that of, say, average modern devices like GPUs or Wi-Fi.
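For anyone wanting to check how much duplication is left in a firmware tree, a minimal sketch along these lines would list byte-identical files that are not already hard-linked. It is not the tooling used for the upstream cleanup described here; `find_duplicates` and `wasted_bytes` are hypothetical names.

```python
import hashlib
import os
import sys
from collections import defaultdict


def find_duplicates(root):
    """Return groups (sorted path lists) of byte-identical regular files."""
    by_digest = defaultdict(list)
    seen_inodes = set()
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(dirpath, name)
            if os.path.islink(path) or not os.path.isfile(path):
                continue  # symlinks already share storage with their target
            st = os.stat(path)
            if (st.st_dev, st.st_ino) in seen_inodes:
                continue  # hard link to a file we already counted
            seen_inodes.add((st.st_dev, st.st_ino))
            digest = hashlib.sha256()
            with open(path, "rb") as f:
                for chunk in iter(lambda: f.read(1 << 20), b""):
                    digest.update(chunk)
            by_digest[digest.hexdigest()].append(path)
    return sorted(sorted(p) for p in by_digest.values() if len(p) > 1)


def wasted_bytes(groups):
    """Bytes that linking every group down to one copy would save."""
    return sum(os.path.getsize(g[0]) * (len(g) - 1) for g in groups)


if __name__ == "__main__" and len(sys.argv) == 2:
    groups = find_duplicates(sys.argv[1])
    for group in groups:
        print(" == ".join(group))
    print(f"potential saving: {wasted_bytes(groups) / 2**20:.1f}M uncompressed")
```

The saving is reported uncompressed; the effect on the compressed image is likely smaller.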
The problem with a lot of the firmware is that it's needed to bring up graphics, network, etc. to even just install, so I don't know how we can get around this and still have a device work well enough to install the needed firmware across the network. This includes the new nvidia "open driver", which shoves a lot of stuff into firmware in order to have an upstreamable driver; apparently the firmwares there are going to be 30+ MB each.
Ideas on how to solve that problem welcome.
Peter
On Thu, 8 Dec 2022 at 08:15, Peter Robinson pbrobinson@gmail.com wrote:
I've done a few passes, dropping a bunch of older firmware upstream that is no longer supported in any stable kernel release, plus a bunch of de-duplication and linking of files rather than shipping multiple copies of the same firmware. It's improved things a bit; unfortunately, a lot of the dead firmware was tiny compared to that of, say, average modern devices like GPUs or Wi-Fi.
The problem with a lot of the firmware is that it's needed to bring up graphics, network, etc. to even just install, so I don't know how we can get around this and still have a device work well enough to install the needed firmware across the network. This includes the new nvidia "open driver", which shoves a lot of stuff into firmware in order to have an upstreamable driver; apparently the firmwares there are going to be 30+ MB each.
Ideas on how to solve that problem welcome.
The only idea I have seen which 'works'* is to ship a minimal set of drivers for some 'chosen' hardware in one image, alongside a bloated kitchen-sink ISO which has all the drivers in it. The chosen hardware could be a 'defined' virtual environment which needs a minimal set of firmware, languages, etc. Everything else can be done in the install for getting languages, extra firmware, etc. The kitchen-sink.iso is going to be one which we know is going to be large.
Now I have doubled the QA, releng, and product work. I would say we would focus most of the work on the mini-installer, but we all know that all the work will be in the kitchen-sink one.
* for some small definition of 'work'
On Thu, 2022-12-08 at 08:26 -0500, Stephen Smoogen wrote:
The only idea I have seen which 'works'* is to ship a minimal set of drivers for some 'chosen' hardware in one image, alongside a bloated kitchen-sink ISO which has all the drivers in it. The chosen hardware could be a 'defined' virtual environment which needs a minimal set of firmware, languages, etc. Everything else can be done in the install for getting languages, extra firmware, etc. The kitchen-sink.iso is going to be one which we know is going to be large.
Now I have doubled the QA, releng, and product work. I would say we would focus most of the work on the mini-installer, but we all know that all the work will be in the kitchen-sink one.
Well, if the two images are just "with firmware" and "without firmware", I suppose in a way the testing load shouldn't be too awful, because we can be pretty confident of the possible behaviours - nothing except the kernel should do anything with kernel firmware files, after all, so you're kinda limited to "it works fine", "some hardware doesn't work because the firmware isn't there", and the occasional "kernel blew up because firmware was missing, which it really shouldn't do", which is bad but at least easy to spot and work around ("you gotta boot with the firmware, sorry"). We could probably rely mostly on automated testing to confirm that both images at least work properly in expected cases. Also it's generally not too onerous for us to test "some random alternate path through the installer" because we have to run several install tests on bare metal anyway, so we can just kinda add it to the 'matrix' there - "OK, for the firmware RAID install test I'll try booting the no-firmware route", that knocks out two tests in one. (Of course you're assuming that firmware RAID handling isn't somehow broken when booting with the firmware, but that seems sufficiently unlikely not to worry about :>)
If we want to get fancy, I suppose we could ship a single ISO, but with two filesystem images - one the main installer environment, one containing /lib/firmware - and just have a boot arg that (tells dracut to) mounts the firmware one, and a boot menu option to *not* do it (not pass the boot arg), which saves memory as long as the system doesn't need the firmwares. But then of course someone will say "hey, why don't we build a second ISO without the firmware image included at all, so we have a small ISO for people who know they don't need it?" and you're back at option 1. We also I guess have to think about how things work for things like PXE installs, and maybe update the documentation there...
On Thu, 2022-12-08 at 12:58 +0000, Peter Robinson wrote:
I've done a few passes, dropping a bunch of older firmware upstream that is no longer supported in any stable kernel release, plus a bunch of de-duplication and linking of files rather than shipping multiple copies of the same firmware. It's improved things a bit; unfortunately, a lot of the dead firmware was tiny compared to that of, say, average modern devices like GPUs or Wi-Fi.
The problem with a lot of the firmware is that it's needed to bring up graphics, network, etc. to even just install, so I don't know how we can get around this and still have a device work well enough to install the needed firmware across the network. This includes the new nvidia "open driver", which shoves a lot of stuff into firmware in order to have an upstreamable driver; apparently the firmwares there are going to be 30+ MB each.
Ideas on how to solve that problem welcome.
Sorry if this is way off, but - do we need the GPU firmwares to run a graphical install on the fallback path, just using the framebuffer set up by the firmware? How crazy would it be to just do that - ship the installer env with no GPU firmware?
On Thursday, December 8, 2022, Adam Williamson adamwill@fedoraproject.org wrote:
Sorry if this is way off, but - do we need the GPU firmwares to run a graphical install on the fallback path, just using the framebuffer set up by the firmware? How crazy would it be to just do that - ship the installer env with no GPU firmware?
That would be very crazy, as you will have a degraded user experience (laggy UI, wrong resolution, ...) to save a couple of megabytes that are a non issue for today's hardware.
Adam Williamson Fedora QA IRC: adamw | Twitter: adamw_ha https://www.happyassassin.net
On Thu, 2022-12-08 at 19:59 +0100, drago01 wrote:
That would be very crazy, as you will have a degraded user experience (laggy UI, wrong resolution, ...) to save a couple of megabytes that are a non issue for today's hardware.
I mean, the modern systems that *need* GPU firmware generally seem to do pretty well with using native resolution and don't perform too badly, especially in the simple installer UI. When I test the fallback path on my bare metal test box on UEFI it uses the monitor's native resolution and performs fine (even, honestly, in GNOME), and that motherboard is nearly a decade old even. Don't know if this is the same for everyone, of course.
On Thu, Dec 08, 2022 at 07:59:20PM +0100, drago01 wrote:
That would be very crazy, as you will have a degraded user experience (laggy UI, wrong resolution, ...) to save a couple of megabytes that are a non issue for today's hardware.
Please bear in mind the difference between bare metal and virtual machines. The bare metal machine may have 32 GB of RAM, making an 800 MB install image a non-issue. For a public cloud virtual machine, though, this could bump your VM sizing up one level, from a 2 GB RAM quota to a 4 GB RAM quota, with a correspondingly higher price point. Now most people probably don't run the installer in a public cloud, preferring pre-built disk images. Even on a local machine, though, you may be using most of your 32 GB of RAM for other things (well, firefox/chrome), so allowing extra for the VM is not without resource cost. If we could figure out a way to knock a few hundred MB off the installer RAM requirements, that would be valuable.
With regards, Daniel
On Thursday, December 8, 2022, Daniel P. Berrangé berrange@redhat.com wrote:
Please bear in mind the difference between bare metal and virtual machines. The bare metal machine may have 32 GB of RAM, making an 800 MB install image a non-issue. For a public cloud virtual machine, though, this could bump your VM sizing up one level, from a 2 GB RAM quota to a 4 GB RAM quota, with a correspondingly higher price point. Now most people probably don't run the installer in a public cloud, preferring pre-built disk images. Even on a local machine, though, you may be using most of your 32 GB of RAM for other things (well, firefox/chrome), so allowing extra for the VM is not without resource cost. If we could figure out a way to knock a few hundred MB off the installer RAM requirements, that would be valuable.
The problem I see here is not the presence of the firmware on the image, but the fact that it seems to be loaded into memory despite not being used.
On Thu, 2022-12-08 at 20:23 +0100, drago01 wrote:
The problem I see here is not the presence of the firmware on the image, but the fact that it seems to be loaded into memory despite not being used.
This is the direction Daniel was thinking down. I'm waiting for someone with more expertise to reply, but I suspect the reply is going to be along the lines of "yes, we *can* do that, but it's somewhat tricky work that involves thinking about lots of paths that aren't obvious, and somebody would need to dedicate their time to working on that".
Once upon a time, Adam Williamson adamwill@fedoraproject.org said:
This is the direction Daniel was thinking down. I'm waiting for someone with more expertise to reply, but I suspect the reply is going to be along the lines of "yes, we *can* do that, but it's somewhat tricky work that involves thinking about lots of paths that aren't obvious, and somebody would need to dedicate their time to working on that".
One other thing that I noticed a while back that takes up a chunk of space is the kernel... it's included inside install.img (in two places even, although I assume it's hardlinked?), even though it has to have already been loaded before install.img can be read.
On Thu, Dec 08, 2022 at 02:17:22PM -0600, Chris Adams wrote:
One other thing that I noticed a while back that takes up a chunk of space is the kernel... it's included inside install.img (in two places even, although I assume it's hardlinked?), even though it has to have already been loaded before install.img can be read.
What two places? On rawhide we have one on the iso under /images/pxeboot/ and another inside the install.img under /usr/lib/modules...
I think I tried removing the kernel from the install.img at one point, but it ended up being required for FIPS (see https://github.com/weldr/lorax/issues/1021). And when there were 2 kernels on the iso they were hardlinked (one under pxeboot and the other under isolinux).
Brian
Once upon a time, Brian C. Lane bcl@redhat.com said:
What two places? On rawhide we have one on the iso under /images/pxeboot/ and another inside the install.img under /usr/lib/modules...
There used to be two on the ISO with syslinux (but they were effectively hardlinked, so that didn't matter), didn't realize that'd been reduced.
There's also two inside install.img, /boot/vmlinux-<version> and /usr/lib/modules/<version>/vmlinuz. This is what I'm not sure if it's linked, or if the squashfs compression makes it an effective wash, or what; extracting the squashfs does not result in hardlinked files.
I know that /boot on an installed system is typically separate, so hard links don't work when installing the kernel, but separate-/boot is not always the case. It'd be nice if the kernel install scripts tried to hard link where possible.
I think I tried removing the kernel from the install.img at one point, but it ended up being required for FIPS (see https://github.com/weldr/lorax/issues/1021).
Ahh. I think I brought it up (on fedora-devel or maybe anaconda-devel) too, but forgot (and forgot the reason why). I agree that it seems silly that looking at a kernel file that was not used to boot is considered acceptable.
On Fri, Dec 09, 2022 at 02:38:48PM -0600, Chris Adams wrote:
Once upon a time, Brian C. Lane bcl@redhat.com said:
On Thu, Dec 08, 2022 at 02:17:22PM -0600, Chris Adams wrote:
One other thing that I noticed a while back that takes up a chunk of space is the kernel... it's included inside install.img (in two places even, although I assume it's hardlinked?), even though it has to have already been loaded before install.img can be read.
What two places? On rawhide we have one on the iso under /images/pxeboot/ and another inside the install.img under /usr/lib/modules...
There used to be two on the ISO with syslinux (but they were effectively hardlinked, so that didn't matter), didn't realize that'd been reduced.
There's also two inside install.img, /boot/vmlinux-<version> and /usr/lib/modules/<version>/vmlinuz. This is what I'm not sure if it's linked, or if the squashfs compression makes it an effective wash, or what; extracting the squashfs does not result in hardlinked files.
I checked, it's not hardlinked. I'd hope that squashfs does the right thing and takes advantage of them being the same.
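For anyone who wants to repeat the check on an extracted image, here is a minimal sketch: compare device and inode numbers to detect a hardlink, then compare content hashes to see whether the two copies are at least byte-identical. The helper name `compare_files` is made up for illustration, and it only uses the Python standard library.

```python
import hashlib
import os


def compare_files(path_a, path_b, chunk_size=1 << 20):
    """Report whether two files are hardlinked and whether their bytes match."""
    stat_a, stat_b = os.stat(path_a), os.stat(path_b)
    # Hardlinks share both the device and the inode number.
    hardlinked = (stat_a.st_dev, stat_a.st_ino) == (stat_b.st_dev, stat_b.st_ino)

    def digest(path):
        # Hash in chunks so a large kernel image is never held in memory whole.
        h = hashlib.sha256()
        with open(path, "rb") as f:
            for chunk in iter(lambda: f.read(chunk_size), b""):
                h.update(chunk)
        return h.hexdigest()

    identical = hardlinked or digest(path_a) == digest(path_b)
    return {"hardlinked": hardlinked, "identical": identical}
```

Run against the two kernel paths in an unpacked install.img, "identical but not hardlinked" would be the case where duplicate detection in the squashfs tooling (which, as far as I know, mksquashfs does by default) decides whether the second copy costs anything.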
Brian
On Thu, Dec 08, 2022 at 11:49:16AM -0800, Adam Williamson wrote:
On Thu, 2022-12-08 at 20:23 +0100, drago01 wrote:
The problem I see here is not the presence of the firmware on the image, but the fact that it seems to be loaded into memory despite not being used.
This is the direction Daniel was thinking down. I'm waiting for someone with more expertise to reply, but I suspect the reply is going to be along the lines of "yes, we *can* do that, but it's somewhat tricky work that involves thinking about lots of paths that aren't obvious, and somebody would need to dedicate their time to working on that".
Split install.img into install.img + firmware.img? I think we already have support for multiple images (I see requests for updates.img when watching httpd logs while doing network installs), so the split should be easy. The somewhat more tricky part is probably to figure out whether we need the firmware or not.
take care, Gerd
Hi,
On Thu, Dec 8, 2022 at 2:55 PM Adam Williamson adamwill@fedoraproject.org wrote:
This is the direction Daniel was thinking down. I'm waiting for someone with more expertise to reply, but I suspect the reply is going to be along the lines of "yes, we *can* do that, but it's somewhat tricky work that involves thinking about lots of paths that aren't obvious, and somebody would need to dedicate their time to working on that".
Presumably we could package the firmware separately and just unpack it into place from a udev rule when the hardware is detected?
But first, do we actually know this is a problem? I think you're saying squashfs loads the whole decompressed image into memory, but my expectation prior to your mail was that it performs I/O on the USB stick (with a cache in between). If my intuition was right and files only hit RAM when accessed, then it seems like this is pretty much not an issue, right?
Do you have stats on memory usage when running in a live environment?
Once upon a time, Daniel P. Berrangé berrange@redhat.com said:
On Thu, Dec 08, 2022 at 07:59:20PM +0100, drago01 wrote:
That would be very crazy, as you will have a degraded user experience (laggy UI, wrong resolution, ...) to save a couple of megabytes that are a non issue for today's hardware.
Please bear in mind the difference between bare metal and virtual machines. The bare metal machine may have 32 GB of RAM, making an 800 MB install image a non-issue. For a public cloud virtual machine though, this could bump your VM sizing up one level, from a 2 GB RAM quota to a 4 GB RAM quota, with a correspondingly higher price point.
Also "today's hardware" increasingly includes small devices like Raspberry Pi. ARM devices don't typically use anaconda, but there are also small x86 based devices competing with the small ARM devices.
I think the answer is "no", but I'll ask anyway: is there a way to evict all the firmware once the system is started? I'm guessing that as long as it's all in one disk image, that's not possible. Can we tack on a second disk image with use-once (at most) stuff and then drop the whole image after startup? That wouldn't help the initial RAM usage, but it would free up a chunk (so that dnf can then use it).
That wouldn't help with the other large space users, but it would probably make the firmware much less of an issue.
On Thursday, December 8, 2022, Chris Adams linux@cmadams.net wrote:
Once upon a time, Daniel P. Berrangé berrange@redhat.com said:
On Thu, Dec 08, 2022 at 07:59:20PM +0100, drago01 wrote:
That would be very crazy, as you will have a degraded user experience (laggy UI, wrong resolution, ...) to save a couple of megabytes that are a non issue for today's hardware.
Please bear in mind the difference between bare metal and virtual machines. The bare metal machine may have 32 GB of RAM, making an 800 MB install image a non-issue. For a public cloud virtual machine though, this could bump your VM sizing up one level, from a 2 GB RAM quota to a 4 GB RAM quota, with a correspondingly higher price point.
Also "today's hardware" increasingly includes small devices like Raspberry Pi. ARM devices don't typically use anaconda, but there are also small x86 based devices competing with the small ARM devices.
I think the answer is "no", but I'll ask anyway: is there a way to evict all the firmware once the system is started? I'm guessing that as long as it's all in one disk image, that's not possible. Can we tack on a second disk image with use-once (at most) stuff and then drop the whole image after startup?
Again there is no reason why everything on the disk image had to be loaded into memory in the first place. Same way when you boot your installed system, not everything on disk is loaded into memory. If you don't need the firmware, it should stay on the install media and never be loaded into memory.
Once upon a time, drago01 drago01@gmail.com said:
Again there is no reason why everything on the disk image had to be loaded into memory in the first place. Same way when you boot your installed system, not everything on disk is loaded into memory. If you don't need the firmware, it should stay on the install media and never be loaded into memory.
That only works for cases where there is local install media. Network installs require downloading an image and running it from RAM.
On Thu, Dec 08, 2022 at 02:12:22PM -0600, Chris Adams wrote:
Once upon a time, drago01 drago01@gmail.com said:
Again there is no reason why everything on the disk image had to be loaded into memory in the first place. Same way when you boot your installed system, not everything on disk is loaded into memory. If you don't need the firmware, it should stay on the install media and never be loaded into memory.
That only works for cases where there is local install media. Network installs require downloading an image and running it from RAM.
That's not really true as long as the web server supports random access and/or you use NBD or NFS root.
Rich.
On Thu, 2022-12-08 at 20:43 +0100, drago01 wrote:
On Thursday, December 8, 2022, Chris Adams linux@cmadams.net wrote:
Once upon a time, Daniel P. Berrangé berrange@redhat.com said:
On Thu, Dec 08, 2022 at 07:59:20PM +0100, drago01 wrote:
That would be very crazy, as you will have a degraded user experience (laggy UI, wrong resolution, ...) to save a couple of megabytes that are a non issue for today's hardware.
Please bear in mind the difference between bare metal and virtual machines. The bare metal machine may have 32 GB of RAM, making an 800 MB install image a non-issue. For a public cloud virtual machine though, this could bump your VM sizing up one level, from a 2 GB RAM quota to a 4 GB RAM quota, with a correspondingly higher price point.
Also "today's hardware" increasingly includes small devices like Raspberry Pi. ARM devices don't typically use anaconda, but there are also small x86 based devices competing with the small ARM devices.
I think the answer is "no", but I'll ask anyway: is there a way to evict all the firmware once the system is started? I'm guessing that as long as it's all in one disk image, that's not possible. Can we tack on a second disk image with use-once (at most) stuff and then drop the whole image after startup?
Again there is no reason why everything on the disk image had to be loaded into memory in the first place. Same way when you boot your installed system, not everything on disk is loaded into memory. If you don't need the firmware, it should stay on the install media and never be loaded into memory.
The problem is, what is "the install media"? We don't *only* support installs from USB sticks and DVDs - things the installer could potentially access as local storage after starting up. We also do installs where everything is retrieved over the network - PXE installs, for instance.
There are possible ways to finesse things even in those cases - as I said, Daniel started thinking them through a bit - but it's not as simple as just "put this stuff on the ISO and read it off that".
On Thu, Dec 08, 2022 at 12:54:16PM -0800, Adam Williamson wrote:
On Thu, 2022-12-08 at 20:43 +0100, drago01 wrote:
On Thursday, December 8, 2022, Chris Adams linux@cmadams.net wrote:
Once upon a time, Daniel P. Berrangé berrange@redhat.com said:
On Thu, Dec 08, 2022 at 07:59:20PM +0100, drago01 wrote:
That would be very crazy, as you will have a degraded user experience (laggy UI, wrong resolution, ...) to save a couple of megabytes that are a non issue for today's hardware.
Please bear in mind the difference between bare metal and virtual machines. The bare metal machine may have 32 GB of RAM, making an 800 MB install image a non-issue. For a public cloud virtual machine though, this could bump your VM sizing up one level, from a 2 GB RAM quota to a 4 GB RAM quota, with a correspondingly higher price point.
Also "today's hardware" increasingly includes small devices like Raspberry Pi. ARM devices don't typically use anaconda, but there are also small x86 based devices competing with the small ARM devices.
I think the answer is "no", but I'll ask anyway: is there a way to evict all the firmware once the system is started? I'm guessing that as long as it's all in one disk image, that's not possible. Can we tack on a second disk image with use-once (at most) stuff and then drop the whole image after startup?
Again there is no reason why everything on the disk image had to be loaded into memory in the first place. Same way when you boot your installed system, not everything on disk is loaded into memory. If you don't need the firmware, it should stay on the install media and never be loaded into memory.
The problem is, what is "the install media"? We don't *only* support installs from USB sticks and DVDs - things the installer could potentially access as local storage after starting up. We also do installs where everything is retrieved over the network - PXE installs, for instance.
There are possible ways to finesse things even in those cases - as I said, Daniel started thinking them through a bit - but it's not as simple as just "put this stuff on the ISO and read it off that".
It could potentially be almost that simple actually
    qemu-nbd -c /dev/nbd0 https://some.server/path/to/second.iso
    mount /dev/nbd0 /mnt/second-iso
This uses QEMU's curl driver, which will fetch blocks of the ISO content only as they are accessed, so you're not pulling down the whole ISO if you only read 2 files from it.
The 'nbdkit' program can be used instead of qemu-nbd, and probably a better choice since it can layer into all sorts of interesting functionality that QEMU's curl layer can't offer.
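The "fetch blocks only as they are accessed" part depends on the web server honouring HTTP Range requests. A minimal sketch of a single block fetch follows. It is not how QEMU's curl driver or nbdkit is implemented, just the same idea in a few lines; the function names are made up for illustration and any URL passed in is the caller's own.

```python
import urllib.request


def build_range_request(url, offset, length):
    """Build an HTTP request for `length` bytes starting at byte `offset`."""
    if length <= 0:
        raise ValueError("length must be positive")
    last = offset + length - 1  # Range end positions are inclusive
    return urllib.request.Request(url, headers={"Range": f"bytes={offset}-{last}"})


def fetch_block(url, offset, length):
    """Fetch one block; a server that honours ranges answers 206 Partial Content."""
    with urllib.request.urlopen(build_range_request(url, offset, length)) as resp:
        if resp.status != 206:
            raise RuntimeError("server ignored the Range header")
        return resp.read()
```

A server that answers 200 instead of 206 is sending the whole file, which is exactly the case where on-demand access stops saving anything.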
With regards, Daniel
On Fri, Dec 09, 2022 at 08:09:42AM +0000, Daniel P. Berrangé wrote:
On Thu, Dec 08, 2022 at 12:54:16PM -0800, Adam Williamson wrote:
On Thu, 2022-12-08 at 20:43 +0100, drago01 wrote:
On Thursday, December 8, 2022, Chris Adams linux@cmadams.net wrote:
Once upon a time, Daniel P. Berrangé berrange@redhat.com said:
On Thu, Dec 08, 2022 at 07:59:20PM +0100, drago01 wrote:
That would be very crazy, as you will have a degraded user experience (laggy UI, wrong resolution, ...) to save a couple of megabytes that are a non issue for today's hardware.
Please bear in mind the difference between bare metal and virtual machines. The bare metal machine may have 32 GB of RAM, making an 800 MB install image a non-issue. For a public cloud virtual machine though, this could bump your VM sizing up one level, from a 2 GB RAM quota to a 4 GB RAM quota, with a correspondingly higher price point.
Also "today's hardware" increasingly includes small devices like Raspberry Pi. ARM devices don't typically use anaconda, but there are also small x86 based devices competing with the small ARM devices.
I think the answer is "no", but I'll ask anyway: is there a way to evict all the firmware once the system is started? I'm guessing that as long as it's all in one disk image, that's not possible. Can we tack on a second disk image with use-once (at most) stuff and then drop the whole image after startup?
Again there is no reason why everything on the disk image had to be loaded into memory in the first place. Same way when you boot your installed system, not everything on disk is loaded into memory. If you don't need the firmware, it should stay on the install media and never be loaded into memory.
The problem is, what is "the install media"? We don't *only* support installs from USB sticks and DVDs - things the installer could potentially access as local storage after starting up. We also do installs where everything is retrieved over the network - PXE installs, for instance.
There are possible ways to finesse things even in those cases - as I said, Daniel started thinking them through a bit - but it's not as simple as just "put this stuff on the ISO and read it off that".
It could potentially be almost that simple actually
    qemu-nbd -c /dev/nbd0 https://some.server/path/to/second.iso
    mount /dev/nbd0 /mnt/second-iso
This uses QEMU's curl driver, which will fetch blocks of the ISO content only as they are accessed, so you're not pulling down the whole ISO if you only read 2 files from it.
The 'nbdkit' program can be used instead of qemu-nbd, and probably a better choice since it can layer into all sorts of interesting functionality that QEMU's curl layer can't offer.
This, but using kernel nbd root instead of a qemu nbd file:
https://rwmj.wordpress.com/2019/02/19/nbdkit-linuxdisk-plugin/
Rich.
On Thu, Dec 8, 2022 at 4:56 PM Adam Williamson adamwill@fedoraproject.org wrote:
On Thu, 2022-12-08 at 12:58 +0000, Peter Robinson wrote:
I've done a few passes, dropping a bunch of older firmware upstream that is no longer supported in any stable kernel release, plus a bunch of de-duping and linking of files rather than shipping multiple copies of the same firmware. It's improved things a bit; unfortunately a lot of the dead firmware was tiny compared to, say, average modern devices like GPUs or WiFi.
The problem with a lot of the firmware is that it's needed to bring up graphics/network etc. to even just install, so I don't know how we can get around this and still have a device work enough to be able to install the needed firmware across the network. The same goes for the new nvidia "open driver", which shoves a lot of stuff into firmware in order to have an upstreamable driver; apparently the firmwares there are going to be 30+ MB each.
Ideas on how to solve that problem welcome.
Sorry if this is way off, but - do we need the GPU firmwares to run a graphical install on the fallback path, just using the framebuffer set up by the firmware? How crazy would it be to just do that - ship the installer env with no GPU firmware?
That has crossed my mind, and with simpledrm that may be more straightforward now. TBH it's not something I am skilled enough to deal with, have the resources to test, or actually care enough about, but the big GPU firmwares are now all split out, so it should be much easier for someone with the resources to investigate.
On Fri, 2022-12-09 at 11:12 +0000, Peter Robinson wrote:
On Thu, Dec 8, 2022 at 4:56 PM Adam Williamson adamwill@fedoraproject.org wrote:
On Thu, 2022-12-08 at 12:58 +0000, Peter Robinson wrote:
I've done a few passes, dropping a bunch of older firmware upstream that is no longer supported in any stable kernel release, plus a bunch of de-duping and linking of files rather than shipping multiple copies of the same firmware. It's improved things a bit; unfortunately a lot of the dead firmware was tiny compared to, say, average modern devices like GPUs or WiFi.
The problem with a lot of the firmware is that it's needed to bring up graphics/network etc. to even just install, so I don't know how we can get around this and still have a device work enough to be able to install the needed firmware across the network. The same goes for the new nvidia "open driver", which shoves a lot of stuff into firmware in order to have an upstreamable driver; apparently the firmwares there are going to be 30+ MB each.
Ideas on how to solve that problem welcome.
Sorry if this is way off, but - do we need the GPU firmwares to run a graphical install on the fallback path, just using the framebuffer set up by the firmware? How crazy would it be to just do that - ship the installer env with no GPU firmware?
That has crossed my mind, and with simpledrm that may be more straightforward now. TBH it's not something I am skilled enough to deal with, have the resources to test, or actually care enough about, but the big GPU firmwares are now all split out, so it should be much easier for someone with the resources to investigate.
Heck, if people want to try it out, we can. I can re-run the openQA test (the old one's assets will have been garbage collected by now), pull the ISO out and upload it somewhere, and everyone can see how it behaves on their system. Maybe I'll do that later...
Hi,
On 12/8/22 06:58, Peter Robinson wrote:
On Thu, Dec 8, 2022 at 12:42 AM Adam Williamson adamwill@fedoraproject.org wrote:
Hi folks! Today I woke up and found https://bugzilla.redhat.com/show_bug.cgi?id=2151495 , which diverted me down a bit of an "installer environment size" rabbit hole.
As of today, with that new dep in webkitgtk, Rawhide's network install images are 703M in size. Here's a potted history of network install image sizes:
Fedora Core 8: 103.2M (boot.iso 9.2M + stage2.img 94M) Fedora 13: 208M Fedora 17: 162M (last "old UI") Fedora 18: 294M (first "new UI") Fedora 23: 415M Fedora 28: 583M Fedora 33: 686M Fedora 37: 665M Fedora Rawhide: 703M
The installer does not really do much more in Rawhide than it did in FC8. Even after the UI rewrite in F18, we were only at 294M. Now the image is well over 2x as big and does...basically the same.
Why does this matter? Well, the images being large is moderately annoying in itself just in terms of transfer times and so on. But more importantly, AIUI at least, the entire installer environment is loaded into RAM at startup - it kinda has to be, we don't have anywhere else to put it. The bigger it is, the more RAM you need to install Fedora. The size of the installer environment (for which the size of the network install image is more or less a perfect proxy) is one of the two key factors in this, the other being how much RAM DNF uses during package install.
So, I did a bit of poking about into *what* is taking up all that space. There's a variety of answers, but there's two major culprits:
- firmware
- yelp (which pulls in webkitgtk and its deps)
I've been using du and baobab (the GNOME visual disk usage analyzer, which is great) to examine the filesystems, but I ran a couple of test builds to confirm these suspects, especially after the impact of compression (it's hard to check the *compressed* size of things in the installer environment directly).
I did a scratch build of lorax which does not pull in firmware packages, and had openQA build a netinst using that lorax. It came out at 489M - 214M smaller than current netinsts, a size we last managed in Fedora 26. I did a scratch build of anaconda with its requirement of yelp dropped (which would break help pages), and built a netinst with that; it came out at 662M - 41M smaller than current images. I haven't run a combined test yet, but it ought to come out around 448M, around the size of Fedora 24.
Even then we'd still be about 50% larger than the Fedora 18 image, for not really any added functionality.
I've moaned about the sheer amount and size of firmware blobs in other forums before, but 214M compressed is *really* obnoxious. We must be able to do something to clean this up (further than it's already cleaned up - this is *after* we dropped low-hanging fruit like enterprise switch 'firmwares' and garbage like that; most of the remaining size seems to be huge amounts of probably-very-similar firmware files for AMD graphics adapters and Intel wireless adapters). I know some folks were trying to work on this (there was talk that we could drop quite a lot of files that would only be loaded by older kernels no longer in Fedora); any news on how far along that effort is?
I've done a few passes, dropping a bunch of older firmware upstream that is no longer supported in any stable kernel release, plus a bunch of de-duping and linking of files rather than shipping multiple copies of the same firmware. It's improved things a bit; unfortunately a lot of the dead firmware was tiny compared to, say, average modern devices like GPUs or WiFi.
The problem with a lot of the firmware is that it's needed to bring up graphics/network etc. to even just install, so I don't know how we can get around this and still have a device work enough to be able to install the needed firmware across the network. The same goes for the new nvidia "open driver", which shoves a lot of stuff into firmware in order to have an upstreamable driver; apparently the firmwares there are going to be 30+ MB each.
Ideas on how to solve that problem welcome.
Ok, I have a couple ideas, but they start with the question, why do we need fully accelerated graphics for an installer (live image excepted) that works nearly as well in text mode? That gets the GPU firmware off the install ramdisk.
Just being a bit more fine grained with the firmware package and only installing the pieces needed by the running machine could shrink the footprint too. Something that looks for kernel firmware load errors and installs a package would solve the issue of HW that has been dynamically added after the fact (of course disk/network card firmware would still be needed by the installer).
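As a rough sketch of the "look for kernel firmware load errors" half of that idea: the kernel logs a "Direct firmware load for <file> failed with error <n>" line when a driver asks for a file that isn't present, so the missing file names can be pulled out of the log text. The function name is made up for illustration, and mapping a file name to the package that ships it is deliberately left out.

```python
import re

# Matches the kernel's message for a firmware file it could not find.
FIRMWARE_FAIL = re.compile(
    r"Direct firmware load for (\S+) failed with error (-?\d+)"
)


def missing_firmware(log_text):
    """Return firmware file names the kernel failed to load, in first-seen order."""
    seen = []
    for match in FIRMWARE_FAIL.finditer(log_text):
        name = match.group(1)
        if name not in seen:  # drivers often retry, so drop repeats
            seen.append(name)
    return seen
```

The input would be something like the output of `dmesg` or `journalctl -k`; drivers that probe several firmware versions in turn will show up here even when a later attempt succeeded, so the list is a starting point rather than a definitive answer.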
Although, just doing per-arch firmware shrinks it too. The x86 and arm64 packages are both 177M, and it seems unlikely my arm machine needs amd microcode, or that my amd machine needs the dpaa firmware or firmware specific to some arm SBCs.
So, ideas, but then someone needs to spend the time fixing the problem.
Peter
On Wed, Dec 07, 2022 at 04:42:05PM -0800, Adam Williamson wrote:
Hi folks! Today I woke up and found https://bugzilla.redhat.com/show_bug.cgi?id=2151495 , which diverted me down a bit of an "installer environment size" rabbit hole.
snip
Why does this matter? Well, the images being large is moderately annoying in itself just in terms of transfer times and so on. But more importantly, AIUI at least, the entire installer environment is loaded into RAM at startup - it kinda has to be, we don't have anywhere else to put it. The bigger it is, the more RAM you need to install Fedora. The size of the installer environment (for which the size of the network install image is more or less a perfect proxy) is one of the two key factors in this, the other being how much RAM DNF uses during package install.
Is there something that can be done to optimize the RAM usage, in spite of the large installer env size ?
If we're installing off DVD media, it shouldn't be required to pull all of the content into RAM, since it can be fetched on demand from the media. IOW, 99% of the firmware never need leave the ISO, so shouldn't matter if firmware is GBs in size [1] if we never load it off the media. Same for languages, only the one we actually want to use should ever get into RAM.
If we're installing off a network source, we need to pull content into RAM, but that doesn't mean we should pull everything in at once upfront.
Is it possible to delay pulling in non-NIC firmware until we have a NIC configured, and just rely on the basic generic framebuffer set up by UEFI/BIOS until we get far enough to pull in video card firmware ?
For localization, is it possible to split the localization into per-language bundles, and delay loading off the network until we know what language we want to load, instead of pre-loading all languages ?
With regards, Daniel
[1] Yes, I know it matters for user media download size in reality
On Wed, Dec 07, 2022 at 04:42:05PM -0800, Adam Williamson wrote:
Hi folks! Today I woke up and found https://bugzilla.redhat.com/show_bug.cgi?id=2151495 , which diverted me down a bit of an "installer environment size" rabbit hole.
As of today, with that new dep in webkitgtk, Rawhide's network install images are 703M in size. Here's a potted history of network install image sizes:
Fedora Core 8: 103.2M (boot.iso 9.2M + stage2.img 94M) Fedora 13: 208M Fedora 17: 162M (last "old UI") Fedora 18: 294M (first "new UI") Fedora 23: 415M Fedora 28: 583M Fedora 33: 686M Fedora 37: 665M Fedora Rawhide: 703M
The installer does not really do much more in Rawhide than it did in FC8. Even after the UI rewrite in F18, we were only at 294M. Now the image is well over 2x as big and does...basically the same.
I take issue with this. It is not accurate to say that the installer now does not do much more than it did for Fedora Core 8. There is more to the installer than the UI.
Broadly speaking, a lot of the growth came from converging the runtime environment for the installer with the installed system. In Fedora Core 8 and previous releases, the "installer environment" was a unique and stripped down install. This was frustrating because it was effectively maintaining a small mini distro for the purposes of running the distro installer.
Why does this matter? Well, the images being large is moderately annoying in itself just in terms of transfer times and so on. But more importantly, AIUI at least, the entire installer environment is loaded into RAM at startup - it kinda has to be, we don't have anywhere else to put it. The bigger it is, the more RAM you need to install Fedora. The size of the installer environment (for which the size of the network install image is more or less a perfect proxy) is one of the two key factors in this, the other being how much RAM DNF uses during package install.
So, I did a bit of poking about into *what* is taking up all that space. There's a variety of answers, but there's two major culprits:
- firmware
- yelp (which pulls in webkitgtk and its deps)
I've been using du and baobab (the GNOME visual disk usage analyzer, which is great) to examine the filesystems, but I ran a couple of test builds to confirm these suspects, especially after the impact of compression (it's hard to check the *compressed* size of things in the installer environment directly).
I did a scratch build of lorax which does not pull in firmware packages, and had openQA build a netinst using that lorax. It came out at 489M - 214M smaller than current netinsts, a size we last managed in Fedora 26. I did a scratch build of anaconda with its requirement of yelp dropped (which would break help pages), and built a netinst with that; it came out at 662M - 41M smaller than current images. I haven't run a combined test yet, but it ought to come out around 448M, around the size of Fedora 24.
Even then we'd still be about 50% larger than the Fedora 18 image, for not really any added functionality.
I've moaned about the sheer amount and size of firmware blobs in other forums before, but 214M compressed is *really* obnoxious. We must be able to do something to clean this up (further than it's already cleaned up - this is *after* we dropped low-hanging fruit like enterprise switch 'firmwares' and garbage like that; most of the remaining size seems to be huge amounts of probably-very-similar firmware files for AMD graphics adapters and Intel wireless adapters). I know some folks were trying to work on this (there was talk that we could drop quite a lot of files that would only be loaded by older kernels no longer in Fedora); any news on how far along that effort is?
I think some curation on firmware could happen. I think probinson@ mentions it later in this thread. If there were a way to identify the firmware necessary for the installer environment that could probably simplify things.
Other obvious things that take up a lot of space:
- /usr/lib/locale/locale-archive, from glibc-all-langpacks - this is 224M uncompressed. A quick test just compressing the file with xz on my system shows it compresses to around 11M, though, so that's probably all it adds up to after compression (the image is an xz-compressed squashfs)
Can this be installed compressed? I'm not sure it can.
I guess a more important question is whether or not this file is used at install time. That I do not know.
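The "quick test just compressing the file with xz" quoted above is easy to repeat for any large file in the installer environment. A minimal sketch using Python's standard lzma module follows; the function name is made up for illustration. Note that squashfs compresses in fixed-size blocks rather than one whole-file stream, so this figure is only an approximation of what the file costs inside the image.

```python
import lzma


def xz_compressed_size(path, chunk_size=1 << 20):
    """Return (original_bytes, compressed_bytes), streaming the file through xz."""
    compressor = lzma.LZMACompressor(format=lzma.FORMAT_XZ)
    original = compressed = 0
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            original += len(chunk)
            compressed += len(compressor.compress(chunk))
    # flush() emits whatever the compressor was still buffering plus the footer.
    compressed += len(compressor.flush())
    return original, compressed
```

Calling it on /usr/lib/locale/locale-archive from an unpacked image gives the two numbers being compared in the quoted text.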
- /usr/lib64/libLLVM-15.so, which is 114M on its own, compresses to 23M. We are, I think, basically stuck with this for mesa-dri-drivers, but does it have to be so *big*?
If mesa-dri-drivers is not required for installation, it could be removed from the installer environment.
- libicudata.so.71.1 - 30.4M, compresses to 7M. This is in the webkitgtk dep chain but seems to still be pulled in without it, not sure what else is requiring it.
Not sure. On my system I see 175 things in /usr/bin that report libicudata when you ldd the file. Mostly desktop related things. But then there's stuff like zenity which was historically included in the installer environment for people writing interactive %post scripts in kickstart (please don't do this).
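The ldd survey described above can be scripted roughly like this. It is a sketch only: the function names are made up, it shells out to the real `ldd` command, and it should only be pointed at binaries you trust.

```python
import os
import subprocess


def links_against(ldd_output, library):
    """True if any line of ldd output names a shared object starting with `library`."""
    for line in ldd_output.splitlines():
        fields = line.split()
        if fields and os.path.basename(fields[0]).startswith(library):
            return True
    return False


def users_of(directory, library):
    """List files in `directory` whose ldd output mentions `library`."""
    users = []
    for entry in sorted(os.scandir(directory), key=lambda e: e.name):
        if not entry.is_file():
            continue
        result = subprocess.run(["ldd", entry.path], capture_output=True, text=True)
        # ldd exits non-zero for scripts and static binaries; skip those.
        if result.returncode == 0 and links_against(result.stdout, library):
            users.append(entry.name)
    return users
```

Something like `users_of("/usr/bin", "libicudata")` run inside the unpacked installer environment, rather than on a desktop system, would show what is actually pulling the library into the image.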
- /usr/share/locale - 112M in total (uncompressed, not sure how much compressed) of translated strings from a ton of packages. No idea how many of these are really *needed* in the installer environment. We can maybe come up with a way to have lorax strip some, if we can come up with a viable way to figure out which. Obviously-fairly-large ones are from gnupg2 and libgweather4. I do recall we have some logic somewhere to decide which languages have a certain level of translation in anaconda; perhaps we could only include the strings for these languages?
On that note, /usr/share/doc, /usr/share/man, and /usr/share/info could be removed from the installer image if they are present. That likely won't free a whole lot of space, but it's not nothing.
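For anyone wanting to repeat the ldd check mentioned above for libicudata, here is a rough Python sketch. The function names are made up for illustration, and it assumes `ldd` is on the PATH and that the files being scanned are trusted. Pointing `directory` at an unpacked installer image root rather than the running system is left as an assumption.

```python
import subprocess
from pathlib import Path


def links_against(ldd_output, library):
    """Return True if any line of ldd output names `library`."""
    return any(library in line for line in ldd_output.splitlines())


def users_of(library, directory="/usr/bin"):
    """List file names in `directory` whose ldd output mentions `library`."""
    hits = []
    for path in sorted(Path(directory).iterdir()):
        if not path.is_file():
            continue
        result = subprocess.run(
            ["ldd", str(path)], capture_output=True, text=True, check=False
        )
        # ldd exits non-zero for scripts and static binaries; skip those.
        if result.returncode == 0 and links_against(result.stdout, library):
            hits.append(path.name)
    return hits
```

Calling `users_of("libicudata")` gives the list of binaries; mapping those back to packages (and then to whatever pulls them into the image) is a separate step not shown here.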
Thanks,
On Thu, 2022-12-08 at 12:50 -0500, David Cantrell wrote:
On Wed, Dec 07, 2022 at 04:42:05PM -0800, Adam Williamson wrote:
The installer does not really do much more in Rawhide than it did in FC8. Even after the UI rewrite in F18, we were only at 294M. Now the image is well over 2x as big and does...basically the same.
I take issue with this. It is not accurate to say that the installer now does not do much more than it did for Fedora Core 8. There is more to the installer than the UI.
Broadly speaking, a lot of the growth came from converging the runtime environment for the installer with the installed system. In Fedora Core 8 and previous releases, the "installer environment" was a unique and stripped down install. This was frustrating because it was effectively maintaining a small mini distro for the purposes of running the distro installer.
I meant it doesn't do much more in terms of what it achieves for the user. But we can also just take F18 as the base point, if you like. We're still over 2x as big as that was.
I think some curation on firmware could happen. I think probinson@ mentions it later in this thread. If there were a way to identify the firmware necessary for the installer environment that could probably simplify things.
We already do a lot of "identifying the firmware necessary for the installer environment", see my earlier mail about all the stuff lorax does here. We've done the easy part, unfortunately. The stuff that's left is stuff that is, in some sense, needed - graphics card and wireless adapter firmwares, mainly.
Other obvious things that take up a lot of space:
- /usr/lib/locale/locale-archive, from glibc-all-langpacks - this is 224M uncompressed. A quick test just compressing the file with xz on my system shows it compresses to around 11M, though, so that's probably all it adds up to after compression (the image is an xz-compressed squashfs).
Can this be installed compressed? I'm not sure it can.
The "compressed" size is the effective size we're concerned about, because the installer filesystem image is an xz-compressed squashfs. So 11M is already the "effective weight" of this file, I think - if I ran an image build with it removed, it'd probably be ~11M smaller.
- /usr/lib64/libLLVM-15.so, which is 114M on its own, compresses to 23M. We are, I think, basically stuck with this for mesa-dri-drivers, but does it have to be so *big*?
If mesa-dri-drivers is not required for installation, it could be removed from the installer environment.
It is required - we don't get any graphics without it.
- /usr/share/locale - 112M in total (uncompressed, not sure how much compressed) of translated strings from a ton of packages. No idea how many of these are really *needed* in the installer environment. We can maybe come up with a way to have lorax strip some, if we can come up with a viable way to figure out which. Obviously-fairly-large ones are from gnupg2 and libgweather4. I do recall we have some logic somewhere to decide which languages have a certain level of translation in anaconda; perhaps we could only include the strings for these languages?
On that note, /usr/share/doc, /usr/share/man, and /usr/share/info could be removed from the installer image if they are present. That likely won't free a whole lot of space, but it's not nothing.
All of those are already stripped: https://github.com/weldr/lorax/blob/master/share/templates.d/99-generic/runt...
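The "effective weight" estimates above (224M down to around 11M for locale-archive, 114M down to 23M for libLLVM) come from compressing single files with xz. A minimal Python sketch of the same measurement follows, using the standard `lzma` module; the function name `xz_size` is hypothetical. It is only an approximation of what a file costs in the image: as far as I understand it, squashfs compresses in fixed-size blocks, so the real figure will probably be a bit larger than a whole-file xz run suggests.

```python
import lzma


def xz_size(path, chunk_size=1024 * 1024):
    """Return the number of bytes `path` occupies after xz compression."""
    compressor = lzma.LZMACompressor(format=lzma.FORMAT_XZ)
    total = 0
    with open(path, "rb") as src:
        # Stream the file so large inputs never sit in memory all at once.
        while chunk := src.read(chunk_size):
            total += len(compressor.compress(chunk))
    # flush() emits whatever the compressor was still buffering.
    total += len(compressor.flush())
    return total
```

For example, `xz_size("/usr/lib/locale/locale-archive") / 2**20` gives the compressed size in MiB on a system where that file exists.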
On Wed, Dec 07, 2022 at 04:42:05PM -0800, Adam Williamson wrote:
I've moaned about the sheer amount and size of firmware blobs in other forums before, but 214M compressed is *really* obnoxious. We must be able to do something to clean this up (further than it's already cleaned up - this is *after* we dropped low-hanging fruit like enterprise switch 'firmwares' and garbage like that; most of the remaining size seems to be huge amounts of probably-very-similar firmware files for AMD graphics adapters and Intel wireless adapters). I know some folks were trying to work on this (there was talk that we could drop quite a lot of files that would only be loaded by older kernels no longer in Fedora); any news on how far along that effort is?
You only need network / wifi firmware blobs (although I'm sure they are in themselves large) and then you can fetch anything else needed for the hardware including graphics, right?
Rich.
On Fri, Dec 09, 2022 at 12:04:24PM +0100, Florian Weimer wrote:
- Richard W. M. Jones:
You only need network / wifi firmware blobs (although I'm sure they are in themselves large) and then you can fetch anything else needed for the hardware including graphics, right?
I think you need graphics to set up wifi.
I long for old school text mode installers ... At least you knew that the tab key would always work.
Rich.
On Fri, 2022-12-09 at 11:15 +0000, Richard W.M. Jones wrote:
I long for old school text mode installers ... At least you knew that the tab key would always work.
Well, if you pass inst.text on the boot command line, Anaconda will show you the TUI - which supports a sizeable subset of the GUI functionality. :)
Rich.
-- Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones Read my programming and virtualization blog: http://rwmj.wordpress.com virt-builder quickly builds VMs from scratch http://libguestfs.org/virt-builder.1.html
On Fri, 2022-12-09 at 12:04 +0100, Florian Weimer wrote:
- Richard W. M. Jones:
You only need network / wifi firmware blobs (although I'm sure they are in themselves large) and then you can fetch anything else needed for the hardware including graphics, right?
I think you need graphics to set up wifi.
Yeah, this is an awkward chicken-and-egg problem. Even if we assume you're on a wired network, kernel modules generally - AIUI - try to load the firmware once, on initial module load, and if they can't find it, just give up, right? So we still have an ordering problem: how can we delay the loading of modules that need firmware until the network is up for us to be able to access the firmware files?
Maybe I'm missing something that would help there, but it seems tricky...
Looking at sizes, iwlwifi firmware alone is 75M(!). ath10k is 6.8M, ath11k is 12M, and ath6k is 812K, so that's nearly another 20M. brcm/ is another 6.4M and I *think* that's all wifi. There are a few other minor ones, but that's a little over 100M of just wifi, with Intel by a huge margin the worst offender.
Does anyone know anyone we can talk to at Intel about this? It's pretty obnoxious.
In terms of what the other big space takers are in general:
- amdgpu/ (AMD video cards) is ~20M
- intel/ (mainly Intel bluetooth) is ~15M [0]
- qed/ (some very high-end QLogic network cards) is ~10M [0]
- i915/ (Intel video firmware) is 8.4M
- mediatek/ is 7.7M [1]
- qcom/ is 7.3M
Then it trails off from there. Just the wifi plus those 6 things are around 170M, so the large majority of all the space taken.
[0] No, we can't lose this - people install with Bluetooth mice/keyboards
[1] For a quick win right now possibly we could assume nobody's going to use one of those as the interface for a Fedora install and drop that, not sure if it's a safe assumption
[2] We could possibly lose a bunch of this stuff, I'll look into it
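The per-directory numbers above are the kind of thing du or baobab will show. A rough Python equivalent is sketched below, assuming the firmware tree lives under a directory such as /usr/lib/firmware; all three function names are made up for illustration. It reports sizes as stored on disk, not the compressed contribution to the image. Firmware that sits directly in the top-level directory instead of a subdirectory (which I believe is the case for the iwlwifi files) needs the prefix-based sum.

```python
import os
from pathlib import Path


def tree_size(root):
    """Sum the sizes of regular files under `root`, skipping symlinks."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            if not os.path.islink(full):
                total += os.lstat(full).st_size
    return total


def largest_subdirs(root, count=10):
    """Return (name, bytes) for the `count` largest immediate subdirectories."""
    sizes = [
        (entry.name, tree_size(entry))
        for entry in Path(root).iterdir()
        if entry.is_dir() and not entry.is_symlink()
    ]
    return sorted(sizes, key=lambda item: item[1], reverse=True)[:count]


def prefix_size(root, prefix):
    """Sum top-level regular files in `root` whose names start with `prefix`."""
    return sum(
        entry.lstat().st_size
        for entry in Path(root).iterdir()
        if entry.name.startswith(prefix)
        and entry.is_file()
        and not entry.is_symlink()
    )
```

Something like `largest_subdirs("/usr/lib/firmware", 6)` plus `prefix_size("/usr/lib/firmware", "iwlwifi")` should give a comparable breakdown, though the exact figures will depend on the linux-firmware version installed.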
On Friday, December 9, 2022, Adam Williamson adamwill@fedoraproject.org wrote:
In terms of what the other big space takers are in general:
- amdgpu/ (AMD video cards) is ~20M
- intel/ (mainly Intel bluetooth) is ~15M [0]
- qed/ (some very high-end QLogic network cards) is ~10M [0]
- i915/ (Intel video firmware) is 8.4M
- mediatek/ is 7.7M [1]
- qcom/ is 7.3M
Then it trails off from there. Just the wifi plus those 6 things are around 170M, so the large majority of all the space taken.
[0] No, we can't lose this - people install with Bluetooth mice/keyboards
[1] For a quick win right now possibly we could assume nobody's going to use one of those as the interface for a Fedora install and drop that, not sure if it's a safe assumption
It's not a safe assumption, given that AMD wifi is rebranded mediatek: dropping it would remove wifi for lots of newer AMD laptops.
On Fri, 2022-12-09 at 20:33 +0100, drago01 wrote:
On Friday, December 9, 2022, Adam Williamson adamwill@fedoraproject.org wrote:
In terms of what the other big space takers are in general:
- amdgpu/ (AMD video cards) is ~20M
- intel/ (mainly Intel bluetooth) is ~15M [0]
- qed/ (some very high-end QLogic network cards) is ~10M [0]
- i915/ (Intel video firmware) is 8.4M
- mediatek/ is 7.7M [1]
- qcom/ is 7.3M
Then it trails off from there. Just the wifi plus those 6 things are around 170M, so the large majority of all the space taken.
[0] No, we can't lose this - people install with Bluetooth mice/keyboards
[1] For a quick win right now possibly we could assume nobody's going to use one of those as the interface for a Fedora install and drop that, not sure if it's a safe assumption
It's not a safe assumption, given that AMD wifi is rebranded mediatek: dropping it would remove wifi for lots of newer AMD laptops.
Sorry, I messed up my numbering there. That note was meant for the qed/ directory, not mediatek/ .
I've been working on this this morning. I'm pretty sure we can just drop every file but one in qed/ - it contains a lot of old versions that we don't need to care about any more. We can lose some stuff from mediatek/ - not any of the wifi stuff, but there's some firmware in there for ARM SoCs we do not even build the drivers for. I found a few other little cleanups, too.
I *think* we can fairly safely drop about 31M of iwlwifi firmwares from linux-firmware, I'm testing a PR for that right now. We could potentially drop even more in lorax (since we don't really need to support booting the current installer with an older kernel - that's a constraint on dropping things from the linux-firmware package too soon, as it would be a bit mean to break things for people booting older kernels on installed systems for some reason).
On Fri, 2022-12-09 at 09:48 -0800, Adam Williamson wrote:
Looking at sizes, iwlwifi firmware alone is 75M(!). ath10k is 6.8M, ath11k is 12M, and ath6k is 812K, so that's nearly another 20M. brcm/ is another 6.4M and I *think* that's all wifi. There are a few other minor ones, but that's a little over 100M of just wifi, with Intel by a huge margin the worst offender.
Does anyone know anyone we can talk to at Intel about this? It's pretty obnoxious.
In terms of what the other big space takers are in general:
- amdgpu/ (AMD video cards) is ~20M
- intel/ (mainly Intel bluetooth) is ~15M [0]
- qed/ (some very high-end QLogic network cards) is ~10M [0]
- i915/ (Intel video firmware) is 8.4M
- mediatek/ is 7.7M [1]
- qcom/ is 7.3M
Then it trails off from there. Just the wifi plus those 6 things are around 170M, so the large majority of all the space taken.
[0] No, we can't lose this - people install with Bluetooth mice/keyboards
[1] For a quick win right now possibly we could assume nobody's going to use one of those as the interface for a Fedora install and drop that, not sure if it's a safe assumption
[2] We could possibly lose a bunch of this stuff, I'll look into it
So since this turns out to be less important than I thought (thanks bcl for the correction), I won't poke it much further than I have today. But following up on the above, I've done a couple of PRs, one to strip more stuff in lorax:
https://github.com/weldr/lorax/pull/1291
and one to dump a chunk of older iwlwifi firmwares:
https://src.fedoraproject.org/rpms/linux-firmware/pull-request/9
Those combined would get us some breathing room for a while...
anaconda-devel@lists.fedoraproject.org