Hi,
Here is a little patch series to kick off a discussion on pre-generated initrd images and unified kernels. Let's start with a description of the patches:
Patch #1 adds a dracut config file, targeting virtual machines. Given that most physical machines have either SATA or NVMe disks these days, it probably boots most physical systems too.
Patch #2 adds a sub-package with an initrd image.
Patch #3 adds a sub-package with a unified kernel.
The goal is to move away from initrd images being generated on the installed machine. They are generated while building the kernel package instead. The main motivation for this move is to make the distro more robust and more secure.
When shipping the initrd as an rpm it is possible to check it with the usual tools ('rpm --verify' for example). TPM measurements also become much more useful, because it is possible to pre-calculate the PCR values for a given kernel version.
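To illustrate why a fixed, package-shipped initrd makes PCR values predictable: a TPM PCR is only ever changed by the extend operation, new = H(old || H(data)). The following is a minimal sketch of replaying that for the SHA-256 bank. The event list is a made-up stand-in; real firmware and boot loaders measure more events than this, so treat it as an illustration of the principle rather than a working PCR predictor.

```python
import hashlib

PCR_SIZE = 32  # size of a PCR in the SHA-256 bank


def pcr_extend(pcr: bytes, data: bytes) -> bytes:
    """Return the PCR value after measuring `data` (TPM extend, SHA-256 bank)."""
    digest = hashlib.sha256(data).digest()
    return hashlib.sha256(pcr + digest).digest()


def replay(events: list[bytes]) -> bytes:
    """Replay a sequence of measured blobs, starting from an all-zero PCR."""
    pcr = bytes(PCR_SIZE)
    for blob in events:
        pcr = pcr_extend(pcr, blob)
    return pcr


if __name__ == "__main__":
    # Hypothetical stand-ins for what would be measured during boot.
    events = [
        b"kernel image bytes",
        b"initrd image bytes",
        b"console=ttyS0 console=tty0",
    ]
    print(replay(events).hex())
```

If every input is known at package build time, the final value can be computed once per kernel version. An initrd generated on the installed machine changes one input per host, which is what makes the result impractical to predict.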
When shipping a unified kernel image (containing kernel, initrd, cmdline and signature) we get the additional benefit that the initrd is covered by the signature so secure boot will actually be secure.
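For readers who have not looked inside one: a unified kernel image is an ordinary PE binary, and the pieces live in named sections (systemd-stub conventionally uses names such as `.linux`, `.initrd`, `.cmdline` and `.osrel`). The sketch below lists the sections of a PE file using only the Python standard library. It is a rough illustration of the on-disk layout, not a replacement for tools like objdump, and the path in the usage note is hypothetical.

```python
import struct


def pe_sections(image: bytes) -> list[tuple[str, int, int]]:
    """Return (name, raw_offset, raw_size) for each section of a PE image."""
    if image[:2] != b"MZ":
        raise ValueError("not a PE image: missing MZ header")
    # Offset of the PE signature is stored at 0x3C in the DOS header.
    (pe_offset,) = struct.unpack_from("<I", image, 0x3C)
    if image[pe_offset:pe_offset + 4] != b"PE\0\0":
        raise ValueError("not a PE image: missing PE signature")
    coff = pe_offset + 4
    (num_sections,) = struct.unpack_from("<H", image, coff + 2)
    (opt_header_size,) = struct.unpack_from("<H", image, coff + 16)
    # The section table follows the 20-byte COFF header and the optional header.
    table = coff + 20 + opt_header_size
    sections = []
    for i in range(num_sections):
        raw_name, _vsize, _vaddr, raw_size, raw_offset = struct.unpack_from(
            "<8sIIII", image, table + 40 * i
        )
        name = raw_name.rstrip(b"\0").decode("ascii", errors="replace")
        sections.append((name, raw_offset, raw_size))
    return sections


def section_data(image: bytes, wanted: str) -> bytes:
    """Return the raw bytes of the first section called `wanted`."""
    for name, offset, size in pe_sections(image):
        if name == wanted:
            return image[offset:offset + size]
    raise KeyError(wanted)
```

Usage would look like `section_data(open("vmlinuz.efi", "rb").read(), ".cmdline")` to read back the embedded command line. Because the signature covers the whole binary, everything returned this way is covered by it too.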
So, while unified kernels are clearly the better approach, it is also the one which needs some changes in various packages. For an initrd image the hooks needed are in place thanks to CoreOS shipping initrd images today. Opt in by installing the sub-rpm and everything JustWorks[tm].
To make unified kernels work smoothly a number of changes are needed (besides the kernel rpm changes):
(1) Add support for unified kernels to the kernel update scripts. (/usr/lib/kernel/install.d/*).
(2) Add boot loader support for unified kernel images: (a) either switch to sd-boot which already supports this. (b) or add support to grub2 (improve blscfg downstream patch).
(3) Support /boot being vfat (depending on #2, sd-boot needs this).
(4) Remove configuration information (and secrets) from initrd images and kernel command line.
The most important item here is the root filesystem location, which should be doable using https://systemd.io/DISCOVERABLE_PARTITIONS/ for many use cases.
This can initially be handled in anaconda kickstart %post scripts. Long-term we need proper support in anaconda (and any other tool used to install or generate cloud images), especially if we want to make unified kernel images the default some day.
(5) There might be more ...
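As a rough sketch of what item (4) means in practice: with discoverable partitions the root filesystem is found by its GPT partition type GUID rather than by a root= argument, so a kickstart %post script would mostly need to set the right type on the right partition. The GUIDs below are the x86-64 root and ESP types from the Discoverable Partitions Specification. The disk name and partition number are hypothetical, and the sketch only builds the `sfdisk --part-type` command instead of running it.

```python
import shlex
import uuid

# Partition type GUIDs from the Discoverable Partitions Specification.
ROOT_X86_64 = "4f68bce3-e8cd-4db1-96e7-fbcaf984b709"
ESP = "c12a7328-f81f-11d2-ba4b-00a0c93ec93b"


def tag_partition_cmd(disk: str, partno: int, type_guid: str) -> list[str]:
    """Build the sfdisk command that sets the GPT type of one partition."""
    uuid.UUID(type_guid)  # raises ValueError if the GUID is malformed
    return ["sfdisk", "--part-type", disk, str(partno), type_guid]


if __name__ == "__main__":
    # Hypothetical layout: partition 1 is the ESP, partition 2 is the root fs.
    for partno, guid in ((1, ESP), (2, ROOT_X86_64)):
        print(shlex.join(tag_partition_cmd("/dev/vda", partno, guid)))
```

With the partitions tagged like this, nothing host-specific has to be placed on the kernel command line to locate the root filesystem.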
I think the best way forward is to skip the initrd image interim step and try to go straight to unified kernel image support, starting with virtual machines & cloud images, then expand to cover more use cases once things are working smoothly there. I think it makes sense to start with the kernel changes.
Comments? Reviews? Suggestions?
thanks & take care, Gerd
Daniel P. Berrangé (1):
  [testing] add a kernel-unified-virt sub-RPM

Gerd Hoffmann (2):
  [testing] virtual machine dracut config
  [testing] add a kernel-initrd-virt sub-RPM

 dracut-virt.conf | 26 +++++++++++++++++++
 kernel.spec      | 65 ++++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 91 insertions(+)
 create mode 100644 dracut-virt.conf
Signed-off-by: Gerd Hoffmann kraxel@redhat.com
---
 dracut-virt.conf | 26 ++++++++++++++++++++++++++
 1 file changed, 26 insertions(+)
 create mode 100644 dracut-virt.conf

diff --git a/dracut-virt.conf b/dracut-virt.conf
new file mode 100644
index 000000000000..98b565349743
--- /dev/null
+++ b/dracut-virt.conf
@@ -0,0 +1,26 @@
+# generic + compressed please
+hostonly="no"
+compress="xz"
+
+# VMs can't update microcode anyway
+early_microcode="no"
+
+# modules: basics
+dracutmodules+=" base systemd systemd-initrd dracut-systemd dbus dbus-broker usrmount shutdown "
+
+# modules: storage support
+dracutmodules+=" dm lvm rootfs-block fs-lib "
+
+# drivers: virtual buses, pci
+drivers+=" virtio-pci virtio-mmio "      # qemu-kvm
+drivers+=" hv-vmbus pci-hyperv "         # hyperv
+drivers+=" xen-pcifront "                # xen
+
+# drivers: storage
+drivers+=" ahci nvme scsi-hd scsi-cd "   # generic
+drivers+=" virtio-blk virtio-scsi "      # qemu-kvm
+drivers+=" hv-storvsc "                  # hyperv
+drivers+=" xen-blkfront "                # xen
+
+# filesystems
+filesystems+=" vfat ext4 xfs btrfs overlay "
Signed-off-by: Gerd Hoffmann kraxel@redhat.com --- kernel.spec | 21 +++++++++++++++++++++ 1 file changed, 21 insertions(+)
diff --git a/kernel.spec b/kernel.spec index 3a45243c0442..197846f4f834 100755 --- a/kernel.spec +++ b/kernel.spec @@ -817,6 +817,7 @@ Source82: update_scripts.sh
Source84: mod-internal.list Source85: mod-partner.list +Source86: dracut-virt.conf
Source100: rheldup3.x509 Source101: rhelkpatch1.x509 @@ -1296,6 +1297,9 @@ Requires: kernel-core-uname-r = %{KVERREL}\ %endif\ %{expand:%%kernel_debuginfo_package %{?1:%{1}}}\ %endif\ +%package %{?1:%{1}-}initrd-virt\ +Summary: %{variant_summary} initrd for VMs\ +Provides: installonlypkg(kernel)\ %{nil}
# @@ -1364,6 +1368,12 @@ Linux operating system. The kernel handles the basic functions of the operating system: memory allocation, process allocation, device input and output, etc.
+%description debug-initrd-virt +Prebuilt debug kernel initrd for VMs. + +%description initrd-virt +Prebuilt default kernel initrd for VMs. + %if %{with_ipaclones} %kernel_ipaclones_package %endif @@ -2131,6 +2141,15 @@ BuildKernel() { touch lib/modules/$KernelVer/modules.builtin fi
+    # pre-generate initrd
+    dracut --conf=%{SOURCE86} \
+           --confdir=$(mktemp -d) \
+           --verbose \
+           --kver "$KernelVer" \
+           --kmoddir lib/modules/$KernelVer \
+           lib/modules/$KernelVer/initrd
+    lsinitrd lib/modules/$KernelVer/initrd
+
     remove_depmod_files
# Go back and find all of the various directories in the tree. We use this @@ -3101,6 +3120,8 @@ fi %{expand:%%files -f debuginfo%{?3}.list %{?3:%{3}-}debuginfo}\ %endif\ %endif\ +%{expand:%%files %{?3:%{3}-}initrd-virt}\ +/lib/modules/%{KVERREL}%{?3:+%{3}}/initrd\ %if %{?3:1} %{!?3:0}\ %{expand:%%files %{3}}\ %endif\
From: Daniel P. Berrangé berrange@redhat.com
This sub-package contains a unified kernel/initrd/cmdline image as an EFI binary, targeting booting of virtual machines.
Signed-off-by: Daniel P. Berrangé berrange@redhat.com --- kernel.spec | 44 ++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 44 insertions(+)
diff --git a/kernel.spec b/kernel.spec index 197846f4f834..a7129a9b98c5 100755 --- a/kernel.spec +++ b/kernel.spec @@ -88,6 +88,12 @@ Summary: The Linux kernel %global zipmodules 1 %endif
+%ifarch x86_64 +%global efiunified 1 +%else +%global efiunified 0 +%endif + %if %{zipmodules} %global zipsed -e 's/.ko$/.ko.xz/' # for parallel xz processes, replace with 1 to go back to single process @@ -690,6 +696,14 @@ BuildRequires: llvm BuildRequires: lld %endif
+%if %{efiunified} +BuildRequires: dracut +# For dracut UEFI unified binaries +BuildRequires: binutils +# For the initrd +BuildRequires: lvm2 +%endif + # Because this is the kernel, it's hard to get a single upstream URL # to represent the base without needing to do a bunch of patching. This # tarball is generated from a src-git tree. If you want to see the @@ -1300,6 +1314,11 @@ Requires: kernel-core-uname-r = %{KVERREL}\ %package %{?1:%{1}-}initrd-virt\ Summary: %{variant_summary} initrd for VMs\ Provides: installonlypkg(kernel)\ +%if %{efiunified}\ +%package %{?1:%{1}-}unified-virt\ +Summary: %{variant_summary} unified kernel/initrd for VMs\ +Provides: installonlypkg(kernel)\ +%endif\ %{nil}
# @@ -1374,6 +1393,14 @@ Prebuilt debug kernel initrd for VMs. %description initrd-virt Prebuilt default kernel initrd for VMs.
+%if %{efiunified} +%description debug-unified-virt +Prebuilt debug unified kernel/initrd for VMs. + +%description unified-virt +Prebuilt default unified kernel/initrd for VMs. +%endif + %if %{with_ipaclones} %kernel_ipaclones_package %endif @@ -2150,6 +2177,19 @@ BuildKernel() { lib/modules/$KernelVer/initrd lsinitrd lib/modules/$KernelVer/initrd
+%if %{efiunified}
+    # pre-generate unified kernel/initrd
+    dracut --conf=%{SOURCE86} \
+           --confdir=$(mktemp -d) \
+           --verbose \
+           --uefi \
+           --kernel-image `pwd`/lib/modules/$KernelVer/vmlinuz \
+           --kernel-cmdline 'console=ttyS0 console=tty0' \
+           --kver "$KernelVer" \
+           --kmoddir lib/modules/$KernelVer \
+           lib/modules/$KernelVer/vmlinuz.efi
+%endif
+
     remove_depmod_files
# Go back and find all of the various directories in the tree. We use this @@ -3122,6 +3162,10 @@ fi %endif\ %{expand:%%files %{?3:%{3}-}initrd-virt}\ /lib/modules/%{KVERREL}%{?3:+%{3}}/initrd\ +%if %{efiunified}\ +%{expand:%%files %{?3:%{3}-}unified-virt}\ +/lib/modules/%{KVERREL}%{?3:+%{3}}/%{?-k:%{-k*}}%{!?-k:vmlinuz.efi}\ +%endif\ %if %{?3:1} %{!?3:0}\ %{expand:%%files %{3}}\ %endif\
On Wed, 31 Aug 2022 14:46:15 +0200 Gerd Hoffmann kraxel@redhat.com wrote:
Here is a little patch series to kick off a discussion on pre-generated initrd images and unified kernels. Let's start with a description of the patches:
Patch #1 adds a dracut config file, targeting virtual machines. Given that most physical machines have either SATA or NVMe disks these days, it probably boots most physical systems too.
Patch #2 adds a sub-package with an initrd image.
Patch #3 adds a sub-package with a unified kernel.
The goal is to move away from initrd images being generated on the installed machine. They are generated while building the kernel package instead. The main motivation for this move is to make the distro more robust and more secure.
A worthy goal.
Comments? Reviews? Suggestions?
How will this impact people who build a custom kernel locally? Will it still be possible?
On Wed, Aug 31, 2022 at 06:12:19AM -0700, stan via kernel wrote:
On Wed, 31 Aug 2022 14:46:15 +0200 Gerd Hoffmann kraxel@redhat.com wrote:
Here is a little patch series to kick off a discussion on pre-generated initrd images and unified kernels. Let's start with a description of the patches:
Patch #1 adds a dracut config file, targeting virtual machines. Given that most physical machines have either SATA or NVMe disks these days, it probably boots most physical systems too.
Patch #2 adds a sub-package with an initrd image.
Patch #3 adds a sub-package with a unified kernel.
The goal is to move away from initrd images being generated on the installed machine. They are generated while building the kernel package instead. The main motivation for this move is to make the distro more robust and more secure.
A worthy goal.
Comments? Reviews? Suggestions?
How will this impact people who build a custom kernel locally? Will it still be possible?
It wouldn't be any different from local builds today: unified kernels are not special, and the signing process will work the same way it works for non-unified kernels.
Essentially you have the options to either sign your custom kernel with your own keys (and add them to the uefi/shim key database) or turn off secure boot.
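The first option can be sketched as two steps: sign the locally built image with your own key, then enroll the matching certificate so shim trusts it. The sketch below only builds the `sbsign` and `mokutil` command lines. All file names are hypothetical, and it assumes you already have a key pair (PEM key and certificate for sbsign, a DER copy of the certificate for mokutil).

```python
import shlex


def sign_cmd(key: str, cert: str, image: str, output: str) -> list[str]:
    """Build the sbsign command that signs an EFI binary with a local key."""
    return ["sbsign", "--key", key, "--cert", cert, "--output", output, image]


def enroll_cmd(cert_der: str) -> list[str]:
    """Build the mokutil command that queues a certificate for MOK enrollment."""
    return ["mokutil", "--import", cert_der]


if __name__ == "__main__":
    # Hypothetical file names for a locally built unified kernel.
    print(shlex.join(sign_cmd("local.key", "local.crt", "vmlinuz.efi", "vmlinuz.signed.efi")))
    print(shlex.join(enroll_cmd("local.der")))
```

Note that `mokutil --import` only queues the request; the enrollment itself is confirmed from the MOK manager on the next reboot.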
take care, Gerd
On Thu, 1 Sep 2022 16:24:17 +0200 Gerd Hoffmann kraxel@redhat.com wrote:
It wouldn't be any different from local builds today: unified kernels are not special, and the signing process will work the same way it works for non-unified kernels.
Essentially you have the options to either sign your custom kernel with your own keys (and add them to the uefi/shim key database) or turn off secure boot.
Thanks.
On Wed, Aug 31, 2022 at 02:46:15PM +0200, Gerd Hoffmann wrote:
Hi,
Here is a little patch series to kick off a discussion on pre-generated initrd images and unified kernels. Let's start with a description of the patches:
Patch #1 adds a dracut config file, targeting virtual machines. Given that most physical machines have either SATA or NVMe disks these days, it probably boots most physical systems too.
Patch #2 adds a sub-package with an initrd image.
Patch #3 adds a sub-package with a unified kernel.
I was going to open a merge request in Pagure which just has one patch doing #1 & #3 at the same time, but since you've started the discussion here, I'll just point to my branch:
https://src.fedoraproject.org/fork/berrange/rpms/kernel/commits/unified-kern...
with latest commit at time of writing being:
https://src.fedoraproject.org/fork/berrange/rpms/kernel/c/b055ea3932e48fff0d...
And builds available at
https://koji.fedoraproject.org/koji/taskinfo?taskID=91495327 https://copr.fedorainfracloud.org/coprs/berrange/efi-unified-kernel/builds/
Compared to the patch #3 Gerd included, I've made the following changes:
- Added support for signing of the EFI images created
- Create both vmlinuz-virt.efi and vmlinuz-virt-verbose.efi, the latter with a cmdline tailored for debugging
- Install %ghost files at /boot/efi/EFI/Linux, similar to how existing kernels %ghost /boot/, so that RPM can validate disk space availability prior to install
The goal is to move away from initrd images being generated on the installed machine. They are generated while building the kernel package instead. The main motivation for this move is to make the distro more robust and more secure.
More specifically, with public clouds most of the big vendors have a so-called "Trusted VM" option which exposes EFI and vTPM, and allows for remote attestation of the VM. Azure at least has the attestation service integrated into their portal.
The RHEL images we provide for cloud today can be launched in a "Trusted VM" setup, but there's no usable trust provided because the dynamically generated initrd and cmdline are impractical to attest to. This is further compounded by grub's practice of writing every single grub.conf statement into the PCRs, which effectively requires a grub simulator to validate.
More recently clouds have started to work on "Confidential VMs", which have a fairly high level of overlap with "Trusted VMs" in terms of what needs to be done to attest the confidentiality of the boot process. So again we need to measure and attest the initrd + cmdline, and any bootloader config.
This is a long-winded way of saying that the need for EFI unified kernel images is just one piece of the puzzle we're working on. To complement this we'll be looking at either having shim directly launch the kernel image, or requesting that sd-boot be signed and using that, in both cases eliminating use of grub to simplify the measurement+attestation problem significantly.
It is also likely that we'll be looking to make use of things like the kernel IMA framework to measure+attest to various aspects of the cloud disk and OS state.
When shipping the initrd as an rpm it is possible to check it with the usual tools ('rpm --verify' for example). TPM measurements also become much more useful, because it is possible to pre-calculate the PCR values for a given kernel version.
When shipping a unified kernel image (containing kernel, initrd, cmdline and signature) we get the additional benefit that the initrd is covered by the signature so secure boot will actually be secure.
So, while unified kernels are clearly the better approach, it is also the one which needs some changes in various packages. For an initrd image the hooks needed are in place thanks to CoreOS shipping initrd images today. Opt in by installing the sub-rpm and everything JustWorks[tm].
To make unified kernels work smoothly a number of changes are needed (besides the kernel rpm changes):
(1) Add support for unified kernels to the kernel update scripts. (/usr/lib/kernel/install.d/*).
(2) Add boot loader support for unified kernel images: (a) either switch to sd-boot which already supports this. (b) or add support to grub2 (improve blscfg downstream patch).
(3) Support /boot being vfat (depending on #2, sd-boot needs this).
Technically in the cloud image scenario we don't need to especially care about /boot being a dedicated partition. We could do everything exclusively in the /boot/efi partition which is already vfat, and not bother creating any /boot partition, since we can ensure /boot/efi is large enough.
If we foresee the unified EFI kernels being useful for bare metal, however, then use of /boot as vfat becomes more important, as we can't assume the hardware vendor's pre-created /boot/efi is sufficiently large.
(4) Remove configuration information (and secrets) from initrd images and kernel command line.
The most important item here is the root filesystem location, which should be doable using https://systemd.io/DISCOVERABLE_PARTITIONS/ for many use cases. This can initially be handled in anaconda kickstart %post scripts. Long-term we need proper support in anaconda (and any other tool used to install or generate cloud images), especially if we want to make unified kernel images the default some day.
(5) There might be more ...
In the kernel.spec changes I linked to earlier, I've actually proposed creating two distinct EFI images. Under SecureBoot, users won't have the option to edit the cmdline, since it is embedded in the EFI image and measured. Furthermore, with Confidential VMs, the emulated keyboard, serial port and VGA output are not trustworthy, so care has to be taken with any interactive process during boot, including any interaction with a boot loader. Selecting between multiple pre-defined kernel entries is fine; editing the cmdline is not viable.
In normal operation this isn't a big issue, as it's fine to just hardcode a quiet, graphical boot. When things go wrong, however, it is nice to be able to boot with 'debug' and rhgb turned off.
Thus my patch proposed two images, to be distributed in the same 'kernel-virt-unified' sub-RPM.
* vmlinuz-virt.efi created using dracut arg
--kernel-cmdline 'console=ttyS0 console=tty0 quiet rhgb'
* vmlinuz-virt-verbose.efi created using dracut arg
--kernel-cmdline 'console=ttyS0 console=tty0'
Even when following the system discoverable partitions spec, we need a mechanism to attest that the root filesystem that was discovered and mounted matches the one we expect. Our current thought is that this is likely to involve TPM PCR measurements being logged by systemd and libcryptsetup, as well as some of the kernel IMA providers. This is being discussed with systemd upstream at
https://github.com/systemd/systemd/issues/24503
I raise this because for the kernel IMA support, we're likely to need to add further kernel cmdline parameters beyond those currently shown in these patches (ie ima=on at the very least), as well as bundling an /etc/ima/ima-policy file into the initrd.
I think the best way forward is to skip the initrd image interim step and try to go straight to unified kernel image support, starting with virtual machines & cloud images, then expand to cover more use cases once things are working smoothly there. I think it makes sense to start with the kernel changes.
Comments? Reviews? Suggestions?
In terms of the distro maint burden, shipping these pre-built EFI images introduces new deps in the kernel build process:
BuildRequires: dracut
BuildRequires: binutils
BuildRequires: lvm2
In theory, any time one of those packages (or an existing kernel dep) has a change, it might impact the content that is bundled in the initrd that is prebuilt / bundled with the EFI image. That in turn could mean that extra kernel RPM re-builds are needed, simply because a 3rd party dep had an important change that we need to get into the initrd.
In practice, my impression is that the kernel gets rebuilt frequently enough that the content bundled into the initrds is already going to be kept sufficiently up to date. Even if some package change did impact the initrd content, it would almost always be possible for users to wait until the next normal kernel rebuild point to get that into the initrds. High priority CVEs feel like the only scenario that might force an extra kernel RPM rebuild that would not otherwise have been needed.
So overall, I feel like this addition ought not to introduce a notable negative impact on kernel RPM maint, while at the same time it enables us to close the SecureBoot measurement hole our distro (and essentially every other Linux distro) has suffered with for years.
FWIW, as a reference/comparison point, in testing out Azure's support for AMD SEV-SNP confidential virtualization, we've learnt that Ubuntu have also chosen to go down the route of using an EFI unified kernel image, in their case directly booted by shim with neither grub nor sd-boot involved. Their image looked like a one-off special, but it is likely that this approach will apply going forwards, since there are few other practical ways to deal with the measurement & attestation needs in a seamless way.
With regards, Daniel
Hi,
(3) Support /boot being vfat (depending on #2, sd-boot needs this).
Technically in the cloud image scenario we don't need to especially care about /boot being a dedicated partition. We could do everything exclusively in the /boot/efi partition which is already vfat, and not bother creating any /boot partition, since we can ensure /boot/efi is large enough.
If we foresee the unified EFI kernels being useful for bare metal, however, then use of /boot as vfat becomes more important, as we can't assume the hardware vendor's pre-created /boot/efi is sufficiently large.
Didn't want to open that discussion right now. My impression is that Fedora & RHEL settled on the approach to have both /boot and /boot/efi everywhere because some cases require this and having only a single scheme simplifies development + testing.
Changing this looks not that easy to me, we have RPMs dropping files into /boot/efi/EFI and I suspect this is hard-wired in various places in anaconda and elsewhere.
So, yes, for cloud images we don't really need this as we can make the ESP as large as we want, but I have my doubts that deviating from the standard way Fedora handles things gives us enough benefits to be worth the trouble.
Thus my patch proposed two images, to be distributed in the same 'kernel-virt-unified' sub-RPM.
vmlinuz-virt.efi created using dracut arg
--kernel-cmdline 'console=ttyS0 console=tty0 quiet rhgb'
vmlinuz-virt-verbose.efi created using dracut arg
--kernel-cmdline 'console=ttyS0 console=tty0'
Hmm, I think we should look for a more elegant solution to this problem than having two large, 99% identical images.
One option could be to have systemd-stub support multiple .cmdline sections and allowing the user to pick one of them.
Another option could be to have a whitelist of options the user is allowed to add/remove which then could include stuff like 'console=' and 'quiet'.
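A minimal sketch of the second option, assuming the enforcement would live in the boot loader or stub: user edits are accepted only if every option added or removed is on an allowlist, and everything else in the built-in command line stays untouched. The allowlist contents here are a hypothetical policy, and splitting on whitespace ignores quoted arguments, so this is only an illustration of the check itself.

```python
# Hypothetical policy: option names the user may add or remove.
ALLOWED = {"console", "quiet", "rhgb", "debug"}


def option_key(option: str) -> str:
    """Return the name part of 'name' or 'name=value'."""
    return option.split("=", 1)[0]


def apply_user_edits(builtin: str, add: list[str], remove: list[str]) -> str:
    """Apply user edits to the built-in cmdline, rejecting anything off-list."""
    for option in add + remove:
        if option_key(option) not in ALLOWED:
            raise ValueError(f"option not on allowlist: {option!r}")
    kept = [opt for opt in builtin.split() if opt not in remove]
    return " ".join(kept + add)


if __name__ == "__main__":
    builtin = "console=ttyS0 console=tty0 quiet rhgb"
    # Turn a quiet graphical boot into a verbose one.
    print(apply_user_edits(builtin, add=["debug"], remove=["quiet", "rhgb"]))
```

An attempt to add something like `init=/bin/sh` raises `ValueError`, which is the property that would let the embedded command line stay trustworthy while still permitting debugging tweaks.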
take care, Gerd
On Fri, Sep 02, 2022 at 11:21:03AM +0200, Gerd Hoffmann wrote:
Hi,
(3) Support /boot being vfat (depending on #2, sd-boot needs this).
Technically in the cloud image scenario we don't need to especially care about /boot being a dedicated partition. We could do everything exclusively in the /boot/efi partition which is already vfat, and not bother creating any /boot partition, since we can ensure /boot/efi is large enough.
If we foresee the unified EFI kernels being useful for bare metal, however, then use of /boot as vfat becomes more important, as we can't assume the hardware vendor's pre-created /boot/efi is sufficiently large.
Didn't want to open that discussion right now. My impression is that Fedora & RHEL settled on the approach to have both /boot and /boot/efi everywhere because some cases require this and having only a single scheme simplifies development + testing.
Changing this looks not that easy to me, we have RPMs dropping files into /boot/efi/EFI and I suspect this is hard-wired in various places in anaconda and elsewhere.
Hmm, yes, it is not worth trying to track down all those places.
So, yes, for cloud images we don't really need this as we can make the ESP as large as we want, but I have my doubts that deviating from the standard way Fedora handles things gives us enough benefits to be worth the trouble.
Agreed, removing the restrictions preventing vfat on /boot is likely the least-effort change.
Thus my patch proposed two images, to be distributed in the same 'kernel-virt-unified' sub-RPM.
vmlinuz-virt.efi created using dracut arg
--kernel-cmdline 'console=ttyS0 console=tty0 quiet rhgb'
vmlinuz-virt-verbose.efi created using dracut arg
--kernel-cmdline 'console=ttyS0 console=tty0'
Hmm, I think we should look for a more elegant solution to this problem than having two large, 99% identical images.
One option could be to have systemd-stub support multiple .cmdline sections and allowing the user to pick one of them.
Hmm, interesting idea. That would be attractive because all the different .cmdline variants would be trivially covered by the SecureBoot signature still. Would need enhancement in the bootloader(s) supporting unified image.
Another option could be to have a whitelist of options the user is allowed to add/remove which then could include stuff like 'console=' and 'quiet'.
The most flexible as it avoids need to predict what users might like to use, and also avoids the combinatorial expansion problem from embedding multiple .cmdline, which becomes more important as the set of safe-to-use options becomes larger. Would need the bootloader to enforce this, if we're to avoid having to attest the cmdline specifically after boot.
Still, I like the multiple .cmdline sections as it lets the bootloader give a simple menu choice without needing more interactive editing.
With regards, Daniel
Incidentally if anyone reading is a moderator for this mailing list, the message Gerd is replying to appears still pending in the moderation queue as the list didn't allow non-members to post...
On Fri, Sep 02, 2022 at 10:40:41AM +0100, Daniel P. Berrangé wrote:
On Fri, Sep 02, 2022 at 11:21:03AM +0200, Gerd Hoffmann wrote:
Hi,
(3) Support /boot being vfat (depending on #2, sd-boot needs this).
Technically in the cloud image scenario we don't need to especially care about /boot being a dedicated partition. We could do everything exclusively in the /boot/efi partition which is already vfat, and not bother creating any /boot partition, since we can ensure /boot/efi is large enough.
If we foresee the unified EFI kernels being useful for bare metal, however, then use of /boot as vfat becomes more important, as we can't assume the hardware vendor's pre-created /boot/efi is sufficiently large.
Didn't want to open that discussion right now. My impression is that Fedora & RHEL settled on the approach to have both /boot and /boot/efi everywhere because some cases require this and having only a single scheme simplifies development + testing.
Changing this looks not that easy to me, we have RPMs dropping files into /boot/efi/EFI and I suspect this is hard-wired in various places in anaconda and elsewhere.
Hmm, yes, it is not worth trying to track down all those places.
So, yes, for cloud images we don't really need this as we can make the ESP as large as we want, but I have my doubts that deviating from the standard way Fedora handles things gives us enough benefits to be worth the trouble.
Agreed, removing the restrictions preventing vfat on /boot is likely the least-effort change.
Thus my patch proposed two images, to be distributed in the same 'kernel-virt-unified' sub-RPM.
vmlinuz-virt.efi created using dracut arg
--kernel-cmdline 'console=ttyS0 console=tty0 quiet rhgb'
vmlinuz-virt-verbose.efi created using dracut arg
--kernel-cmdline 'console=ttyS0 console=tty0'
Hmm, I think we should look for a more elegant solution to this problem than having two large, 99% identical images.
One option could be to have systemd-stub support multiple .cmdline sections and allowing the user to pick one of them.
Hmm, interesting idea. That would be attractive because all the different .cmdline variants would be trivially covered by the SecureBoot signature still. Would need enhancement in the bootloader(s) supporting unified image.
Another option could be to have a whitelist of options the user is allowed to add/remove which then could include stuff like 'console=' and 'quiet'.
The most flexible as it avoids need to predict what users might like to use, and also avoids the combinatorial expansion problem from embedding multiple .cmdline, which becomes more important as the set of safe-to-use options becomes larger. Would need the bootloader to enforce this, if we're to avoid having to attest the cmdline specifically after boot.
Still, I like the multiple .cmdline sections as it lets the bootloader give a simple menu choice without needing more interactive editing.
With regards, Daniel
With regards, Daniel
On Fri, Sep 02, 2022 at 11:21:03AM +0200, Gerd Hoffmann wrote:
Hi,
Thus my patch proposed two images, to be distributed in the same 'kernel-virt-unified' sub-RPM.
vmlinuz-virt.efi created using dracut arg
--kernel-cmdline 'console=ttyS0 console=tty0 quiet rhgb'
vmlinuz-virt-verbose.efi created using dracut arg
--kernel-cmdline 'console=ttyS0 console=tty0'
Hmm, I think we should look for a more elegant solution to this problem than having two large, 99% identical images.
One option could be to have systemd-stub support multiple .cmdline sections and allowing the user to pick one of them.
Another option could be to have a whitelist of options the user is allowed to add/remove which then could include stuff like 'console=' and 'quiet'.
FYI, I filed an RFE with systemd to get their opinion on the idea of alternative cmdline entries, and/or validated option lists:
https://github.com/systemd/systemd/issues/24539
With regards, Daniel
On Fri, Sep 02, 2022 at 12:07:38PM +0100, Daniel P. Berrangé wrote:
On Fri, Sep 02, 2022 at 11:21:03AM +0200, Gerd Hoffmann wrote:
Hi,
Thus my patch proposed two images, to be distributed in the same 'kernel-virt-unified' sub-RPM.
vmlinuz-virt.efi created using dracut arg
--kernel-cmdline 'console=ttyS0 console=tty0 quiet rhgb'
vmlinuz-virt-verbose.efi created using dracut arg
--kernel-cmdline 'console=ttyS0 console=tty0'
Hmm, I think we should look for a more elegant solution to this problem than having two large, 99% identical images.
One option could be to have systemd-stub support multiple .cmdline sections and allowing the user to pick one of them.
Another option could be to have a whitelist of options the user is allowed to add/remove which then could include stuff like 'console=' and 'quiet'.
FYI, I filed an RFE with systemd to get their opinion on the idea of alternative cmdline entries, and/or validated option lists:
So, how do we best move forward with this? Drop the verbose variant for now, and add support for that once systemd-stub support for multiple command lines is sorted upstream?
Should we have a Fedora feature page for unified kernel support?
take care, Gerd
On Tue, Sep 20, 2022 at 03:05:17PM +0200, Gerd Hoffmann wrote:
On Fri, Sep 02, 2022 at 12:07:38PM +0100, Daniel P. Berrangé wrote:
On Fri, Sep 02, 2022 at 11:21:03AM +0200, Gerd Hoffmann wrote:
Hi,
Thus my patch proposed two images, to be distributed in the same 'kernel-virt-unified' sub-RPM.
vmlinuz-virt.efi created using dracut arg
--kernel-cmdline 'console=ttyS0 console=tty0 quiet rhgb'
vmlinuz-virt-verbose.efi created using dracut arg
--kernel-cmdline 'console=ttyS0 console=tty0'
Hmm, I think we should look for a more elegant solution to this problem than having two large, 99% identical images.
One option could be to have systemd-stub support multiple .cmdline sections and allowing the user to pick one of them.
Another option could be to have a whitelist of options the user is allowed to add/remove which then could include stuff like 'console=' and 'quiet'.
FYI, I filed an RFE with systemd to get their opinion on the idea of alternative cmdline entries, and/or validated option lists:
So, how do we best move forward with this? Drop the verbose variant for now, and add support for that once systemd-stub support for multiple command lines is sorted upstream?
Should we have a Fedora feature page for unified kernel support?
I'm not sure we especially want to publicise unified kernel images as a standalone thing to users, as it is more of just a building block.
If we're going to show any "feature", I think it probably ought to be oriented around a higher level user solution, of which unified kernel images are just one piece of the puzzle.
e.g. I would see "Trusted boot for cloud images" as a feature, by which I mean publishing cloud images capable of using SecureBoot and vTPM, to do attestable boot on UEFI, without the unsigned initrd hole present today.
So, this would mean using unified kernel images, either directly launched from shim, or with sd-boot in there to provide a UI selection. Either way not grub, since it is impractical to do attestation with grub's use of PCRs. It would also likely mean use of discoverable partitions.
I know there's been some desire to move Fedora cloud images to be UEFI only, so it might dovetail with that to some degree. It would likely touch kernel, anaconda, systemd and cloud images, so there's some coordination & discussion needed to agree on the big picture.
With regards, Daniel
Hi,
Should we have a Fedora feature page for unified kernel support?
I'm not sure we especially want to publicise unified kernel images as a standalone thing to users, as it is more of just a building block.
If we're going to show any "feature", I think it probably ought to be oriented around a higher level user solution, of which unified kernel images are just one piece of the puzzle.
Yes. Just the kernel package updates are not enough. The bare minimum which I think makes sense as feature is:
(1) adding the unified kernel sub-rpm discussed here, and
(2) bootloader support (be that sd-boot or grub or both), and
(3) kernel install script support (so 'dnf update' kernel updates work).
With that in place it should be possible to install VMs using the unified kernel sub-rpm and everything works as it did before, even when it requires some hacks here and there, such as tagging partitions manually or in kickstart %post for https://systemd.io/DISCOVERABLE_PARTITIONS/
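As an illustration of the "tagging partitions manually" hack, here is a sketch that builds (but does not run) an sfdisk call to set a GPT partition type. The GUIDs are the ESP and x86-64 root types as I understand the Discoverable Partitions Specification; check them against the spec before relying on them. The helper name retag_command and the device /dev/vda are assumptions for the example.

```python
# Partition type GUIDs, believed to match the Discoverable Partitions
# Specification; verify against the spec before use.
GPT_TYPES = {
    "esp": "c12a7328-f81f-11d2-ba4b-00a0c93ec93b",
    "root-x86-64": "4f68bce3-e8cd-4db1-96e7-fbcaf984b709",
}


def retag_command(disk: str, partno: int, kind: str) -> list[str]:
    """Build an sfdisk argv that sets the type GUID of one partition.

    The command is only constructed here; running it (for example from a
    kickstart %post script) is left to the caller.
    """
    return ["sfdisk", "--part-type", disk, str(partno), GPT_TYPES[kind]]


if __name__ == "__main__":
    # Hypothetical layout: partition 2 of /dev/vda holds the root filesystem.
    print(" ".join(retag_command("/dev/vda", 2, "root-x86-64")))
```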
The above will already plug the initrd hole.
e.g. I would see "Trusted boot for cloud images" as a feature, by which I mean publishing cloud images capable of using SecureBoot and vTPM, to do attestable boot on UEFI, without the unsigned initrd hole present today.
So, this would mean using unified kernel images, either directly launched from shim, or with sd-boot in there to provide a UI selection. Either way not grub, since it is impractical to do attestation with grub's use of PCRs. It would also likely mean use of discoverable partitions.
Yep, that all makes sense, but I'd tend to not add that as 'required' to the feature page. My list of nice-to-have optional stuff:
(4) attestable boot (i.e. no grub b/c PCR mess).
(5) provide pre-calculated PCRs (could be published as kernel-pcrs sub-rpm).
(6) add discoverable partitions to anaconda (so we don't need %post hacks).
(7) add discoverable partitions to image builder (and whatever else generates cloud images).
(8) switch cloud images to unified kernels.
I think I've seen fedora feature pages with optional sub-goals before, even though I can't find one right now, so I think it should be possible to handle things that way.
I know there's been some desire to move Fedora cloud images to be UEFI only, so it might dovetail with that to some degree.
Hacking grub so it can pull kernel + initrd out of a unified kernel image, and boot unified kernel images in BIOS mode that way, should be possible. That would allow migrating cloud images to unified kernels without requiring UEFI.
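To make the extraction step concrete: a unified kernel image is a PE binary, and the kernel, initrd and command line sit in named sections (".linux", ".initrd", ".cmdline" in systemd-stub based images). The sketch below walks the PE section table and returns the raw section contents. It is written in Python purely for illustration, is not grub code, and the function name pe_sections is made up for this example.

```python
import struct


def pe_sections(image: bytes) -> dict[str, bytes]:
    """Map PE section names (e.g. '.linux', '.initrd') to their contents."""
    if image[:2] != b"MZ":
        raise ValueError("not a PE image: missing MZ header")
    # Offset of the PE signature is stored at 0x3C in the DOS header.
    (pe_off,) = struct.unpack_from("<I", image, 0x3C)
    if image[pe_off:pe_off + 4] != b"PE\0\0":
        raise ValueError("not a PE image: missing PE signature")
    # The 20-byte COFF header follows the 4-byte signature.
    (nsections,) = struct.unpack_from("<H", image, pe_off + 6)
    (opt_size,) = struct.unpack_from("<H", image, pe_off + 20)
    # The section table starts right after the optional header.
    table = pe_off + 24 + opt_size
    sections = {}
    for i in range(nsections):
        entry = table + 40 * i  # each section header is 40 bytes
        name = image[entry:entry + 8].rstrip(b"\0").decode("ascii")
        virt_size, _vaddr, raw_size, raw_off = struct.unpack_from(
            "<IIII", image, entry + 8
        )
        # Raw data may be padded for alignment; prefer the virtual size.
        size = min(virt_size, raw_size) if virt_size else raw_size
        sections[name] = image[raw_off:raw_off + size]
    return sections
```

A bootloader doing this in BIOS mode would then hand the ".linux" and ".initrd" payloads to its normal Linux boot path. Note that the signature check of the image as a whole is a separate question this sketch does not touch.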
It would likely touch kernel, anaconda, systemd and cloud images, so there's some coordination & discussion needed to agree on the big picture.
Sure. Having a feature page drawing that big picture would be helpful for these discussions I think ...
take care, Gerd
On Wed, Sep 21, 2022 at 4:48 AM Gerd Hoffmann kraxel@redhat.com wrote:
Hi,
Should we have a Fedora feature page for unified kernel support?
I'm not sure we especially want to publicise unified kernel images as a standalone thing to users, as it is more of just a building block.
If we're going to show any "feature", I think it probably ought to be oriented around a higher level user solution, of which unified kernel images are just one piece of the puzzle.
Yes. Just the kernel package updates are not enough. The bare minimum which I think makes sense as feature is:
(1) adding the unified kernel sub-rpm discussed here, and
(2) bootloader support (be that sd-boot or grub or both), and
(3) kernel install script support (so 'dnf update' kernel updates work).
With that in place it should be possible to install VMs using the unified kernel sub-rpm and everything works as it did before, even when it requires some hacks here and there, such as tagging partitions manually or in kickstart %post for https://systemd.io/DISCOVERABLE_PARTITIONS/
The above will already plug the initrd hole.
From a kernel standpoint, I am interested in this. We do need to ensure that we are doing it in a way that is easily extensible to not just cloud images in the future.
e.g. I would see "Trusted boot for cloud images" as a feature, by which I mean publishing cloud images capable of using SecureBoot and vTPM, to do attestable boot on UEFI, without the unsigned initrd hole present today.
So, this would mean using unified kernel images, either directly launched from shim, or with sd-boot in there to provide a UI selection. Either way not grub, since it is impractical to do attestation with grub's use of PCRs. It would also likely mean use of discoverable partitions.
Yep, that all makes sense, but I'd tend to not add that as 'required' to the feature page. My list of nice-to-have optional stuff:
(4) attestable boot (i.e. no grub b/c PCR mess).
(5) provide pre-calculated PCRs (could be published as kernel-pcrs sub-rpm).
(6) add discoverable partitions to anaconda (so we don't need %post hacks).
(7) add discoverable partitions to image builder (and whatever else generates cloud images).
(8) switch cloud images to unified kernels.
I think I've seen fedora feature pages with optional sub-goals before, even though I can't find one right now, so I think it should be possible to handle things that way.
I know there's been some desire to move Fedora cloud images to be UEFI only, so it might dovetail with that to some degree.
Hacking grub so it can pull kernel + initrd out of a unified kernel image, and boot unified kernel images in BIOS mode that way, should be possible. That would allow migrating cloud images to unified kernels without requiring UEFI.
It would likely touch kernel, anaconda, systemd and cloud images, so there's some coordination & discussion needed to agree on the big picture.
Sure. Having a feature page drawing that big picture would be helpful for these discussions I think ...
Definitely a system wide change. While there are some logistics to work out on the kernel side, most of the work is packaging. I would guess timelines will need to be defined by the teams where code is needed.
Justin
take care, Gerd
Hi,
It would likely touch kernel, anaconda, systemd and cloud images, so there's some coordination & discussion needed to agree on the big picture.
Sure. Having a feature page drawing that big picture would be helpful for these discussions I think ...
Definitely a system wide change. While there are some logistics to work out on the kernel side, most of the work is packaging. I would guess timelines will need to be defined by the teams where code is needed.
https://fedoraproject.org/wiki/Changes/Unified_Kernel_Support_Phase_1
Comments? Something important missing?
(First time I do a change page ...)
take care, Gerd
Hi,
Sure. Having a feature page drawing that big picture would be helpful for these discussions I think ...
Definitely a system wide change. While there are some logistics to work out on the kernel side, most of the work is packaging. I would guess timelines will need to be defined by the teams where code is needed.
https://fedoraproject.org/wiki/Changes/Unified_Kernel_Support_Phase_1
Resuming that discussion. State of affairs:
Test packages are available: https://copr.fedorainfracloud.org/coprs/kraxel/unified.kernel/
Kickstart install files and helper scripts: https://gitlab.com/kraxel/fedora-uki
Prebuilt qemu VM test images: https://www.kraxel.org/fedora-uki/
Right now I'm trying to figure out how to handle kernel installs best. Currently the kickstart files use ...
touch /etc/kernel/install.d/20-grub.install
touch /etc/kernel/install.d/50-dracut.install
.. to avoid kernel-core post-install creating a loader/entries/*.conf snippet and dracut creating an (unused) initrd. Which is ok for a Proof-of-Concept, but clearly a non-starter as a long-term solution.
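For readers unfamiliar with the trick: as I understand kernel-install, a file in /etc/kernel/install.d shadows the same-named plugin in /usr/lib/kernel/install.d, so an empty file there effectively disables that plugin. The sketch below does the same thing as the two touch commands, with the directory passed in so it can be tried against a scratch directory first. The function name mask_plugins is hypothetical.

```python
from pathlib import Path


def mask_plugins(override_dir: Path, names: list[str]) -> list[Path]:
    """Create empty same-named files so the packaged plugins are shadowed.

    Existing files are left untouched (touch does not truncate), so a
    deliberate local override is not clobbered.
    """
    override_dir.mkdir(parents=True, exist_ok=True)
    masked = []
    for name in names:
        path = override_dir / name
        path.touch(exist_ok=True)
        masked.append(path)
    return masked


if __name__ == "__main__":
    import tempfile

    # Try it in a scratch directory rather than the real /etc/kernel/install.d.
    with tempfile.TemporaryDirectory() as tmp:
        for p in mask_plugins(Path(tmp), ["20-grub.install", "50-dracut.install"]):
            print(p.name, p.stat().st_size)
```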
I've played around a bit with kernel-install script plugins trying to detect the presence of a unified kernel image and change behavior then. That feels a bit hackish though, and I suspect making that robust and covering all corner cases (like switching between unified and non-unified kernels) will be next to impossible.
I think a better way forward is to have strictly separate rpms for kernels and modules. Today 'kernel-core' has both the (non-unified) kernel binary and the essential modules. Due to the modules being in there it must be installed even if we don't want to use the kernel binary.
So how about splitting 'kernel-core' into a modules and a kernel binary package? Like this:
kernel-modules-core - the modules from current 'kernel-core'
kernel-modules-standard - current 'kernel-modules' renamed (maybe skip rename, but I think it'll be less confusing that way).
kernel-modules-{extra,internal} - no changes

kernel-binary-bare - the kernel binary from current 'kernel-core'
kernel-binary-uki-virt - unified kernel (same as 'kernel-unified-virt' in the test package repo).

kernel-{doc,tools*,headers,...} - no changes
With that in place kernel-binary-* packages are self-contained. They just install/remove the kernel from /boot and/or ESP on install/remove. No need to check whether *other* packages are present and change behavior based on that.
Comments?
take care, Gerd
Hey Gerd,
Thanks for this changeset.
On 8/31/22 08:46, Gerd Hoffmann wrote:
Hi,
Here is a little patch series to kick off a discussion on pre-generated initrd images and unified kernels. Lets start with a description of the patches:
Patch #1 adds a dracut config file, targeting virtual machines. Given that most physical machines have either sata or nvme disks these days it probably boots most physical systems too.
No critical objections from me, however, just a few long-term questions about this approach.
How are you going to prevent feature-creep in the initrd? What happens when someone asks us to include "driver X" in this general initrd? How do we determine whether "driver X" is or is not appropriate for inclusion?
Patch #2 adds a sub-package with an initrd image.
Patch #3 adds a sub-package with an unified kernel.
These will be built all the time. I'm worried about storage, etc., when adding new sub-packages. Having said that, I do really like the idea ;) and would definitely argue that it is worth it.
The goal is to move away from initrd images being generated on the installed machine. They are generated while building the kernel package instead. Main motivation for this move is to make the distro more robust and more secure.
Completely agreed and a great goal.
When shipping the initrd as rpm it is possible to check it with the usual tools ('rpm --verify' for example). TPM measurements are much more useful because it is possible to pre-calculate the PCR values for a given kernel version.
When shipping a unified kernel image (containing kernel, initrd, cmdline and signature) we get the additional benefit that the initrd is covered by the signature so secure boot will actually be secure.
Thanks again,
P.
On Thu, Sep 01, 2022 at 10:37:03AM -0400, Prarit Bhargava wrote:
Hey Gerd,
Thanks for this changeset.
On 8/31/22 08:46, Gerd Hoffmann wrote:
Hi,
Here is a little patch series to kick off a discussion on pre-generated initrd images and unified kernels. Lets start with a description of the patches:
Patch #1 adds a dracut config file, targeting virtual machines. Given that most physical machines have either sata or nvme disks these days it probably boots most physical systems too.
No critical objections from me, however, just a few long-term questions about this approach.
How are you going to prevent feature-creep in the initrd? What happens when someone asks us to include "driver X" in this general initrd? How do we determine whether "driver X" is or is not appropriate for inclusion?
If we limit the targeted usage to only be VMs, then we limit the scope of drivers significantly, since there's a small finite set of hypervisors we care about providing disk images for (VMware, HyperV, Xen, KVM), and thus we know what drivers are needed in order to be able to get out of the initrd into the root fs.
If we extend usage to bare metal, the scope of drivers potentially becomes a little more open ended, and I don't have a definite answer for what the criteria would be in that case.
Worth noting though that systemd has an extension mechanism
https://www.freedesktop.org/software/systemd/man/systemd-sysext.html
that can be used to augment the pre-built initrd with further content. Fedora could ship certain drivers as separate extension images, or users could build their own extension images (though they would need to deal with their own SecureBoot keys in the latter case).
Patch #2 adds a sub-package with an initrd image.
Patch #3 adds a sub-package with an unified kernel.
These will be built all the time. I'm worried about storage, etc., when adding new sub-packages. Having said that, I do really like the idea ;) and would definitely argue that it is worth it.
For reference, in the kernel-virt-unified sub-RPM that my patches build, I've created two EFI images, each of which is 40 MB in size, so a total of 80 MB for that new sub-RPM. When -debug builds are enabled, you get the same again for kernel-debug-virt-unified, so 160 MB total.
The overall kernel build output for x86_64 is 2.1 GB though, largely thanks to the enormous -debuginfo packages, which dwarfs the extra 160 MB for these EFI images.
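The arithmetic behind those figures, as a small sketch. It assumes 2.1 GB means 2100 MB and that the 2.1 GB figure is the baseline the extra images are compared against; neither is stated explicitly above.

```python
IMAGE_MB = 40            # size of one unified EFI image
IMAGES_PER_SUBRPM = 2    # default and verbose variants
KERNEL_VARIANTS = 2      # regular and -debug builds
BUILD_OUTPUT_MB = 2100   # assumption: 2.1 GB taken as 2100 MB

# Total extra payload: images per sub-RPM times number of kernel variants.
extra_mb = IMAGE_MB * IMAGES_PER_SUBRPM * KERNEL_VARIANTS
share = extra_mb / BUILD_OUTPUT_MB

print(f"{extra_mb} MB extra, about {share:.1%} of the build output")
```

The share only describes build output. It says nothing about what mirrors actually carry, which depends on whether -debuginfo packages are mirrored at all.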
With regards, Daniel
On Thu, Sep 1, 2022 at 9:58 AM Daniel P. Berrangé berrange@redhat.com wrote:
On Thu, Sep 01, 2022 at 10:37:03AM -0400, Prarit Bhargava wrote:
Hey Gerd,
Thanks for this changeset.
On 8/31/22 08:46, Gerd Hoffmann wrote:
Hi,
Here is a little patch series to kick off a discussion on pre-generated initrd images and unified kernels. Lets start with a description of the patches:
Patch #1 adds a dracut config file, targeting virtual machines. Given that most physical machines have either sata or nvme disks these days it probably boots most physical systems too.
No critical objections from me, however, just a few long-term questions about this approach.
How are you going to prevent feature-creep in the initrd? What happens when someone asks us to include "driver X" in this general initrd? How do we determine whether "driver X" is or is not appropriate for inclusion?
If we limit the targeted usage to only be VMs, then we limit the scope of drivers significantly, since there's a small finite set of hypervisors we care about providing disk images for (VMware, HyperV, Xen, KVM), and thus we know what drivers are needed in order to be able to get out of the initrd into the root fs.
If we extend usage to bare metal, the scope of drivers potentially becomes a little more open ended, and I don't have a definite answer for what the criteria would be in that case.
Worth noting though that systemd has an extension mechanism
https://www.freedesktop.org/software/systemd/man/systemd-sysext.html
that can be used to augment the pre-built initrd with further content. Fedora could ship certain drivers as separate extension images, or users could build their own extension images (though they would need to deal with their own SecureBoot keys in the latter case).
There are some other options here for longer term, ways to hopefully make this easier to deal with. I like the concept of this, and it is rather surprising just how many "kernel bugs" end up being badly generated initramfs on a local machine, which this can also help alleviate. I need to think it through a bit, but my initial reaction is that the advantages outweigh the disadvantages.
Patch #2 adds a sub-package with an initrd image.
Patch #3 adds a sub-package with an unified kernel.
These will be built all the time. I'm worried about storage, etc., when adding new sub-packages. Having said that, I do really like the idea ;) and would definitely argue that it is worth it.
For reference, in the kernel-virt-unified sub-RPM that my patches build, I've created two EFI images, each of which is 40 MB in size, so a total of 80 MB for that new sub-RPM. When -debug builds are enabled, you get the same again for kernel-debug-virt-unified, so 160 MB total.
The overall kernel build output for x86_64 is 2.1 GB though, largely thanks to the enormous -debuginfo packages, which dwarfs the extra 160 MB for these EFI images.
The debuginfo rpms are not passed around nearly as much in many contexts, so it is a significant change. I am not saying it is not worthwhile, just saying the size difference does need to be kept in context (how many people mirror repos, but not debuginfo?)
Justin