This is a follow-on from e2d10a04. To reduce the change from the previous behaviour and minimize the chance of any strange corner-case breakage, this sorts the disk list, rather than filtering it.
It also now checks the format_type constraint as well as the mountpoint constraint. There are platforms (PPC) which use a volume as the stage1 target, but don't require it to be mounted; this handles those as well as platforms which require the target volume to be mounted to a specific location.
It's also slightly re-arranged to flow better with the existing check that filters out protected, hidden and unusable disks.
The overall behaviour change can be described as: on platforms which require the stage1 target to be a volume, use the first enumerated disk which contains a potentially valid volume as the 'boot disk' if the user does not explicitly specify one, rather than just using the first enumerated disk and failing if it does not contain a valid stage1 target volume (#1168118).
This also debug logs the result of the 'disks with potential stage1 targets' check. --- pyanaconda/kickstart.py | 36 ++++++++++++++++++++++-------------- 1 file changed, 22 insertions(+), 14 deletions(-)
diff --git a/pyanaconda/kickstart.py b/pyanaconda/kickstart.py index d44b349..4aa4fd8 100644 --- a/pyanaconda/kickstart.py +++ b/pyanaconda/kickstart.py @@ -379,22 +379,30 @@ class Bootloader(commands.bootloader.F21_Bootloader): if self.timeout is not None: storage.bootloader.timeout = self.timeout
- # try to reduce the set of possible 'boot disks' to the disks containing - # a valid stage1 mount point, if there are any, because those disks - # should be searched for a stage1_valid device - disks = storage.disks - if platform.bootStage1ConstraintDict["mountpoints"]: - disks = [p.disk for p in storage.devices if getattr(p.format, "mountpoint", None) in platform.bootStage1ConstraintDict["mountpoints"]] - if not disks: - # but if not, just go ahead and use the set of all disks or - # else we'll error out later - disks = storage.disks - # Throw out drives specified that don't exist or cannot be used (iSCSI # device on an s390 machine) - disk_names = [d.name for d in disks - if not d.format.hidden and not d.protected and - (not blivet.arch.isS390() or not isinstance(d, blivet.devices.iScsiDiskDevice))] + disks = [d for d in storage.disks + if not d.format.hidden and not d.protected and + (not blivet.arch.isS390() or not isinstance(d, blivet.devices.iScsiDiskDevice))] + if platform.bootStage1ConstraintDict["format_types"]: + # sort the list of disks with those that contain possibly-valid + # stage1 partitions first, so we'll pick a disk with one as + # bootDrive if it has not been explicitly set: RHBZ #1168118 + s1vols = [vol for vol in storage.devices if + getattr(vol.format, "type", None) in + platform.bootStage1ConstraintDict["format_types"]] + if platform.bootStage1ConstraintDict["mountpoints"]: + # further sort by volumes with valid mount points, if + # the platform requires this + s1disks = [vol.disk for vol in s1vols if + getattr(vol.format, "mountpoint", None) in + platform.bootStage1ConstraintDict["mountpoints"]] + else: + s1disks = [vol.disk for vol in s1vols] + log.debug("Disks with potentially valid stage1 volumes: %s", + ' '.join([d.name for d in s1disks])) + disks.sort(key=lambda x: x in s1disks, reverse=True) + disk_names = [d.name for d in disks] diskSet = set(disk_names)
for drive in self.driveorder[:]:
On Fri, 2014-11-28 at 13:51 -0800, Adam Williamson wrote:
This is a follow-on from e2d10a04. To reduce the change from the previous behaviour and minimize the chance of any strange corner-case breakage, this sorts the disk list, rather than filtering it.
It also now checks the format_type constraint as well as the mountpoint constraint. There are platforms (PPC) which use a volume as the stage1 target, but don't require it to be mounted; this handles those as well as platforms which require the target volume to be mounted to a specific location.
After a mere six hours or so of watching a veeeeery sloooooow qemu ppc64 install grind away, I can say this looks like it works for the PPC64 case (as a test of the 'stage1 is a partition with no mountpoint' path): I successfully did a custom part install placing the prepboot partition on /dev/sdb without specifying a boot disk, as expected anaconda picked /dev/sdb as the 'boot disk' and there were no errors. Bootloader installation succeeded and the installed system boots from sdb. I also checked that without my change, this case behaves just like the UEFI case initially identified.
I'll probably run a couple of kickstart tests, but I think this is pretty sane and safe now.
On Fri, 2014-11-28 at 13:51 -0800, Adam Williamson wrote:
This is a follow-on from e2d10a04. To reduce the change from the previous behaviour and minimize the chance of any strange corner-case breakage, this sorts the disk list, rather than filtering it.
It also now checks the format_type constraint as well as the mountpoint constraint. There are platforms (PPC) which use a volume as the stage1 target, but don't require it to be mounted; this handles those as well as platforms which require the target volume to be mounted to a specific location.
It's also slightly re-arranged to flow better with the existing check that filters out protected, hidden and unusable disks.
The overall behaviour change can be described as: on platforms which require the stage1 target to be a volume, use the first enumerated disk which contains a potentially valid volume as the 'boot disk' if the user does not explicitly specify one, rather than just using the first enumerated disk and failing if it does not contain a valid stage1 target volume (#1168118).
This also debug logs the result of the 'disks with potential stage1 targets' check.
I like this improvement with two comments: 1) using the word 'volume' and accordingly named variables makes me think these are (only) LVM or BTRFS volumes which is not true. 'partition' is also problematic, because LVM could, in some cases, be used, but it's more typical to use partitions for stage1 devices. Or what about 'device' or 'dev'? That fits best I'd say.
2) see one neat pick below:
pyanaconda/kickstart.py | 36 ++++++++++++++++++++++-------------- 1 file changed, 22 insertions(+), 14 deletions(-)
diff --git a/pyanaconda/kickstart.py b/pyanaconda/kickstart.py index d44b349..4aa4fd8 100644 --- a/pyanaconda/kickstart.py +++ b/pyanaconda/kickstart.py @@ -379,22 +379,30 @@ class Bootloader(commands.bootloader.F21_Bootloader): if self.timeout is not None: storage.bootloader.timeout = self.timeout
# try to reduce the set of possible 'boot disks' to the disks containing
# a valid stage1 mount point, if there are any, because those disks
# should be searched for a stage1_valid device
disks = storage.disks
if platform.bootStage1ConstraintDict["mountpoints"]:
disks = [p.disk for p in storage.devices if getattr(p.format, "mountpoint", None) in platform.bootStage1ConstraintDict["mountpoints"]]
if not disks:
# but if not, just go ahead and use the set of all disks or
# else we'll error out later
disks = storage.disks
# Throw out drives specified that don't exist or cannot be used (iSCSI # device on an s390 machine)
disk_names = [d.name for d in disks
if not d.format.hidden and not d.protected and
(not blivet.arch.isS390() or not isinstance(d, blivet.devices.iScsiDiskDevice))]
disks = [d for d in storage.disks
if not d.format.hidden and not d.protected and
(not blivet.arch.isS390() or not isinstance(d, blivet.devices.iScsiDiskDevice))]
if platform.bootStage1ConstraintDict["format_types"]:
# sort the list of disks with those that contain possibly-valid
# stage1 partitions first, so we'll pick a disk with one as
# bootDrive if it has not been explicitly set: RHBZ #1168118
s1vols = [vol for vol in storage.devices if
getattr(vol.format, "type", None) in
platform.bootStage1ConstraintDict["format_types"]]
This, after having a better name, could be a generator instead of a list comprehension (using parentheses instead of square brackets).
In relation to this I'd like to mention that your Python skills became waaaay better over the recent months. Nice!
On Mon, 2014-12-01 at 09:12 +0100, Vratislav Podzimek wrote:
This also debug logs the result of the 'disks with potential stage1 targets' check.
I like this improvement with two comments:
- using the word 'volume' and accordingly named variables makes me
think these are (only) LVM or BTRFS volumes which is not true. 'partition' is also problematic, because LVM could, in some cases, be used, but it's more typical to use partitions for stage1 devices. Or what about 'device' or 'dev'? That fits best I'd say.
I thought 'volume' was more or less the standard term anaconda uses in this case - you know, when the first word that comes into your head is 'partition', but it's not actually quite correct =). So that's why I used it.
- see one neat pick below:
if platform.bootStage1ConstraintDict["format_types"]:
# sort the list of disks with those that contain possibly-valid
# stage1 partitions first, so we'll pick a disk with one as
# bootDrive if it has not been explicitly set: RHBZ #1168118
s1vols = [vol for vol in storage.devices if
getattr(vol.format, "type", None) in
platform.bootStage1ConstraintDict["format_types"]]
This, after having a better name, could be a generator instead of a list comprehension (using parentheses instead of square brackets).
What would be the advantage of that, in this case? The list isn't likely to be particularly long, or anything. I don't mind doing it, I'd just like to learn why ;)
In relation to this I'd like to mention that your Python skills became waaaay better over the recent months. Nice!
Thanks! It's not hard when the starting point is 'non-existent' ;)
On Mon, 2014-12-01 at 00:22 -0800, Adam Williamson wrote:
On Mon, 2014-12-01 at 09:12 +0100, Vratislav Podzimek wrote:
This also debug logs the result of the 'disks with potential stage1 targets' check.
I like this improvement with two comments:
- using the word 'volume' and accordingly named variables makes me
think these are (only) LVM or BTRFS volumes which is not true. 'partition' is also problematic, because LVM could, in some cases, be used, but it's more typical to use partitions for stage1 devices. Or what about 'device' or 'dev'? That fits best I'd say.
I thought 'volume' was more or less the standard term anaconda uses in this case - you know, when the first word that comes into your head is 'partition', but it's not actually quite correct =). So that's why I used it.
Standard term anaconda uses for something? That's hard to believe. ;) I still think 'dev' would be best.
- see one neat pick below:
if platform.bootStage1ConstraintDict["format_types"]:
# sort the list of disks with those that contain possibly-valid
# stage1 partitions first, so we'll pick a disk with one as
# bootDrive if it has not been explicitly set: RHBZ #1168118
s1vols = [vol for vol in storage.devices if
getattr(vol.format, "type", None) in
platform.bootStage1ConstraintDict["format_types"]]
This, after having a better name, could be a generator instead of a list comprehension (using parentheses instead of square brackets).
What would be the advantage of that, in this case? The list isn't likely to be particularly long, or anything. I don't mind doing it, I'd just like to learn why ;)
The only difference is that there won't be a temporary list created (allocated, initialized,...) in the process. The expression is only used to pre-filter some list for some other expression. Such "pre-filters" are better done as generators instead of lists. It would make absolutely no real difference in this case, however, that's why I called it a "neat-pick".
In relation to this I'd like to mention that your Python skills became waaaay better over the recent months. Nice!
Thanks! It's not hard when the starting point is 'non-existent' ;)
That's hard to argue about. :)
----- Original Message -----
From: "Adam Williamson" awilliam@redhat.com To: anaconda-patches@lists.fedorahosted.org Sent: Friday, November 28, 2014 4:51:35 PM Subject: [f21-branch][PATCH] clean up stage1 volume check, check format as well as mountpoint
This is a follow-on from e2d10a04. To reduce the change from the previous behaviour and minimize the chance of any strange corner-case breakage, this sorts the disk list, rather than filtering it.
It also now checks the format_type constraint as well as the mountpoint constraint. There are platforms (PPC) which use a volume as the stage1 target, but don't require it to be mounted; this handles those as well as platforms which require the target volume to be mounted to a specific location.
It's also slightly re-arranged to flow better with the existing check that filters out protected, hidden and unusable disks.
The overall behaviour change can be described as: on platforms which require the stage1 target to be a volume, use the first enumerated disk which contains a potentially valid volume as the 'boot disk' if the user does not explicitly specify one, rather than just using the first enumerated disk and failing if it does not contain a valid stage1 target volume (#1168118).
This also debug logs the result of the 'disks with potential stage1 targets' check.
pyanaconda/kickstart.py | 36 ++++++++++++++++++++++-------------- 1 file changed, 22 insertions(+), 14 deletions(-)
diff --git a/pyanaconda/kickstart.py b/pyanaconda/kickstart.py index d44b349..4aa4fd8 100644 --- a/pyanaconda/kickstart.py +++ b/pyanaconda/kickstart.py @@ -379,22 +379,30 @@ class Bootloader(commands.bootloader.F21_Bootloader): if self.timeout is not None: storage.bootloader.timeout = self.timeout
# try to reduce the set of possible 'boot disks' to the disks
containing
# a valid stage1 mount point, if there are any, because those disks
# should be searched for a stage1_valid device
disks = storage.disks
if platform.bootStage1ConstraintDict["mountpoints"]:
disks = [p.disk for p in storage.devices if getattr(p.format,
"mountpoint", None) in platform.bootStage1ConstraintDict["mountpoints"]]
if not disks:
# but if not, just go ahead and use the set of all disks or
# else we'll error out later
disks = storage.disks
# Throw out drives specified that don't exist or cannot be used (iSCSI # device on an s390 machine)
disk_names = [d.name for d in disks
if not d.format.hidden and not d.protected and
(not blivet.arch.isS390() or not isinstance(d,
blivet.devices.iScsiDiskDevice))]
disks = [d for d in storage.disks
if not d.format.hidden and not d.protected and
(not blivet.arch.isS390() or not isinstance(d,
blivet.devices.iScsiDiskDevice))]
if platform.bootStage1ConstraintDict["format_types"]:
# sort the list of disks with those that contain possibly-valid
# stage1 partitions first, so we'll pick a disk with one as
# bootDrive if it has not been explicitly set: RHBZ #1168118
s1vols = [vol for vol in storage.devices if
getattr(vol.format, "type", None) in
platform.bootStage1ConstraintDict["format_types"]]
if platform.bootStage1ConstraintDict["mountpoints"]:
# further sort by volumes with valid mount points, if
# the platform requires this
s1disks = [vol.disk for vol in s1vols if
getattr(vol.format, "mountpoint", None) in
platform.bootStage1ConstraintDict["mountpoints"]]
else:
s1disks = [vol.disk for vol in s1vols]
log.debug("Disks with potentially valid stage1 volumes: %s",
' '.join([d.name for d in s1disks]))
disks.sort(key=lambda x: x in s1disks, reverse=True)
AFAICT the whole chunk of code above, other than the assignment to disks is a no-op. The code goes to a lot of work to find this sldisks value, and then sorts disks using that value. But then, the sortedness of the disks matters not a whit, since the sorted names are just passed to set() which is unsorted by definition.
Do you mean filter, instead of sort, or is there another patch still un-applied which makes use of the sorted disks value in the execute() method?
disk_names = [d.name for d in disks] diskSet = set(disk_names) for drive in self.driveorder[:]:
-- 2.1.0
anaconda-patches mailing list anaconda-patches@lists.fedorahosted.org https://lists.fedorahosted.org/mailman/listinfo/anaconda-patches
- mulhern
----- Original Message -----
From: "Anne Mulhern" amulhern@redhat.com To: "anaconda patch review" anaconda-patches@lists.fedorahosted.org Sent: Monday, December 1, 2014 9:51:54 AM Subject: Re: [f21-branch][PATCH] clean up stage1 volume check, check format as well as mountpoint
----- Original Message -----
From: "Adam Williamson" awilliam@redhat.com To: anaconda-patches@lists.fedorahosted.org Sent: Friday, November 28, 2014 4:51:35 PM Subject: [f21-branch][PATCH] clean up stage1 volume check, check format as well as mountpoint
This is a follow-on from e2d10a04. To reduce the change from the previous behaviour and minimize the chance of any strange corner-case breakage, this sorts the disk list, rather than filtering it.
It also now checks the format_type constraint as well as the mountpoint constraint. There are platforms (PPC) which use a volume as the stage1 target, but don't require it to be mounted; this handles those as well as platforms which require the target volume to be mounted to a specific location.
It's also slightly re-arranged to flow better with the existing check that filters out protected, hidden and unusable disks.
The overall behaviour change can be described as: on platforms which require the stage1 target to be a volume, use the first enumerated disk which contains a potentially valid volume as the 'boot disk' if the user does not explicitly specify one, rather than just using the first enumerated disk and failing if it does not contain a valid stage1 target volume (#1168118).
This also debug logs the result of the 'disks with potential stage1 targets' check.
pyanaconda/kickstart.py | 36 ++++++++++++++++++++++-------------- 1 file changed, 22 insertions(+), 14 deletions(-)
diff --git a/pyanaconda/kickstart.py b/pyanaconda/kickstart.py index d44b349..4aa4fd8 100644 --- a/pyanaconda/kickstart.py +++ b/pyanaconda/kickstart.py @@ -379,22 +379,30 @@ class Bootloader(commands.bootloader.F21_Bootloader): if self.timeout is not None: storage.bootloader.timeout = self.timeout
# try to reduce the set of possible 'boot disks' to the disks
containing
# a valid stage1 mount point, if there are any, because those
disks
# should be searched for a stage1_valid device
disks = storage.disks
if platform.bootStage1ConstraintDict["mountpoints"]:
disks = [p.disk for p in storage.devices if getattr(p.format,
"mountpoint", None) in platform.bootStage1ConstraintDict["mountpoints"]]
if not disks:
# but if not, just go ahead and use the set of all disks
or
# else we'll error out later
disks = storage.disks
# Throw out drives specified that don't exist or cannot be used (iSCSI # device on an s390 machine)
disk_names = [d.name for d in disks
if not d.format.hidden and not d.protected and
(not blivet.arch.isS390() or not isinstance(d,
blivet.devices.iScsiDiskDevice))]
disks = [d for d in storage.disks
if not d.format.hidden and not d.protected and
(not blivet.arch.isS390() or not isinstance(d,
blivet.devices.iScsiDiskDevice))]
if platform.bootStage1ConstraintDict["format_types"]:
# sort the list of disks with those that contain
possibly-valid
# stage1 partitions first, so we'll pick a disk with one as
# bootDrive if it has not been explicitly set: RHBZ #1168118
s1vols = [vol for vol in storage.devices if
getattr(vol.format, "type", None) in
platform.bootStage1ConstraintDict["format_types"]]
if platform.bootStage1ConstraintDict["mountpoints"]:
# further sort by volumes with valid mount points, if
# the platform requires this
s1disks = [vol.disk for vol in s1vols if
getattr(vol.format, "mountpoint", None) in
platform.bootStage1ConstraintDict["mountpoints"]]
else:
s1disks = [vol.disk for vol in s1vols]
log.debug("Disks with potentially valid stage1 volumes: %s",
' '.join([d.name for d in s1disks]))
disks.sort(key=lambda x: x in s1disks, reverse=True)
AFAICT the whole chunk of code above, other than the assignment to disks is a no-op. The code goes to a lot of work to find this sldisks value, and then sorts disks using that value. But then, the sortedness of the disks matters not a whit, since the sorted names are just passed to set() which is unsorted by definition.
Do you mean filter, instead of sort, or is there another patch still un-applied which makes use of the sorted disks value in the execute() method?
I do see that disk_names[0] is accessed later on in the execute method, so the sorting does make a difference. OTOH, the line of code that gets the value from disk_names by position was committed early in 2012, and was not influenced by the previous version of this patch.
Under the assumption that changing what is disk_names[0], i.e., which disk is assigned to self.bootDrive is a main purpose of this updated patch, I've got a few more comments, coming up.
disk_names = [d.name for d in disks] diskSet = set(disk_names) for drive in self.driveorder[:]:
-- 2.1.0
anaconda-patches mailing list anaconda-patches@lists.fedorahosted.org https://lists.fedorahosted.org/mailman/listinfo/anaconda-patches
- mulhern
anaconda-patches mailing list anaconda-patches@lists.fedorahosted.org https://lists.fedorahosted.org/mailman/listinfo/anaconda-patches
- mulhern
On Mon, 2014-12-01 at 10:48 -0500, Anne Mulhern wrote:
I do see that disk_names[0] is accessed later on in the execute method, so the sorting does make a difference. OTOH, the line of code that gets the value from disk_names by position was committed early in 2012, and was not influenced by the previous version of this patch.
I'm pretty sure it was, I did test it quite carefully. The previous version defined 'disks' as 'all disks with valid stage1 target volumes' if there were any, or simply disks = storage.disks if not. Then it changed the disk_names list comprehension to:
disk_names = [d.name for d in disks...]
so that disk_names[0] would be a disk with a valid stage1 target volume, if there were any, or the same as before (first selected disk that isn't an iSCSI disk on s390, basically) if not. *Probably* there wouldn't have been any real-world case where the practical results of the two approaches differ, but the second one is clearly less theoretically likely to result in problematic behaviour.
On Mon, 2014-12-01 at 10:48 -0500, Anne Mulhern wrote:
Under the assumption that changing what is disk_names[0], i.e., which disk is assigned to self.bootDrive is a main purpose of this updated patch, I've got a few more comments, coming up.
I talked things over with Anne and she has a slightly re-arranged version of my change coming up soon, but it's more or less the same.
To clear up some questions that arose here and in blocker review:
1. The point of the patch is to change what self.bootDrive and storage.bootloader.stage1_disk get set to in execute().
2. With the current way anaconda works, execute() should rarely fail. It should only fail if there is no possibility of a valid stage1 target device ever being arrived at with the current set of chosen target disks. It can't fail just because we don't have one *right now*, because it often gets hit before partitioning is complete. In other words, to use the UEFI case as an example, it's expected to complete and to set storage.bootloader.stage1_disk to something even when no /boot/efi mount point has yet been created. This is arguably an odd design, but it's what we have, and it's obviously a bad idea to change it for F21 now. This is why the check implemented by the patch cannot be a filter that causes execute() to throw an exception if there is no potentially-valid stage1 target volume.
3. Anne pointed out that just reading the patch in isolation with the knowledge that this code is hit at least twice, and usually once before /boot/efi has been configured, it seems like it shouldn't work - because the first run will explicitly set self.bootDrive to the first disk, and the second run should then fail, because if self.bootDrive is explicitly set when it's executed, it's not supposed to consider other disks as possibly-valid choices, just fail if the explicitly-set self.bootDrive is not valid.
In practice, the patch really does work. What actually appears to happen is that at least in the interactive cases, something else run between the first and second calls to execute() tests, decides bootDrive isn't valid, and sets it to "", so on the second execute(), all drives are 'fair game' again.
We both suspect there may possibly be cases where bootDrive does not get reset to "" between calls to execute() that might be considered 'buggy', but we agree that this change doesn't seem like it can make things *worse* - only that it may not fix every possible 'buggy' case. One suspect case I'd like to try is a kickstart that specifies a /boot/efi on a disk other than the first but does not explicitly specify a 'boot disk', I'll try and test that today. Still, I'm not sure I'd want to try and fix that for F21, as it looks like more invasive changes might be needed to fix such a case (the obvious thing to try is to have execute() set a throwaway variable, rather than self.bootDrive , to disk_names[0], but I'd want to consider the impact of such a change quite carefully and test it out quite a lot, bootDrive is read and set in all sorts of places and changing something that sets it could very possibly have surprising consequences.
4. Dlehman wondered if my change means people with existing configurations have to 'be careful'. I think I can confidently say that in the UEFI-type case where the stage1 target is a *mounted* partition, the answer is 'no'. This patch is only going to change existing behaviour when it sees an EFI FAT-formatted volume with /boot/efi as its mount point. Any time someone has explicitly set a /boot/efi mount point, it seems inarguable that the intent of that action is "please use this as the stage1 target device", whether or not they actually went into the 'Full disk summary and bootloader' screen and nominated the disk on which they created /boot/efi as the 'boot disk'. Such users are definitely going to be happier if anaconda decides "hey, here's a /boot/efi, let's use it" than if it says "this /boot/efi is not on the first disk, so I'm going to declare that the configuration is invalid!"
I'm 99.99% sure there's no case where this change will suddenly lead to us using a different disk as the 'boot disk' just when someone uses guided partitioning or whatever.
It's slightly more questionable in the case where the stage1 target is a non-mounted partition (prepboot on PPC64, for instance). I don't know if I'll have time to test, but I think there may be a change in behaviour in a scenario where you install to an existing (e.g.) PPC64 system:
* with two disks * which has a prepboot partition on the second disk but not the first * if you pick both disks as installation targets * if you don't explicitly set a bootloader location
If you do all that with guided partitioning, I believe the patch would result in us using the prepboot on the second disk as the stage1 target; currently I suspect we'd wind up creating a prepboot partition on the first disk and using it as the stage1 target. If you do all that with custom partitioning, with the patch we'd pick the prepboot partition on the second disk as the boot target, without the patch we'd fail.
I *guess* you could describe the guided partitioning case there as an unexpected change. I can test and see if it's actually what happens, but it'll take an hour or two.
If we wanted to be super-conservative, we can quite easily adjust the patch to only apply to the partition-with-mount-point case.
Newest version of patch attached.
Basic changes from previous are:
* Don't bother to sorts disk names before passing to set(). * Only calculate s1disks if needed for designating bootDrive. * Don't sort the disks since we only need a best. * Treat constraints as if they are orthogonal, in case sometime it turns out that they are.
Assumes that disks has at least one element; so did the previous code.
- mulhern
----- Original Message ----- From: "Adam Williamson" awilliam@redhat.com To: "anaconda patch review" anaconda-patches@lists.fedorahosted.org Sent: Monday, December 1, 2014 2:11:29 PM Subject: Re: [f21-branch][PATCH] clean up stage1 volume check, check format as well as mountpoint
On Mon, 2014-12-01 at 10:48 -0500, Anne Mulhern wrote:
Under the assumption that changing what is disk_names[0], i.e., which disk is assigned to self.bootDrive is a main purpose of this updated patch, I've got a few more comments, coming up.
I talked things over with Anne and she has a slightly re-arranged version of my change coming up soon, but it's more or less the same.
To clear up some questions that arose here and in blocker review:
1. The point of the patch is to change what self.bootDrive and storage.bootloader.stage1_disk get set to in execute().
2. With the current way anaconda works, execute() should rarely fail. It should only fail if there is no possibility of a valid stage1 target device ever being arrived at with the current set of chosen target disks. It can't fail just because we don't have one *right now*, because it often gets hit before partitioning is complete. In other words, to use the UEFI case as an example, it's expected to complete and to set storage.bootloader.stage1_disk to something even when no /boot/efi mount point has yet been created. This is arguably an odd design, but it's what we have, and it's obviously a bad idea to change it for F21 now. This is why the check implemented by the patch cannot be a filter that causes execute() to throw an exception if there is no potentially-valid stage1 target volume.
3. Anne pointed out that just reading the patch in isolation with the knowledge that this code is hit at least twice, and usually once before /boot/efi has been configured, it seems like it shouldn't work - because the first run will explicitly set self.bootDrive to the first disk, and the second run should then fail, because if self.bootDrive is explicitly set when it's executed, it's not supposed to consider other disks as possibly-valid choices, just fail if the explicitly-set self.bootDrive is not valid.
In practice, the patch really does work. What actually appears to happen is that at least in the interactive cases, something else run between the first and second calls to execute() tests, decides bootDrive isn't valid, and sets it to "", so on the second execute(), all drives are 'fair game' again.
We both suspect there may possibly be cases where bootDrive does not get reset to "" between calls to execute() that might be considered 'buggy', but we agree that this change doesn't seem like it can make things *worse* - only that it may not fix every possible 'buggy' case. One suspect case I'd like to try is a kickstart that specifies a /boot/efi on a disk other than the first but does not explicitly specify a 'boot disk', I'll try and test that today. Still, I'm not sure I'd want to try and fix that for F21, as it looks like more invasive changes might be needed to fix such a case (the obvious thing to try is to have execute() set a throwaway variable, rather than self.bootDrive , to disk_names[0], but I'd want to consider the impact of such a change quite carefully and test it out quite a lot, bootDrive is read and set in all sorts of places and changing something that sets it could very possibly have surprising consequences.
4. Dlehman wondered if my change means people with existing configurations have to 'be careful'. I think I can confidently say that in the UEFI-type case where the stage1 target is a *mounted* partition, the answer is 'no'. This patch is only going to change existing behaviour when it sees an EFI FAT-formatted volume with /boot/efi as its mount point. Any time someone has explicitly set a /boot/efi mount point, it seems inarguable that the intent of that action is "please use this as the stage1 target device", whether or not they actually went into the 'Full disk summary and bootloader' screen and nominated the disk on which they created /boot/efi as the 'boot disk'. Such users are definitely going to be happier if anaconda decides "hey, here's a /boot/efi, let's use it" than if it says "this /boot/efi is not on the first disk, so I'm going to declare that the configuration is invalid!"
I'm 99.99% sure there's no case where this change will suddenly lead to us using a different disk as the 'boot disk' just when someone uses guided partitioning or whatever.
It's slightly more questionable in the case where the stage1 target is a non-mounted partition (prepboot on PPC64, for instance). I don't know if I'll have time to test, but I think there may be a change in behaviour in a scenario where you install to an existing (e.g.) PPC64 system:
* with two disks * which has a prepboot partition on the second disk but not the first * if you pick both disks as installation targets * if you don't explicitly set a bootloader location
If you do all that with guided partitioning, I believe the patch would result in us using the prepboot on the second disk as the stage1 target; currently I suspect we'd wind up creating a prepboot partition on the first disk and using it as the stage1 target. If you do all that with custom partitioning, with the patch we'd pick the prepboot partition on the second disk as the boot target, without the patch we'd fail.
I *guess* you could describe the guided partitioning case there as an unexpected change. I can test and see if it's actually what happens, but it'll take an hour or two.
If we wanted to be super-conservative, we can quite easily adjust the patch to only apply to the partition-with-mount-point case.
On Mon, 2014-12-01 at 15:29 -0500, Anne Mulhern wrote:
Newest version of patch attached.
Basic changes from previous are:
- Don't bother to sorts disk names before passing to set().
- Only calculate s1disks if needed for designating bootDrive.
- Don't sort the disks since we only need a best.
- Treat constraints as if they are orthogonal, in case sometime it turns out
that they are.
Assumes that disks has at least one element; so did the previous code.
AFAICT the practical effect of this is the same as my v2 (and, in any real-world case, the same as v1, unless someone can find a way to make the stage1 target mount point be on a hidden or protected disk). I'm obviously not really qualified to evaluate Anne's refinements/adjustments to my implementation, but I can certainly say I have no objections to any of it that I can see.
On Mon, 2014-12-01 at 15:29 -0500, Anne Mulhern wrote:
Newest version of patch attached.
Basic changes from previous are:
- Don't bother to sorts disk names before passing to set().
- Only calculate s1disks if needed for designating bootDrive.
- Don't sort the disks since we only need a best.
- Treat constraints as if they are orthogonal, in case sometime it turns out
that they are.
Assumes that disks has at least one element; so did the previous code.
So the F21 ship has sailed now, just wanted to follow up on the status of the changes that originated from me as regards F22.
I know dlehman had some concerns, but I do still believe this is the right thing to do *at least* for the mount point constraint case, for master. That one seems clear cut to me. The logic is: if we hit Bootloader.execute without self.bootDrive explicitly set, pick a disk that has a valid stage1 mountpoint on it as bootDrive if possible.
I just can't see anything wrong with that, regardless of what else we change with bootloaders. We're already choosing a bootDrive ourselves in this case, we're just doing it badly. I can't see any way in which this logic for doing it is not an improvement. If the user has set a valid stage1 target mountpoint for a partition, it seems inarguably the most logical assumption that what they want us to do is use it as the stage1 target. This isn't the perfect fix to all bootloader issues ever, but I really think it's a sensible improvement for a significant case.
The case where there's a format constraint but not a mountpoint constraint - i.e. the stage1 target is required to be a partition of a particular type, but not with a particular mount point - is more arguable, I think. In some cases on that path we definitely don't have any clear expectation on the user's part, and we're basically guessing, and changing the logic of how we guess may be a surprise to someone. Maybe they've got used to us creating a new boot partition on the first selected disk and using it as the stage1 target, in that case (which is, I think, what we do on the guided path, there). If we started preferring a disk with an existing boot partition, that might be an unexpected surprise.
For the record I did go and look at what the official documentation claims we do, to see if it matches what we actually do, and I'm pretty sure it's just wrong. It claims we install the bootloader on the same disk as the root partition if the user doesn't specify a boot target disk:
====
https://docs.fedoraproject.org/en-US/Fedora/20/html/Installation_Guide/s1-x8...
"The installation program installs GRUB in the master boot record, or MBR, of the device for the root file system."
https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/7/htm...
"The installation program installs GRUB2 either in the master boot record (MBR) or the GUID partition table (GPT) of the device for the root file system."
====
That's wrong, we don't do that (maybe we used to?) If you do an install to two disks, and put all the Fedora partitions on the second, we'll (try and) write the bootloader to the first disk if you don't explicitly choose the second. So, with the patch we would not conform to the documented behaviour, but then we already don't. =)
On Wed, 2014-12-03 at 16:54 -0800, Adam Williamson wrote:
On Mon, 2014-12-01 at 15:29 -0500, Anne Mulhern wrote:
Newest version of patch attached.
Basic changes from previous are:
- Don't bother to sorts disk names before passing to set().
- Only calculate s1disks if needed for designating bootDrive.
- Don't sort the disks since we only need a best.
- Treat constraints as if they are orthogonal, in case sometime it
turns out that they are.
Assumes that disks has at least one element; so did the previous code.
So the F21 ship has sailed now, just wanted to follow up on the status of the changes that originated from me as regards F22.
So...ping?
I do recall I spent a rather sleepless weekend trying to get this fixed for F22, and so I'm a bit sad to see it hasn't even gotten formally reviewed for F23 yet.
I do recall dlehman had some questions about it on IRC, but no formal review, and AFAICT he hasn't implemented something else (as he seemed to suggest he was going to do). So I think this is probably just as bad in F23 as it was in F22 (I haven't checked yet, but I will).
I really really think that *at least* the part of this that applies to *mounted* stage1 target partitions is a clear improvement on current code and should be implemented, if no-one's going to implement anything else. The user experience of this issue is not good.
I'd be OK with hanging back on the part that applies to *umounted* stage1 target partitions on the basis that it's less safe to fiddle with the guessing approach there. But it'd be really nice to at least get a proper mailing list review of this?
anaconda-patches@lists.fedorahosted.org