On Mon, Jan 17, 2022 at 2:00 PM stan via users <users@lists.fedoraproject.org> wrote:
On Sun, 16 Jan 2022 21:07:29 -0500
Fulko Hew <fulko.hew@gmail.com> wrote:

> I decided to log out and log back in to my X11 based KDE session just
> now, and I saw that 'Discover' was telling me I had updates available.
> So I said 'go ahead'.
> Eventually, it said I needed to reboot, so I did.
> After 4 (or 5) reboots that the machine drove itself through,
> the last reboot failed to start up to a GUI session.

Suspicious that this is not deterministic.  It should either fail
identically every time or restart every time (in my opinion).

I don't think you understand what I was saying.
The update process that 'Discover' performed was 'strange' to me.
For the last 20 years, I've used rpm, yum and dnf to download and
install updates, and if I wanted... I'd reboot to use any new kernel
that may have been updated.

This time I chose to use 'Discover', because (for a change) it actually
told me there was new stuff.  (I'm getting the feeling that 'Discover' only runs at
login time, something I do only once every few months, i.e. at every power failure.)

So after 'Discover' downloaded and (apparently) updated everything, it asked me
to reboot.  So I used Discover's reboot button to proceed.  During the first
reboot cycle I watched the boot messages go by, and I saw words to the effect
that it was doing some additional post-reboot updates.  It finished them and then
said it was rebooting.

On that next boot, I watched again, while it talked about other updates it needed to do,
and... and another reboot.

After the nth reboot, I no longer saw any 'installing' activity, and it went all the
way through and then ... nothing.  No more boot messages, and no GUI either.

So I DID do a cold reboot, and then it went through the standard boot messages
until it hit those errors I mentioned and dropped me into that emergency boot prompt.


> As a matter of fact, it dropped me down and told me it needed to
> enter an emergency boot
> and asked for my root password.
> The message also told me to look at 'journalctl -xb'
> After a few thousand lines of info, I saw nothing of significance
> other than it hadn't finished.

You could try
journalctl -rxb
so that the last messages are presented first.  It is likely that that
is where the error will be.

journalctl -xb gave me those error messages I provided.
So yes, the first error was that it couldn't mount /boot/efi.
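For narrowing a few thousand journal lines down to the failures, something like the sketch below might help. It assumes the journal is readable from the emergency shell. `-b`, `-p err` and `--no-pager` are standard journalctl options; the function names `boot_errors` and `failures_only` are made up for illustration.

```shell
# Show only messages of priority "err" or worse from the current boot.
boot_errors() {
    journalctl -b -p err --no-pager
}

# Reduce a boot log read from stdin to the lines that report a failure,
# using the wording systemd used in the messages quoted in this thread.
failures_only() {
    grep -E 'Failed to |Dependency failed for '
}
```

Piping `journalctl -xb --no-pager` through `failures_only | head -n 1` should then surface the earliest failure, which here would be the /boot/efi mount.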


> The other suggestion was 'systemctl default'.
> That resulted in the following:
>
> Failed to mount /boot/efi
> Dependency failed for Local File System
> Dependency failed for Mark the need to relabel after reboot
> Failed to mount RPC File System
> Dependency failed for rpc-pipefs.target
> Dependency failed for RPC security service for NFS client and server
> Failed to start Load Kernel Modules
> Failed to mount Arbitrary Executable File Formats File System
> Failed to mount Arbitrary Executable File Formats File System
> Failed to mount Arbitrary Executable File Formats File System
> Failed to mount Arbitrary Executable File Formats File System
> Failed to mount Arbitrary Executable File Formats File System
> Failed to start Set Up Additional Binary Formats
>
> ... and then nothing.  I had to cold start

It seems like a hardware error to me from the symptoms.  How can the
kernel not mount /boot/efi unless the drive has either power issues or
seek errors / bad sectors?  This is really basic.


I have read other postings where people had problems with missing VFAT
support in their kernel, which is needed to mount that filesystem.
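One way to test the missing-VFAT theory is to look for the module file under the module tree of the kernel that fails. This is only a sketch: it assumes VFAT is built as a loadable module rather than compiled into the kernel, and that modules live under /lib/modules/ in a directory named after the kernel version. The helper name `has_module` is hypothetical.

```shell
# Succeed if a module file named NAME.ko (possibly compressed, for example
# NAME.ko.xz) exists anywhere under the module directory DIR.
# Usage: has_module DIR NAME
has_module() {
    find "$1" -name "$2.ko*" 2>/dev/null | grep -q .
}
```

Running it once against the module directory of the working kernel and once against that of the failing kernel, with `vfat` as the name, would show whether the two differ. An empty or missing directory for the newer kernel would also fit the 'Failed to start Load Kernel Modules' message.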

> and that brings me back to the same issues.
> Trying to reboot a previous kernel doesn't even result in any boot
> messages.
>
> I now have a 'non-working' machine.
> Suggestions are welcome (and needed)!

Long shots.

It might be software, but it could be just a coincidence that it chose
this time for a hardware error to expose itself.

Do you have a list of what was updated?  It would be good to see if
there are any updates that might have caused this to happen via
software.  I'm not sure what would stop /boot/efi from being mounted.


Sadly I don't have that list.  There were about 60 components
including the kernel that were updated.
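The list can probably be reconstructed after the fact from the rpm database, which records an install time for every package no matter which front end installed it. A minimal sketch, assuming the timestamps print in a day-month-year form (this depends on the locale); the function names are illustrative only.

```shell
# List every installed package, newest first, with its install timestamp.
recent_packages() {
    rpm -qa --last
}

# Keep only the lines from stdin whose timestamp contains the given text.
# Usage: recent_packages | filter_by_date "16 Jan 2022"
filter_by_date() {
    grep -F -- "$1"
}
```

Matching on a fixed string keeps the filter independent of the exact timestamp layout, at the cost of having to type the date the way rpm prints it.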

Did you power down completely at any point?  That will allow components
to lose any retained state.

You could, while completely powered down, try reseating internal
components, especially drive connectors.

If you can reach the BIOS menu, look at the power supply numbers.  Are
they at or near spec?

Can you boot a livecd / usb so you can do checks of the drives to see
if they are still functioning properly, maybe a smartctl (smartctl -a
/dev/[drive designation])? If a live image boots and runs, it will
indicate that your memory is (probably) not the issue as well.
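For a quick verdict before reading the full `smartctl -a` report, the `-H` option asks the drive for its overall health assessment only. The sketch below is hedged: the device path is whatever the live system assigns, the function names are made up, and the 'test result: PASSED' wording is what smartctl prints for ATA drives as far as I know; other drive types may word it differently.

```shell
# Ask the drive for its overall health verdict.
# Usage: drive_health /dev/[drive designation]
drive_health() {
    sudo smartctl -H "$1"
}

# Read smartctl output on stdin; succeed only if it reports PASSED.
health_passed() {
    grep -q 'test result: PASSED'
}
```

A passing verdict does not rule out a failing drive, so the attribute table from `smartctl -a` is still worth a look.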


After a lot of experimentation, I did get the previous kernel to boot
all the way to the GUI.  (I don't know why that didn't work the first
time I tried it.)  So I'm back to a working system.
My hardware is fine.
And that older kernel (5.15.13-200.fc35) IS able to mount /boot/efi.
It's just the newer kernel that can't.

What do I see now?

1/ I see that about 30 of those 60 packages that were originally supposed to be
   installed never were.  Mostly wine stuff.  I installed them manually with dnf.

2/ I think I'd like to uninstall those latest kernel packages. (5.15.14-200.fc35)
   kernel, kernel-core, kernel-devel, kernel-modules, kernel-modules-extra
   and then re-install them.
   I'm not confident yet on what that actual command line would be, so I
   haven't done it yet.

3/ I don't think I'll ever use 'discover' again.
   It seems tedious, it doesn't provide any status feedback on what it's doing,
   and it always seems to want to reboot.
   What was wrong with the old 'new rpm download/install' procedure/utility?
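
For item 2/ above, one possible command line is sketched below. It assumes the machine is booted into the older, working kernel, and that `dnf reinstall` is enough, so no separate remove and install step is needed. The version string and package names are the ones listed in item 2/; the helper name `kernel_specs` is made up. The script only prints the command so it can be reviewed before anything is changed.

```shell
# Kernel build to repair, as named in item 2/.
ver="5.15.14-200.fc35"

# Package names as listed in item 2/.
pkgs="kernel kernel-core kernel-devel kernel-modules kernel-modules-extra"

# Build one "name-version-release" spec per package, separated by spaces.
kernel_specs() {
    for p in $pkgs; do
        printf '%s-%s ' "$p" "$ver"
    done
}

# Dry run: print the command for review instead of executing it.
echo "sudo dnf reinstall $(kernel_specs)"
```

If any of those packages turn out not to be installed at all, dnf should say so, and `dnf install` with the same specs would be the fallback.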