George Avrunin writes:
[..]
> I don't have any idea what's going on and it's very inconvenient (not
> to mention strongly discouraged by the powers that be) to have to keep
> going on campus to restart the machine. So I'd be very grateful for
> suggestions about how to figure this out, or at least stop it from
> happening again.
The capsule summary here is that the system appears to lock up under high
I/O, either disk or network. A dnf upgrade puts a heavy load on both:
network I/O while it goes out and downloads the updates from the repos,
and disk I/O while it installs them. If everything is already downloaded,
it's mostly just disk I/O.
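If you want to see which of the two phases triggers it, you can split
them apart (assuming your dnf is recent enough to have --downloadonly,
which current Fedora releases do):

```shell
# Phase 1: network-heavy - fetch all packages, install nothing.
sudo dnf upgrade --downloadonly

# Phase 2: disk-heavy - packages are now cached locally, so this run
# is mostly package installation I/O.
sudo dnf upgrade
```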
You can test that theory by simulating some load yourself. Something like

    dd if=/dev/urandom of=/tmp/junk$$ bs=1M count=100 &

Kick this off a dozen times or so to write a gig's worth of junk into /tmp
(presuming there's space for it).
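As a loop, that looks something like the sketch below. One caveat: on
current Fedora /tmp is usually tmpfs, i.e. RAM-backed, so for a genuine
disk test point TARGET at a directory that actually lives on the disk
(the default of /var/tmp here is my assumption, adjust to taste):

```shell
#!/bin/sh
# Launch a dozen parallel dd writers to generate disk load.
# TARGET, job count, and file size are all tunable assumptions.
TARGET=${TARGET:-/var/tmp}
JOBS=12
for i in $(seq 1 "$JOBS"); do
    # 12 x 100 MiB = roughly 1.2 GiB of junk, written concurrently
    dd if=/dev/urandom of="$TARGET/junk.$i" bs=1M count=100 status=none &
done
wait                        # block until every writer finishes
ls -lh "$TARGET"/junk.*
```

Remember to delete the junk files afterwards.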
If this locks up the machine, there you go. If not, and you think your dnf
upgrade was downloading stuff, try generating some network load. You'll
need some bandwidth available yourself. You can take the dozen files of
junk, put them in /var/www/html (presuming that apache is running), and
wget them all, in parallel, from this machine on some other box.
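A runnable sketch of the parallel-fetch idea. Here a throwaway Python
web server on localhost stands in for apache on the remote box, purely
so the commands can be tried end to end; the port, file count, and file
sizes are all placeholder assumptions:

```shell
#!/bin/sh
# Serve a few junk files over HTTP and fetch them back in parallel.
# In the real test the server is apache on the locked-up box and the
# fetches run from some other machine; localhost stands in here.
fetch() {   # wget as in the text above, or curl if wget isn't installed
    if command -v wget >/dev/null 2>&1; then wget -q "$1" -O "$2"
    else curl -s "$1" -o "$2"; fi
}
mkdir -p /tmp/www /tmp/dl
for i in 1 2 3; do
    head -c 1048576 /dev/urandom > "/tmp/www/junk.$i"   # 1 MiB each
done
python3 -m http.server 8099 --bind 127.0.0.1 --directory /tmp/www \
    >/dev/null 2>&1 &
SERVER=$!
sleep 1                             # give the server a moment to come up
pids=
for i in 1 2 3; do
    fetch "http://127.0.0.1:8099/junk.$i" "/tmp/dl/junk.$i" &
    pids="$pids $!"
done
wait $pids                          # all parallel fetches finished
kill "$SERVER"
```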
For extra credit you can try generating both disk and network load.
If this turns out to reliably lock up this particular bit of hardware, there
you go. What can you do about it? Very little. It's going to be either
failing hardware (hard drive, power supply, or RAM), or a kernel bug.
Looking up the spec sheet for your box, it looks like both spinning rust
and SSDs were available options. You didn't say which one you have, but if
your drives are spinning rust, that's the most likely point of failure.
Pretty much the only easily accessible clue would be SMART diagnostics on
the hard drive(s). See if there's anything there that tells you that the
hard drive is on its last legs. The next most accessible clue requires
being physically at the machine: a RAM tester. Do Fedora live images still
include a memtest option, does anyone know?
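Pulling the SMART data is a one-liner per drive with smartmontools
(assumption: the smartmontools package is installed; needs root). A
sketch that walks the common device names:

```shell
#!/bin/sh
# Print SMART health and attributes for whatever disks are present.
# Read-only queries, safe to run. Reallocated or pending sector counts
# climbing above zero are the classic "drive is dying" tell.
found=0
for dev in /dev/sd[a-z] /dev/nvme[0-9]n1; do
    [ -e "$dev" ] || continue       # skip unmatched glob patterns
    found=1
    echo "=== $dev ==="
    smartctl -H -A "$dev"           # -H overall health, -A attributes
done
[ "$found" -eq 1 ] || echo "no /dev/sd* or /dev/nvme* devices found"
```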
You could be hitting a kernel bug. In the old days, I would rig up a
crossover cable on my PC's serial port, configure the kernel with a serial
console, and capture kernel oopses on the other machine over the serial
link. RS-232 ports are long gone now, but I have some vague recollection
of serial over USB being an option. Another option worth exploring is
remote syslogging. Maybe the kernel can eke out an extra packet or two to
a remote syslog before crashing.
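One concrete mechanism for that last idea is the kernel's netconsole
module, which fires console messages out as raw UDP packets and can
sometimes get a panic message onto the wire when userspace syslog
daemons can't. A sketch, where the IPs, port numbers, interface name,
and MAC address are all placeholders you'd replace with your own:

```shell
# On the flaky box: send kernel console output to a log host over UDP.
# Format: local-port@local-ip/interface,remote-port@remote-ip/remote-mac
# 192.168.1.10/eth0 = this box, 192.168.1.20 = log host (assumptions).
modprobe netconsole \
    netconsole=6666@192.168.1.10/eth0,514@192.168.1.20/aa:bb:cc:dd:ee:ff

# On the log host, anything listening on that UDP port will do, e.g.:
#   nc -u -l 514
```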
But at least confirming that you can reliably reproduce a lockup by
simulating high disk or network I/O is better than nothing.