I know this is a bit OT, but you guys are great at answering all questions.
I bought a workstation from Titan Computers around January 2020 (dual EPYC CPUs).
After about a year it stopped working. I could still ssh in, but almost any
command returned an Input/output error. Unfortunately journalctl gave the
same error, so I couldn't see the logs. cat /proc/partitions did not show
any NVMe device, even though the OS was installed on the NVMe root device.
I replaced the SSD with a Samsung 980 Pro and reinstalled Fedora. It then
worked for a few weeks, then showed the exact same symptoms.
I replaced the SSD with another Samsung 980 Pro, this time with a heatsink,
and reinstalled Fedora. It worked for a few weeks, then the same symptoms.
Then I replaced it with a 4th SSD (again a Samsung 980 Pro), but this time
instead of using the M.2 socket I used a PCIe-to-M.2 adapter (in case
something was wrong with the M.2 socket). I also added a surge protector
outlet for good measure.
Reinstalled and kept an eye on smartctl: no errors, and the temperature was
always low.
Now it has failed again, with exactly the same symptoms.
Any ideas?
Thanks,
Neal