I've had a workstation (dual AMD Rome) for about 2 years.  The original M.2 SSD died after about 1 year.  I replaced it with a Samsung 980 Pro, which lasted almost 1 more year.  Then I replaced that with a 1TB Samsung 980 Pro, this time with a heat sink.  That one lasted only a few weeks.  I had been watching the NVMe SMART data and saw no problems; the temperature was fine.

Now it's dead again.  I can ssh to the machine.  I can cat /proc/partitions, and no nvme device is shown.  I can only run a couple of commands (I guess whatever is built into bash?); almost everything else just gives an I/O error.  No sudo or journalctl.

The machine is only used for compute, not heavy I/O, so this is not caused by SSD wear (and smartctl showed no wear at all).

Any ideas?

Thanks,
Neal

--
Those who don't understand recursion are doomed to repeat it