Thanks Richard.  Yes, I talked with Titan; they suggested trying the PCIe-to-M.2 adapter.  I will contact them again.
I have not checked for BIOS updates.  I'm not sure how to go about that (the last time I did one, it required an MS-DOS floppy disk).

I haven't tried the SSDs in another device because I don't have one.  But the fact that replacing the SSD gets the machine working again, where it wasn't working before, tells me they were damaged.  I have power-cycled the workstation at least once, and the BIOS did not find any SSD to boot from.  So a power cycle didn't fix it, but replacing the SSD did.

I will try Titan again later today; for now I'm just looking for ideas.

Thanks,
Neal

On Tue, Feb 22, 2022 at 8:44 AM Richard Shaw <hobbes1069@gmail.com> wrote:
On Tue, Feb 22, 2022 at 7:34 AM Neal Becker <ndbecker2@gmail.com> wrote:
I know this is a bit OT, but you guys are great at answering all questions.

I bought a workstation from Titan Computers around 1/2020 (dual EPYC CPUs).  After about 1 year it stopped working.  I could ssh to it, but almost any command would return "Input/output error".  Unfortunately journalctl also gave an input/output error, so I can't see the logs.  cat /proc/partitions did not show any NVMe device (the root device, on which the OS was installed).
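
For anyone who wants to script that /proc/partitions check (for example against output captured over ssh), here is a minimal sketch.  The function name nvme_devices is just illustrative, and it assumes the usual four-column layout of /proc/partitions (major, minor, #blocks, name) and that NVMe devices are named nvme*.

```python
def nvme_devices(partitions_text):
    """Return NVMe block device names found in /proc/partitions text."""
    names = []
    for line in partitions_text.splitlines():
        fields = line.split()
        # Data rows have four columns: major, minor, #blocks, name.
        # The header row is skipped because its first field is not numeric.
        if len(fields) == 4 and fields[0].isdigit() and fields[3].startswith("nvme"):
            names.append(fields[3])
    return names


def local_nvme_devices(path="/proc/partitions"):
    """Convenience wrapper: run the same check on this machine's own file."""
    with open(path) as f:
        return nvme_devices(f.read())
```

An empty list means the kernel no longer sees any NVMe device, which matches the symptom described above.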

I replaced the SSD with a Samsung 980 Pro and reinstalled Fedora.  It then worked for a few weeks, then showed the exact same symptoms.

I replaced the SSD with another Samsung 980 Pro, this time with a heatsink, and reinstalled Fedora.  It worked for a few weeks.  Then the same symptoms.

Then I replaced it with a 4th Samsung 980 Pro, but this time instead of using the M.2 socket I used a PCIe-to-M.2 adapter (in case something was wrong with the M.2 socket).  I also added a surge protector outlet for good measure.  I reinstalled and watched the smartctl output: no errors, and the temperature was always low.

Now it has failed again, with exactly the same symptoms.

Any ideas?

I remember your other email about a month or so ago and thought it was really strange. Have you tried the drives in another system to confirm they're truly dead? 

I would check for BIOS updates just for good measure. Other than that, have you had any communication with Titan about it?
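
As a starting point for the BIOS check, the currently installed version can usually be read from a running Linux system, no floppy required.  This is a minimal sketch, assuming the kernel exposes the DMI files under /sys/class/dmi/id; the function name bios_info is just illustrative.

```python
from pathlib import Path


def bios_info(dmi_dir="/sys/class/dmi/id"):
    """Read BIOS vendor, version and date from sysfs DMI files, if present."""
    info = {}
    for key in ("bios_vendor", "bios_version", "bios_date"):
        try:
            info[key] = (Path(dmi_dir) / key).read_text().strip()
        except OSError:
            # File missing or unreadable on this system.
            info[key] = None
    return info
```

Calling bios_info() and comparing bios_version and bios_date with the newest release listed by the motherboard vendor would show whether an update is available.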

Thanks,
Richard
_______________________________________________
users mailing list -- users@lists.fedoraproject.org


--
Those who don't understand recursion are doomed to repeat it