On Wed, 23 Feb 2022 at 12:20, Neal Becker <ndbecker2@gmail.com> wrote:
OK, status update.

1. The workstation is located in the offices of a large satellite ISP.  Clean power is probably not an issue.

2. Attempt to reboot machine.  BIOS boot options do not show the M.2 SSD at all.

3. Boot F35 from USB live.  Go to install to disk.  SSD is not shown as an option for installation.  (I didn't try lsblk, but I'm sure it would have shown the SSD didn't exist).

4. Install SATA SSD drive.  That was fun, I didn't know how to get to the drive bays and didn't have screws to mount it, so I used double-sided sticky tape.

5. Again BIOS boot options.  Oh look, F35 samsung pro 980 is back!  My best guess is that some part has a thermal issue and while I installed the SATA drive it had cooled down??  That doesn't really explain why the same symptoms occurred both with an SSD plugged into the MB M.2 socket and when I got a pcie-m.2 adapter and plugged it in there, since there would have been 2 different controller chips (I guess).

6. Anyway, I perform the install to the SATA drive.  Everything is fine (for now).

7. The M.2 SSD that didn't exist previously is still plugged in.  I can mount it; everything seems fine.  Run smartctl -a /dev/nvme0.  No errors recorded.  No problems.  100% spare.
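
For anyone who wants to repeat the check in step 7 from a script, here is a minimal sketch.  It assumes smartctl from smartmontools 7.0 or later (for the -j JSON option) and is usually run as root.  The JSON key names are what smartctl normally reports for NVMe drives, but treat them as assumptions and compare against your own output; the function names are made up for illustration.

```python
import json
import subprocess


def read_smart_json(device: str = "/dev/nvme0") -> dict:
    """Run `smartctl -a -j DEVICE` and return the parsed JSON report."""
    # smartctl's exit status is a bit mask, so a non-zero code does not
    # always mean the report is missing; check=True is deliberately not used.
    proc = subprocess.run(
        ["smartctl", "-a", "-j", device],
        capture_output=True,
        text=True,
    )
    return json.loads(proc.stdout)


def summarize_nvme_health(report: dict) -> dict:
    """Pick out the fields behind "no errors recorded, 100% spare"."""
    # Key names are assumed from typical smartctl NVMe output.
    log = report.get("nvme_smart_health_information_log", {})
    return {
        "passed": report.get("smart_status", {}).get("passed"),
        "critical_warning": log.get("critical_warning"),
        "available_spare_pct": log.get("available_spare"),
        "percentage_used": log.get("percentage_used"),
        "media_errors": log.get("media_errors"),
        "error_log_entries": log.get("num_err_log_entries"),
    }
```

Calling summarize_nvme_health(read_smart_json("/dev/nvme0")) should give a small dict to eyeball or log.  A field that comes back as None just means the key was not in the report, not that the drive is bad.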

I'm still baffled.

Some articles mention drives taking a very long time to respond while running internal "grooming" -- I presume moving data off areas with high "wear" to areas of lower wear.
There are also reports that frantically moving the drive around to different external adapters eventually allows the drive to mount.  All could be explained by the drive taking too long to respond while grooming is in progress.   Once grooming is finished, the drive works normally.
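
If the slow-response theory is right, one cheap way to gather evidence is to time how long the device node takes to appear after power-on.  A rough sketch, assuming a Linux system where the drive shows up as something like /dev/nvme0n1 (the path and the function name are placeholders).  It only watches for the node; it cannot say why a drive is late, and a drive the firmware never enumerated may not appear without a rescan or reboot.

```python
import os
import time


def wait_for_device(path: str, timeout_s: float = 120.0, poll_s: float = 1.0):
    """Poll until `path` exists; return seconds waited, or None on timeout."""
    start = time.monotonic()
    while True:
        if os.path.exists(path):
            return time.monotonic() - start
        if time.monotonic() - start >= timeout_s:
            return None
        time.sleep(poll_s)
```

For example, wait_for_device("/dev/nvme0n1", timeout_s=600) run early in boot would return the delay in seconds, or None if the drive never turned up within ten minutes.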
 


On Wed, Feb 23, 2022 at 9:42 AM George N. White III <gnwiii@gmail.com> wrote:
On Wed, 23 Feb 2022 at 06:49, Tim via users <users@lists.fedoraproject.org> wrote:
On Tue, 2022-02-22 at 12:21 -0500, Go Canes wrote:
> If it was just erasing the partition table the drive would still be
> visible using lsblk, and you could re-partition it with fdisk, etc.

I do wonder if the devices were ruined, or just had their data
scrambled?  Neal didn't say whether he'd tried reformatting the failed
ones, I presume he would have, but he just mentioned replacing them.


Various magical incantations do sometimes seem to work:
 


 


> You mentioned a surge suppressor strip - any chance it has already
> suppressed a surge in the past?  If so, it might not be functioning
> as a surge suppressor anymore.

Every day your house receives lots of surges.  Most you'll never
notice.  There's not just the times you notice the lights suddenly
glowing brighter.  There are very fast and large spikes from the
continual switching of loads across the grid, and how the stations
manage them.  Because of this, surge protectors are always working, and
will eventually die without you probably being aware of it.

I used to add surge protection to power bars.  We had a tree fall on the 
cable coax that they had installed without a proper anchor, just a zip tie
to the mast for the AC power.  The mast was pulled off the house, which
meant the neutral line got disconnected first, and lightbulbs went off like 
flash bulbs.  My Victor 9000 PC with the home-made surge protection 
survived, but we lost the doorbell transformer and a radio (and the 
thyristors in the surge protector).

My father-in-law also used home-made surge protectors.  A truck 
hit a high-voltage distribution line, causing the high voltage wires to 
fall onto the lower voltage feed to the house.   His computer survived, 
but some appliances died.
 

It's worth remembering that a surge protector may not protect your
equipment from being wrecked, primarily it should blow a house fuse on
a large surge to prevent some equipment catching fire and burning your
house down.

My power bar had a circuit breaker which did trip.
 

If you have noisy mains causing you a problem you want mains filtering,
possibly a UPS as well (a constantly running one, like a power
conditioner, not a changeover one that supplies raw mains until it
kicks in as a backup supply).

For what it's worth, as my opening paragraph said, the mains
always has spikes on it, all day, every day; that's normal.  Any
equipment that requires external protection has not been built
correctly.  Anything that plugs directly into the mains should be able
to handle what's normally on the mains, for its entire operational
life.

I grew up in an area with frequent power outages.   My parents were
careful to unplug computers when not in use.  We often put an overhand
knot in power cords.  One day I had just turned on an electric kettle when 
lightning hit near the house.  The kettle had a metal base with a hole where 
the cord passed thru and a 2-conductor cord.  The cord burned off at the 
hole -- I assume an induced current took a shortcut from one conductor to 
the other.   Nothing else was damaged.


The exception I make about that rule is when you want to minimise noise
on something that can handle it without damage, but the effect is
noticeably annoying and you want to reduce it.  But again, the
equipment really should have been designed better.  Stereo systems, for
instance, shouldn't crackle along with mains pops.

At work we had a small machine room with a window in the door.  One day 
during a storm I was walking past the room when lightning hit the building.
I saw a bright trail down the corner of the room and across one of the SGI
Octane workstations.  Those systems had a heavy metal chassis under a
plastic cover, but there was a light bar outside the chassis.  One of the 
incandescent bulbs in the light bar burned out, but that was the only damage.   

Better designed equipment costs more.   I generally try to buy gear that
has been on the market a couple years, which gives time for drivers to
make it into the kernel and for design flaws to be noticed.   Vendors often
reduce prices just before introducing new models.  I once scored a 
PowerEdge server with full complement of ECC memory for the price of 
the memory days before a newer model was announced.  

--
George N. White III

_______________________________________________
users mailing list -- users@lists.fedoraproject.org
To unsubscribe send an email to users-leave@lists.fedoraproject.org
Fedora Code of Conduct: https://docs.fedoraproject.org/en-US/project/code-of-conduct/
List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines
List Archives: https://lists.fedoraproject.org/archives/list/users@lists.fedoraproject.org
Do not reply to spam on the list, report it: https://pagure.io/fedora-infrastructure


--
Those who don't understand recursion are doomed to repeat it

--
George N. White III