On Tue, Apr 07, 2009 at 09:35:15AM -0700, Peter J. Stieber wrote:
> This is going to sound vague, but here goes...
>
> I have a dual opteron system that has been acting as the worldly node
> for a small cluster of computers since September, 2004. The machine is
> running the latest x86_64 Fedora 10 kernel that I recently loaded (April
> 2). The machine reboots without warning. I can't find the cause in log
> files (maybe I'm not looking in the correct log).
>
> I'm currently running memtest. If all of the tests pass, could the
> community suggest other diagnostic tasks or information I could post to
> help diagnose the problem?
Have you tried going back to the previous kernel? Did you check dmesg and
/var/log/messages? Does it boot normally and then fail at some random interval, or does
it consistently fail at the same point?
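
If it helps with the log check, here is a rough Python sketch for skimming /var/log/messages for lines that often show up around hardware trouble. The keyword list and the helper name find_suspect_lines are my own illustrative assumptions, not an exhaustive or authoritative set, and a sudden reboot may leave nothing in the log at all.

```python
import re

# Illustrative keywords only (an assumption, not a complete list of
# everything a failing machine might log).
SUSPECT_PATTERNS = [
    r"machine check",
    r"kernel panic",
    r"\boops\b",
    r"out of memory",
    r"i/o error",
    r"thermal",
    r"temperature",
]


def find_suspect_lines(lines, patterns=SUSPECT_PATTERNS):
    """Return (line_number, text) pairs for lines matching any pattern.

    `lines` can be any iterable of strings, such as an open log file.
    Matching is case-insensitive.
    """
    combined = re.compile("|".join(patterns), re.IGNORECASE)
    hits = []
    for number, line in enumerate(lines, start=1):
        if combined.search(line):
            hits.append((number, line.rstrip("\n")))
    return hits
```

To use it on the real log, open the file (reading /var/log/messages usually needs root) and pass the file object in, e.g. `find_suspect_lines(open("/var/log/messages", errors="replace"))`, then look at what was logged just before each unexpected restart.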
Other things to consider: CPU type? Temperature? A potential hard drive issue? Any new
hardware attached or installed recently? Have you noticed any power surges or brownouts?
Are any other nodes having issues?
A recent power surge here zapped a board, a DSL modem, and the surge protector. Come on, Newegg....