
James Harper <James.Harper@bendigoit.com.au> writes:
> Crash is a full lock up, display is frozen yet coherent, no response from kbd or mouse. The caps lock light does not flash nor can it be toggled, and SysRq combos do not work. Network is also unavailable.
I assume "network is also unavailable" means you tried pinging the host and got no response. It's a long shot, but exfiltrating dmesg over netconsole might show you something useful from the kernel's last moments.
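In case it helps, here is a rough sketch of the receiving end, to run on a second machine. netconsole sends kernel messages as plain UDP datagrams (target port 6666 by default), so anything that listens on that port and appends to a file will do. On the crashing box you would load the module with something like `modprobe netconsole netconsole=@/eth0,6666@192.0.2.10/`, where `eth0` and `192.0.2.10` are placeholders for your own interface and log host. The function names and the log file name below are my own invention, not part of any tool.

```python
import socket


def open_listener(port=6666, bind_addr="0.0.0.0"):
    """Return a UDP socket bound to the port netconsole sends to.

    6666 is netconsole's default target port; change it if the sender
    was configured with a different one.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind((bind_addr, port))
    return sock


def drain(sock, out, max_packets=None):
    """Write each received datagram to `out`, flushing after every one.

    Flushing per datagram means the last lines before a lockup are
    already in the file when the sender goes silent. Runs until
    interrupted unless `max_packets` is given. Returns the number of
    datagrams written.
    """
    received = 0
    while max_packets is None or received < max_packets:
        data, _addr = sock.recvfrom(65535)
        # Kernel messages are normally ASCII; replace anything undecodable
        # rather than dying mid-capture.
        out.write(data.decode("utf-8", errors="replace"))
        out.flush()
        received += 1
    return received
```

To capture, something like `drain(open_listener(), open("netconsole.log", "a"))` left running until the box locks up should be enough. Note that UDP is fire-and-forget, so if the crash takes the NIC down first you will see nothing at all.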
> If it was a memory error then you'd expect the occasional hard lockup but more often random segfaults etc, so I'm thinking overheating, CPU, system board, or power supply.
+1; I had problems like this with switched-mode PSUs in a 1RU -- and swapping in a replacement unit didn't help, because the real cause was (probably) crap power input from the mains. Either swapping to a conventional PSU or putting it behind a UPS fixed it -- I can't remember which I did first.