Problem: my desktop system hangs. Question: what to do next

D. Hugh Redelmeier hugh-pmF8o41NoarQT0dZR+AlfA at public.gmane.org
Wed Jul 25 19:54:14 UTC 2012


| From: Lennart Sorensen <lsorense-1wCw9BSqJbv44Nm34jS7GywD8/FfD2ys at public.gmane.org>

| On Wed, Jul 25, 2012 at 12:47:40PM -0400, D. Hugh Redelmeier wrote:

I forgot to mention: many failures are in connections.  Just
unplugging and reconnecting (or even better: permuting) every
connection sometimes fixes things.  Or sometimes it makes things
worse, which is also good if you are trying to debug.

For example: if you have two DIMMs, swap them.

| > | From: Mel Wilson <mwilson-Ja3L+HSX0kI at public.gmane.org>

| > It is easy and cheap to run memtest86+ overnight and see what it sees.
| > Memory is the problem more often than one would think.  If it isn't the
| > problem, it is nice to know that too.  Memtest86+ seems to be a boot
| > option on my Ubuntu 10.04 system, so you don't have to look far to
| > find it.
| 
| Would be nice to know, but it can't tell you that.  memtest can only
| tell you that your RAM is bad.  It cannot tell you that it isn't bad
| since you might just not be testing in the manner that shows the failure.

You are right.  Every debugging technique can show the presence of
bugs but not their absence.  But I've had a pretty good experience
with memtest86+, run for many hours.  Rarely, but sometimes, bugs have
shown up hours in (if I recall correctly).
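
To put rough numbers on "run for many hours": here is a small sketch,
assuming (my assumption, not a measurement) that an intermittent fault
shows up independently with the same small probability on each pass.
The function names and the 1% figure are made up for illustration.

```python
import math


def chance_of_clean_run(p_fail_per_pass: float, passes: int) -> float:
    """Probability that a faulty system still completes `passes` clean passes.

    Assumes each pass independently exposes the fault with probability
    p_fail_per_pass (an illustrative model, not how real DIMMs must behave).
    """
    return (1.0 - p_fail_per_pass) ** passes


def passes_needed(p_fail_per_pass: float, confidence: float) -> int:
    """Smallest number of passes that exposes the fault with this confidence."""
    return math.ceil(
        math.log(1.0 - confidence) / math.log(1.0 - p_fail_per_pass)
    )


# Hypothetical fault that a single pass catches only 1% of the time.
for n in (1, 10, 100, 300):
    print(n, "passes: chance of a misleading clean run =",
          round(chance_of_clean_run(0.01, n), 3))
```

Under that model a clean run only ever lowers the odds of bad memory;
it never drives them to zero, which is Lennart's point.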

| > It is *really* handy to have spares to swap in and out to see if it
| > makes a difference
| 
| Certainly the most effective method in general.

Do remember that a swap also exercises the connectors.  So if a swap
changes behaviour, it does not prove that the component is the
problem.

| I have had memory sticks fail and I have had power supplies fail
| (although in my situation they usually fail completely, but I have seen
| cases where they were just not working right).

I have had both subtle memory failure and subtle power supply failure.

| > Can you do the problematic build, but switch (before the lockup) to
| > the old-fashioned text console?  There is a chance that kernel oopses
| > or panics might show up there (but be hidden by a normal X server).
| > There are also SysRq key combinations that do kernel diagnostic
| > things, best done from a text console.  See, for example,
| > <http://www.debian-administration.org/article/457/The_magic_sysreq_options_introduced>
| 
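
Before counting on the SysRq keys, it is worth checking that the
kernel has them enabled.  On Linux the setting lives in
/proc/sys/kernel/sysrq: 0 means disabled, 1 means everything is
enabled, and larger values are a bitmask.  A minimal sketch that
decodes it (the function names are mine; the bit meanings are as I
understand them from the kernel's sysrq documentation):

```python
from pathlib import Path

# Bit meanings for values greater than 1, per the kernel sysrq docs.
SYSRQ_BITS = {
    2: "control of console logging level",
    4: "control of keyboard (SAK, unraw)",
    8: "debugging dumps of processes etc.",
    16: "sync command",
    32: "remount read-only",
    64: "signalling of processes (term, kill, oom-kill)",
    128: "reboot/poweroff",
    256: "nicing of all RT tasks",
}


def describe_sysrq(value: int) -> list[str]:
    """Turn the numeric sysrq setting into a list of enabled functions."""
    if value == 0:
        return ["SysRq disabled"]
    if value == 1:
        return ["all SysRq functions enabled"]
    return [name for bit, name in SYSRQ_BITS.items() if value & bit]


def read_sysrq(path: str = "/proc/sys/kernel/sysrq") -> int:
    """Read the current setting (Linux only; the path is the standard one)."""
    return int(Path(path).read_text().strip())
```

For example, describe_sysrq(read_sysrq()) on the problem machine tells
you whether the debugging dumps (bit 8) are available before the
lockup happens.
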
| Serial consoles are handy so you can capture on another machine anything
| the kernel prints.

Serial console is useful but:

- it slows things down because the kernel blocks (at least some
  things) while each message is being transmitted.

- it only works on conventional serial ports (a dying capability);
  expansion boards and USB-to-serial dongles don't count.  A P4 system
  is old enough to have a traditional serial port.

- it probably isn't any better than an old-fashioned text console if
  the problem is a freeze.  In a freeze, any messages are likely to
  stay on the screen.  A serial console is better if subsequent events
  (e.g. a spontaneous reboot) would wipe the display.

- a serial console is a bit arcane to set up and requires another
  device (e.g. an ASR-33 :-)) to be at the other end of the serial
  cable.
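
If the device at the other end is another Linux box rather than a
teletype, the capture side can be sketched roughly as follows.  This
assumes the crashing machine was booted with something like
console=tty0 console=ttyS0,115200n8 on the kernel command line, and
that the capturing machine's port is /dev/ttyS0; both are assumptions
to adjust.  The function names are mine.  Only the standard library is
used.

```python
import os
import time
from typing import BinaryIO, Callable, Iterable, Iterator


def open_serial(device: str = "/dev/ttyS0") -> BinaryIO:
    """Open a serial port raw at 115200 8N1 (device and speed are assumptions)."""
    import termios  # Unix-only module, so imported where it is needed

    # Open non-blocking so a missing carrier signal cannot hang the open.
    fd = os.open(device, os.O_RDONLY | os.O_NOCTTY | os.O_NONBLOCK)
    attrs = termios.tcgetattr(fd)
    attrs[0] = 0                                             # iflag: no translation
    attrs[1] = 0                                             # oflag: no translation
    attrs[2] = termios.CS8 | termios.CREAD | termios.CLOCAL  # cflag: 8N1
    attrs[3] = 0                                             # lflag: raw mode
    attrs[4] = termios.B115200                               # input speed
    attrs[5] = termios.B115200                               # output speed
    attrs[6][termios.VMIN] = 1                               # wait for >= 1 byte
    attrs[6][termios.VTIME] = 0
    termios.tcsetattr(fd, termios.TCSANOW, attrs)
    os.set_blocking(fd, True)                                # block on reads
    return os.fdopen(fd, "rb", buffering=0)


def stamp_lines(stream: Iterable[bytes],
                clock: Callable[[], float] = time.time) -> Iterator[str]:
    """Yield each received line with a timestamp in front of it."""
    for raw in stream:
        text = raw.decode("ascii", "replace").rstrip()
        yield f"{clock():.3f} {text}"


def capture(device: str = "/dev/ttyS0", log_path: str = "console.log") -> None:
    """Append everything the other machine prints to a log file."""
    with open_serial(device) as port, open(log_path, "a") as log:
        for line in stamp_lines(port):
            log.write(line + "\n")
            log.flush()  # keep the log current in case this end dies too
```

Calling capture() and leaving it running gives a timestamped record
that survives a reboot of the machine under test.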

| I am upgrading my mythtv box tonight since the hardware in it has been
| crashing way too often.  I suspect a RAM/CPU/motherboard problem.  So it
| is going from a Q6600+Asus P5K(-R)+4GB to an i7-3860+X79 Sabertooth+32GB.
| I hope it stops crashing (and gets a wee bit faster too).  My desktop
| can receive the old parts to replace the athlon 2800.  It is allowed to
| be a bit unstable if I can't figure out the cause.

Gosh, you do use serious hardware for Myth.  Mine's the reverse: my
desktop is a Q6600 and I have a Myth backend on an Athlon 1700.

All other things being equal, I would like a Myth box that takes less
energy: it is on all the time.  Besides, fan noise intrudes on the TV
experience.

Having said that, the Athlon box's power supply went south, I replaced
it (as I mentioned on this list), and now have a much noisier system.

Hey: the Athlon 2800 could be handed off to Mel :-)
(I'm recollecting that you don't hoard useless PCs the way I do.)

| Doing swapping of DDR2 ram, socket 775 cpu or a mainboard isn't really
| an option since I don't have any others of any of those.

Since you surely have multiple DIMMs, you can at least permute them.
Or even run for a while with a half-complement.
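
It helps to write the arrangements down before starting so that none
gets skipped.  A minimal sketch that lists them; the DIMM and slot
labels are whatever you mark on the parts, and the last helper assumes
a single bad DIMM, which may not be true.

```python
from itertools import combinations, permutations


def permutation_plan(dimms, slots):
    """Every way of placing all DIMMs, one per slot.

    Assumes there are exactly as many DIMMs as slots.
    """
    return [dict(zip(slots, order)) for order in permutations(dimms)]


def half_complement_plan(dimms):
    """Every half-sized subset of the DIMMs to try running with."""
    return [set(subset) for subset in combinations(dimms, len(dimms) // 2)]


def common_to_all_crashes(crashing_sets):
    """DIMMs present in every configuration that crashed.

    Only meaningful if there is a single bad DIMM; a connector fault
    would show up here just the same.
    """
    return set.intersection(*crashing_sets) if crashing_sets else set()


for layout in permutation_plan(["A", "B"], ["slot0", "slot1"]):
    print(layout)
```

Remember that every reseating also exercises the connectors, so a DIMM
that shows up in common_to_all_crashes() is a suspect, not a
conviction.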

| These things are complicated to figure out after all.

Yeah.
--
The Toronto Linux Users Group.      Meetings: http://gtalug.org/
TLUG requests: Linux topics, No HTML, wrap text below 80 columns
How to UNSUBSCRIBE: http://gtalug.org/wiki/Mailing_lists




