Kernel Panic panic

Giles Orr gilesorr-Re5JQEeQqe8AvxtiuMwx3w at public.gmane.org
Wed Nov 23 16:33:33 UTC 2011


On 8 November 2011 12:04, Lennart Sorensen <lsorense-1wCw9BSqJbv44Nm34jS7GywD8/FfD2ys at public.gmane.org> wrote:
> On Tue, Nov 08, 2011 at 11:47:41AM -0500, Peter King wrote:
>> After upgrading to 3.1.0, one of my computers began randomly crashing, usually
>> after a day or two. The first time it happened I jotted down the start of the
>> message:
>>
>>   Uhhuh. NMI received for unknown reason 2d on CPU0.
>>   Do you have a strange powersaving mode enabled?
>>   Dazed and confused, but trying to continue
>>   Process swapper...
>>   <snip>
>>   Kernel Panic -not syncing: softlockup: hung tasks
>>   PID:0, comm: swapper Not tainted 3.1.0-gentoo #1
>>
>> The other crashes have uniformly mentioned swapper and softlockup, but not NMI.
>> My understanding, after googling around, is that this error message is not uncommon
>> and not informative -- it says that swapper (pid 0) has quit doing its job, which,
>> since it runs everything, doesn't say much about where the particular problem
>> could lie. The other crashes also identify swapper as the hung process.
>>
>> I reverted to 3.0.6, to see whether it was a kernel problem, but now the computer
>> has crashed again. Perhaps a problem with the 3.x kernels?
>>
>> Next time I get physically near it I'll run an extended version of memtest, to see
>> whether this is caused by flaky memory. A few months ago this computer was too
>> touchy to accept either new gigabyte ethernet cards or new RAM. Perhaps the hardware
>> is failing.
>>
>> Any ideas for further diagnostics? The log files are completely uninformative. The
>> filesystems are not even close to being full, and, up until recently, it seemed to
>> be running reliably. Thanks for any suggestions.
>
> It seems to be an issue that has hit various people with 3.1, 3.0, 2.6.39,
> etc, back to around 2.6.37.  Something to do with an NMI watchdog rewrite
> and some bad interaction with suspend/powermanagement.
>
> Seems it is still being worked on, since no one quite has figured out what happens.

Sorry to revive an old thread, but I thought my experience might be
relevant to someone else.

After this comment from Lennart and others in this thread, I switched
from kernel 3.0.0 to kernel 2.6.32 and now suspend works just fine.  I
had guessed it was a kernel problem prior to this and changed kernels
... but I didn't go back far enough (I used 2.6.38).  Essentially,
suspend-to-ram would work 10 to 15 times.  Then on the next suspend,
it would fail to wake up - or worse, crash horribly.  By "horribly" I
mean that the machine would immediately attempt to wake itself and then
sleep again, powering on and off every four seconds until I physically
pulled the plug (neither a long press of the power button nor the reset
button worked).  So yes, I was looking for a fix.

What's frustrating about this is that the only way I eventually
found out about this bug was by reading the GTALUG list.  It seems
like a major bug that must have caused problems for a LOT of users,
and yet it seems relatively unknown despite its longevity.

At least I recently saw mention that a fix may finally have been found?

I hope this is helpful to someone else ...

-- 
Giles
http://www.gilesorr.com/
gilesorr-Re5JQEeQqe8AvxtiuMwx3w at public.gmane.org
--
The Toronto Linux Users Group.      Meetings: http://gtalug.org/
TLUG requests: Linux topics, No HTML, wrap text below 80 columns
How to UNSUBSCRIBE: http://gtalug.org/wiki/Mailing_lists




