Where's the culprit?

Peter King peter.king.1-Re5JQEeQqe8AvxtiuMwx3w at public.gmane.org
Mon Sep 5 02:19:49 UTC 2011


On Sun, Sep 04, 2011 at 09:57:07PM -0400, Lennart Sorensen wrote:

> > Sep  4 15:44:00 theseus kernel: Pid: 16344, comm: sed Not tainted 3.0.3-gentoo #6 System Manufacturer System Name/A7V8X-X
> > Sep  4 15:44:00 theseus kernel: EIP: 0060:[<c10920fb>] EFLAGS: 00210286 CPU: 0
> > Sep  4 15:44:00 theseus kernel: EIP is at __destroy_inode+0x29/0x62
> > Sep  4 15:44:00 theseus kernel: EAX: ffffdfff EBX: db04bb88 ECX: 00000003 EDX: ffffdffe
> > Sep  4 15:44:00 theseus kernel: ESI: db04bb88 EDI: 00000000 EBP: db04bb88 ESP: ef6e7f44
> > Sep  4 15:44:00 theseus kernel: DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
> > Sep  4 15:44:00 theseus kernel: Process sed (pid: 16344, ti=ef6e6000 task=f5465be0 task.ti=ef6e6000)
> > Sep  4 15:44:00 theseus kernel: Stack:
> > Sep  4 15:44:00 theseus kernel: db04bb88 c109214a db3dfb14 c108f389 db3dfb14 00000000 db04bb88 c10911a2
> > Sep  4 15:44:00 theseus kernel: f4e4e6c0 00000010 db04bb88 c10831e8 00000020 f4e4e6c8 db3dfb14 f5416620
> > Sep  4 15:44:00 theseus kernel: f4e4e6c0 00000000 f542d660 ef6e6000 c108091d f542d660 00000000 00000000
> > Sep  4 15:44:00 theseus kernel: Call Trace:
> > Sep  4 15:44:00 theseus kernel: [<c109214a>] ? destroy_inode+0x16/0x37
> > Sep  4 15:44:00 theseus kernel: [<c108f389>] ? d_kill+0x93/0xa5
> > Sep  4 15:44:00 theseus kernel: [<c10911a2>] ? dput+0xf7/0x100
> > Sep  4 15:44:00 theseus kernel: [<c10831e8>] ? fput+0x198/0x1b0
> > Sep  4 15:44:00 theseus kernel: [<c108091d>] ? filp_close+0x54/0x5a
> > Sep  4 15:44:00 theseus kernel: [<c108097b>] ? sys_close+0x58/0x85
> > Sep  4 15:44:00 theseus kernel: [<c13bb490>] ? sysenter_do_call+0x12/0x26
> 
> So this was the interesting bit.  Something called close on a file,
> which then caused an inode to be removed, and then the kernel blew up.
> 
> Would have been nice if it had said what pid 16344 was.  I thought it
> usually did.  Maybe the kernel is missing some config for that.

I take it, then, that it isn't sed (which is mentioned on the top line). It may well be that some
debugging configuration is absent

> > Sep  4 17:42:35 theseus kernel: BUG: unable to handle kernel NULL pointer dereference at   (null)
> > Sep  4 17:42:35 theseus kernel: IP: [<f564c0c0>] 0xf564c0bf
> > Sep  4 17:42:35 theseus kernel: *pde = 00000000 
> > Sep  4 17:42:35 theseus kernel: Oops: 0002 [#2] SMP 
> > Sep  4 17:42:35 theseus kernel: Modules linked in: e1000
> > Sep  4 17:42:35 theseus kernel: 
> > Sep  4 17:42:35 theseus kernel: Pid: 17180, comm: emerge Tainted: G      D     3.0.3-gentoo #6 System Manufacturer System Name/A7V8X-X
> > Sep  4 17:42:35 theseus kernel: EIP: 0060:[<f564c0c0>] EFLAGS: 00210256 CPU: 0
> > Sep  4 17:42:35 theseus kernel: EIP is at 0xf564c0c0
> > Sep  4 17:42:35 theseus kernel: EAX: 00000000 EBX: 0000013e ECX: f564c390 EDX: 00200286
> 
> This one strangely doesn't provide any useful info.

Again, I guess it isn't emerge, although I can attest that emerge was running at the
time.

> > ...and then silence. The machine has crashed and required a reboot at this
> > point. Looks to me like there is a problem with a "kernel paging request" in
> > SMP #1, leading to the first oops at 15:44. Then, amazingly, it chugs along
> > more or less well until 17:42, when there is a "kernel NULL pointer dereference"
> > in SMP #2, which brings the whole thing down.
> > 
> > I can't get a sense from that whether it's the modules linked in (namely e1000),
> > or syslog-ng, or something else. Anyone?
> 
> SMP simply means symetric multi procesing, which pretty much all systems
> are these days.

I know about SMP, but I don't know why these are referenced to apparently different
sources, since the processor is single-core. Perhaps because it's running make -j2.

Well, now I'm less certain than ever where the problem lies. There were a few notes
on kernel oopses and e1000, but mostly back in 2.4 days (although one in 2.6.26 IIRC).
Puzzling.

-- 
Peter King			 	peter.king-H217xnMUJC0sA/PxXw9srA at public.gmane.org
Department of Philosophy
170 St. George Street #521
The University of Toronto		    (416)-978-4951 ofc
Toronto, ON  M5R 2M8
       CANADA

http://individual.utoronto.ca/pking/

=========================================================================
GPG keyID 0x7587EC42 (2B14 A355 46BC 2A16 D0BC  36F5 1FE6 D32A 7587 EC42)
gpg --keyserver pgp.mit.edu --recv-keys 7587EC42
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 198 bytes
Desc: not available
URL: <http://gtalug.org/pipermail/legacy/attachments/20110904/68e71006/attachment.sig>


More information about the Legacy mailing list