Segfault Suggestions?

Lennart Sorensen lsorense-1wCw9BSqJbv44Nm34jS7GywD8/FfD2ys at public.gmane.org
Tue Mar 2 15:16:45 UTC 2004


On Tue, Mar 02, 2004 at 09:44:41AM -0500, Peter King wrote:
> On Mon, Mar 01, 2004 at 11:22:38AM -0500, JoeHill wrote:
> 
> > Another test you could try running is 'mprime'
> > 
> > ftp://lettuce.edsc.ulst.ac.uk/mirrors/www.mersenne.org/gimps/mprime2212.tar.gz
> 
> UPDATE. I did download and run mprime -- a program to calculate
> and check for Mersenne primes -- and ran its "torture test," which
> is designed to push memory and hardware to its limits. I set it to
> run overnight on the problem Dell. Lo and behold, this morning I
> found the Dell crashed with its keyboard lights flashing, and the 
> following cryptic message:
> 
> Test 57, 1000 Lucas-Lehmer iterations of M225281 using 12k FFT length
>   Unable to handle kernel paging request at virtual address fffffffe
>   printing eip:
>    c82c986c
>  *pde=00002063
>  *pte=00000000
>  Oops: 0002
> <snip>
>  kernel panic: Aieee, killing interrupt handler!
> 
> Well, this is all very exciting. It looks like the Dell went down due
> to a kernel paging request -- which means, I think, that the kernel
> asked for some memory that it expected to have, and the Dell did not
> provide it. That provides good reasons for suspecting the memory rather
> than the CPU.
> 
> Is that the right way to read the information? If not, what does it mean?

Not necessarily.  It means that the RAM, the motherboard or the
CPU did something wrong for that request.  If you can run it multiple
times and get the error to occur at the SAME address (or very close to
it), that would start to indicate that a certain location in RAM has
gone bad.  Or, if it was a swap failure, maybe the disk has a bad
sector, but I don't think that error came from failed swap.
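
For comparing the faulting address across several runs, something like
the following might help. It is only a rough sketch: the function names,
the regular expression and the 4096-byte window are my own assumptions,
not part of any existing tool, and it assumes each oops was copied into
a text file by hand. Keep in mind that the address in the oops is a
virtual address, so a repeat is suggestive rather than proof of one bad
RAM location.

```python
import re

# Matches the fault-address line as it appears in the oops quoted above.
FAULT_RE = re.compile(r"virtual address ([0-9a-fA-F]{8})")

def fault_addresses(log_text):
    """Return every faulting virtual address found in log_text as an int."""
    return [int(m.group(1), 16) for m in FAULT_RE.finditer(log_text)]

def clustered(addresses, window=4096):
    """True if there are at least two addresses and all lie within `window` bytes.

    The 4096-byte default (one i386 page) is an arbitrary choice for "very close".
    """
    return len(addresses) > 1 and max(addresses) - min(addresses) <= window

sample = "Unable to handle kernel paging request at virtual address fffffffe\n"
addrs = fault_addresses(sample)
print([hex(a) for a in addrs], clustered(addrs))
```

With a single crash there is nothing to compare yet, so clustered()
returns False until a second oops has been collected.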

I think what you can tell for sure is that you have some flaky
hardware.

I must admit I find the virtual address fffffffe very suspicious.  That
is one byte below the top of the potential address space on a 32-bit
machine.  It almost looks like the address itself is wrong (which could
mean something mangled a pointer stored elsewhere).
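
To make the arithmetic concrete, here is a small sketch that looks at
the address and at the "Oops: 0002" code from the quoted message. The
helper name decode_oops_code is made up for illustration; the bit
meanings are the usual i386 page-fault error code bits (bit 0: page
present, bit 1: write, bit 2: user mode). Read as a signed 32-bit
value, fffffffe is -2, which would fit a small negative number ending
up where a pointer was expected, though that is only one possible
explanation.

```python
import struct

addr = 0xFFFFFFFE

# Distance from the last byte of a 32-bit address space (0xffffffff).
bytes_below_top = 0xFFFFFFFF - addr

# Reinterpret the same 32 bits as a signed integer.
signed = struct.unpack("<i", struct.pack("<I", addr))[0]

def decode_oops_code(code):
    """Decode the low three bits of an i386 page-fault error code."""
    return {
        "page_present": bool(code & 1),  # False: page not present
        "write": bool(code & 2),         # True: fault happened on a write
        "user_mode": bool(code & 4),     # False: fault happened in kernel mode
    }

print(bytes_below_top, signed, decode_oops_code(0x0002))
```

For code 0002 this decodes to a kernel-mode write to a page that was
not present, which agrees with the *pte=00000000 line in the oops.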

Lennart Sorensen
--
The Toronto Linux Users Group.      Meetings: http://tlug.ss.org
TLUG requests: Linux topics, No HTML, wrap text below 80 columns
How to UNSUBSCRIBE: http://tlug.ss.org/subscribe.shtml




