error on /var/log/messages --> oom problem?

David Thornton david-FkEgs2FKm2NvBvnq28/GKQ at public.gmane.org
Fri May 6 13:50:46 UTC 2005


I highly recommend setting up monitoring tools to examine the memory 
usage over time.

http://www.quadratic.net/~david/gnuplot/#Sar_Analysis
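
If sar isn't installed, a few lines of Python can collect similar numbers for graphing later. This is only a rough sketch, not a replacement for sar: the function names, the CSV layout and the choice of fields are my own assumptions, and it relies on the "Name: value kB" lines that /proc/meminfo provides.

```python
import time

def parse_meminfo(text):
    """Return {field: kB} for the 'Name:   123 kB' lines of /proc/meminfo."""
    fields = {}
    for line in text.splitlines():
        parts = line.split()
        # Keep only lines shaped like "MemFree:   1234 kB"; skip anything else
        # (older kernels also print summary lines in a different layout).
        if (len(parts) == 3 and parts[0].endswith(":")
                and parts[1].isdigit() and parts[2] == "kB"):
            fields[parts[0][:-1]] = int(parts[1])
    return fields

def format_sample(fields, timestamp,
                  names=("MemFree", "Buffers", "Cached", "SwapFree")):
    """One CSV row: timestamp, then the chosen fields (blank if missing)."""
    row = [str(int(timestamp))] + [str(fields.get(n, "")) for n in names]
    return ",".join(row)

def log_samples(out_path, interval_s=60, count=10,
                meminfo_path="/proc/meminfo"):
    """Append `count` samples to out_path, one every interval_s seconds."""
    with open(out_path, "a") as out:
        for i in range(count):
            with open(meminfo_path) as f:
                fields = parse_meminfo(f.read())
            out.write(format_sample(fields, time.time()) + "\n")
            out.flush()
            if i < count - 1:
                time.sleep(interval_s)
```

Calling log_samples("mem.csv", interval_s=60, count=1440) would cover a day at one sample per minute, and the resulting file can be fed to gnuplot.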

As far as I can tell, Linux caches disk reads until all memory is used 
(at least with this version), which leaves precious little headroom for 
new allocations.
I have yet to understand all of this, despite reading up a bit:

http://www.csn.ul.ie/~mel/projects/vm/guide/pdf/understand.pdf
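
One way to check how much headroom there really is: add the buffer and cache figures back to the free figure, since most of the page cache can normally be dropped when applications need the memory. The sketch below does that from the text of /proc/meminfo. Treat the result as a rough upper bound only (dirty or mapped pages can't always be released right away), and note that the function name is made up for this example.

```python
import re

def estimate_headroom_kb(meminfo_text):
    """Return (free, reclaimable, free + reclaimable) in kB.

    'reclaimable' is simply Buffers + Cached, which is an optimistic
    assumption: not every cached page can be freed immediately.
    """
    values = {
        name: int(kb)
        for name, kb in re.findall(r"^(\w+):[ \t]+(\d+) kB$",
                                   meminfo_text, re.M)
    }
    free = values.get("MemFree", 0)
    reclaimable = values.get("Buffers", 0) + values.get("Cached", 0)
    return free, reclaimable, free + reclaimable

if __name__ == "__main__":
    with open("/proc/meminfo") as f:
        free, reclaimable, total = estimate_headroom_kb(f.read())
    print("free: %d kB, buffers+cache: %d kB, rough headroom: %d kB"
          % (free, reclaimable, total))
```

If the rough headroom is also close to zero and swap is full, the box really is out of memory rather than just busy caching.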

I'm a big ITIL freak, so a reboot strikes me as "incident management" 
and really fixing the cause is "problem management".
A reboot is a good short-term fix (incident resolution), but we should be 
working toward the long-term fix (problem resolution).

In our case we had 5 web servers running Tomcat. We moved to Resin and 
the problem went away.

Java apps are notorious (in my experience) for being poorly built. App 
developers seem to think that, because of all the fancy memory management 
that Java does, they can build apps that use memory like it's an unlimited 
resource. I'm not a software dev guy, so I'm relegated to poring through 
code I don't "get" and suggesting that maybe an SQL lookup would be better 
than a Java collection.

sigh.

david




Jerome Macaranas wrote:

>hmm.. I haven't tried rebooting the box.. but I'm thinking of a long-term 
>solution without rebooting.. would adding additional memory modules work? 
>
>On Friday 06 May 2005 11:45, billt-lxSQFCZeNF4 at public.gmane.org wrote:
>  
>
>>This looks like you ran out of memory + swap to me.
>>
>>Did a reboot fix anything?
>>
>>Bill
>>
>>On Fri, May 06, 2005 at 11:27:41AM +0800, JM wrote:
>>    
>>
>>>I got this data from our server, which is killing almost all of our
>>>services.. can someone help me here... do I need to upgrade or anything..
>>>
>>>I'm using Red Hat 9 with 2.4.27 SMP
>>>
>>>tia,
>>>
>>>May  5 18:10:55 earth kernel: __alloc_pages: 0-order allocation failed
>>>(gfp=0x1d2/0) May  5 18:10:55 earth kernel: VM: killing process sshd
>>>May  5 18:10:55 earth kernel: __alloc_pages: 0-order allocation failed
>>>(gfp=0x1d2/0) May  5 18:10:55 earth kernel: VM: killing process modprobe
>>>May  5 18:10:55 earth kernel: __alloc_pages: 0-order allocation failed
>>>(gfp=0x1d2/0) May  5 18:10:55 earth kernel: VM: killing process smsbox
>>>May  5 18:11:42 earth kernel: __alloc_pages: 0-order allocation failed
>>>(gfp=0x1d2/0) May  5 18:11:42 earth kernel: VM: killing process smsbox
>>>May  5 18:11:48 earth kernel: __alloc_pages: 0-order allocation failed
>>>(gfp=0x1d2/0) May  5 18:11:48 earth kernel: VM: killing process
>>>iwrite_globe
>>>May  5 18:11:49 earth kernel: __alloc_pages: 0-order allocation failed
>>>(gfp=0x1d2/0) May  5 18:11:49 earth kernel: __alloc_pages: 0-order
>>>allocation failed (gfp=0x1d2/0) May  5 18:11:49 earth kernel: VM: killing
>>>process java
>>>May  5 18:11:49 earth kernel: __alloc_pages: 0-order allocation failed
>>>(gfp=0x1d2/0) May  5 18:11:49 earth kernel: VM: killing process java
>>>May  5 18:13:09 earth kernel: __alloc_pages: 0-order allocation failed
>>>(gfp=0x1d2/0) May  5 18:13:09 earth kernel: VM: killing process java
>>>May  5 18:13:36 earth kernel: __alloc_pages: 0-order allocation failed
>>>(gfp=0x1d2/0) May  5 18:14:28 earth kernel: __alloc_pages: 0-order
>>>allocation failed (gfp=0x1d2/0) May  5 18:14:28 earth kernel: VM: killing
>>>process croncallboykann May  5 18:14:29 earth kernel: __alloc_pages:
>>>0-order allocation failed (gfp=0x1d2/0) May  5 18:14:29 earth kernel: VM:
>>>killing process gsmsbox
>>>May  5 18:14:29 earth kernel: __alloc_pages: 0-order allocation failed
>>>(gfp=0x1d2/0) May  5 18:14:29 earth kernel: VM: killing process modprobe
>>>May  5 18:14:29 earth kernel: __alloc_pages: 0-order allocation failed
>>>(gfp=0x1d2/0) May  5 18:14:29 earth kernel: VM: killing process
>>>mms_forwarder
>>>May  5 18:14:29 earth kernel: __alloc_pages: 0-order allocation failed
>>>(gfp=0x1d2/0) May  5 18:14:29 earth kernel: VM: killing process java
>>>May  5 18:14:30 earth kernel: __alloc_pages: 0-order allocation failed
>>>(gfp=0x1d2/0) May  5 18:15:29 earth kernel: __alloc_pages: 0-order
>>>allocation failed (gfp=0x1d2/0) May  5 18:15:29 earth kernel: VM: killing
>>>process modprobe
>>>May  5 18:15:32 earth kernel: __alloc_pages: 0-order allocation failed
>>>(gfp=0x1d2/0) May  5 18:15:32 earth kernel: VM: killing process smsbox
>>>May  5 18:15:34 earth kernel: __alloc_pages: 0-order allocation failed
>>>(gfp=0x1d2/0) May  5 18:15:34 earth kernel: VM: killing process modprobe
>>>May  5 18:15:36 earth kernel: __alloc_pages: 0-order allocation failed
>>>(gfp=0x1d2/0) May  5 18:15:36 earth kernel: VM: killing process sh
>>>May  5 18:15:38 earth kernel: __alloc_pages: 0-order allocation failed
>>>(gfp=0x1d2/0) May  5 18:15:39 earth kernel: __alloc_pages: 0-order
>>>allocation failed (gfp=0x1d2/0) May  5 18:15:39 earth kernel: VM: killing
>>>process sendmail
>>>May  5 18:15:40 earth kernel: __alloc_pages: 0-order allocation failed
>>>(gfp=0x1d2/0) May  5 18:15:40 earth kernel: VM: killing process crond
>>>May  5 18:15:40 earth kernel: __alloc_pages: 0-order allocation failed
>>>(gfp=0x1d2/0) May  5 18:15:40 earth kernel: VM: killing process
>>>iwrite_globe
>>>May  5 18:15:40 earth kernel: __alloc_pages: 0-order allocation failed
>>>(gfp=0x1d2/0) May  5 18:16:12 earth kernel: __alloc_pages: 0-order
>>>allocation failed (gfp=0x1d2/0) May  5 18:16:12 earth kernel: VM: killing
>>>process croncallboykann May  5 18:16:12 earth kernel: __alloc_pages:
>>>0-order allocation failed (gfp=0x1d2/0) May  5 18:16:16 earth last
>>>message repeated 2 times
>>>May  5 18:16:16 earth kernel: VM: killing process forwarder
>>>May  5 18:16:17 earth kernel: __alloc_pages: 0-order allocation failed
>>>(gfp=0x1d2/0) May  5 18:16:17 earth kernel: VM: killing process smsbox
>>>May  5 18:16:17 earth kernel: __alloc_pages: 0-order allocation failed
>>>(gfp=0x1d2/0) May  5 18:16:17 earth kernel: VM: killing process java
>>>May  5 18:16:18 earth kernel: __alloc_pages: 0-order allocation failed
>>>(gfp=0x1d2/0) May  5 18:16:18 earth kernel: VM: killing process curl
>>>May  5 18:16:19 earth kernel: __alloc_pages: 0-order allocation failed
>>>(gfp=0x1d2/0) May  5 18:17:22 earth kernel: __alloc_pages: 0-order
>>>allocation failed (gfp=0x1d2/0) May  5 18:17:22 earth kernel: kmod:
>>>failed to exec /sbin/modprobe -s -k net-pf-10, errno = 12 May  5 18:17:22
>>>earth kernel: __alloc_pages: 0-order allocation failed (gfp=0x1d2/0) May 
>>>5 18:17:22 earth kernel: VM: killing process java
>>>May  5 18:17:22 earth kernel: __alloc_pages: 0-order allocation failed
>>>(gfp=0x1d2/0) May  5 18:17:22 earth kernel: __alloc_pages: 0-order
>>>allocation failed (gfp=0x1d2/0) May  5 18:17:22 earth kernel: VM: killing
>>>process smsbox
>>>May  5 18:17:22 earth kernel: __alloc_pages: 0-order allocation failed
>>>(gfp=0x1d2/0) May  5 18:17:22 earth kernel: VM: killing process emacs
>>>May  5 18:18:01 earth kernel: __alloc_pages: 0-order allocation failed
>>>(gfp=0x1d2/0) May  5 18:18:01 earth kernel: VM: killing process gsmsbox
>>>May  5 18:18:01 earth kernel: __alloc_pages: 0-order allocation failed
>>>(gfp=0x1d2/0) May  5 18:18:01 earth kernel: VM: killing process sendmail
>>>May  5 18:18:37 earth kernel: __alloc_pages: 0-order allocation failed
>>>(gfp=0x1d2/0) May  5 18:18:37 earth kernel: VM: killing process sendmail
>>>May  5 18:18:37 earth kernel: __alloc_pages: 0-order allocation failed
>>>(gfp=0x1d2/0) May  5 18:18:38 earth kernel: __alloc_pages: 0-order
>>>allocation failed (gfp=0x1d2/0) May  5 18:18:38 earth kernel: VM: killing
>>>process sh
>>>May  5 18:19:01 earth kernel: __alloc_pages: 0-order allocation failed
>>>(gfp=0x1d2/0) May  5 18:19:01 earth kernel: VM: killing process java
>>>May  5 18:19:48 earth kernel: __alloc_pages: 0-order allocation failed
>>>(gfp=0x1d2/0) May  5 18:19:48 earth kernel: __alloc_pages: 0-order
>>>allocation failed (gfp=0x1d2/0) May  5 18:19:48 earth kernel: VM: killing
>>>process smsbox
>>>May  5 18:19:49 earth kernel: __alloc_pages: 0-order allocation failed
>>>(gfp=0x1d2/0) May  5 18:19:49 earth kernel: VM: killing process
>>>off_keyword
>>>May  5 18:19:49 earth kernel: __alloc_pages: 0-order allocation failed
>>>(gfp=0x1d2/0) May  5 18:19:49 earth kernel: VM: killing process java
>>>May  5 18:20:29 earth kernel: __alloc_pages: 0-order allocation failed
>>>(gfp=0x1d2/0) May  5 18:22:27 earth kernel: __alloc_pages: 0-order
>>>allocation failed (gfp=0x1d2/0) May  5 18:22:30 earth kernel:
>>>__alloc_pages: 0-order allocation failed (gfp=0x1d2/0) May  5 18:22:30
>>>earth kernel: VM: killing process smsbox
>>>May  5 18:22:31 earth kernel: __alloc_pages: 0-order allocation failed
>>>(gfp=0x1d2/0) May  5 18:22:31 earth kernel: VM: killing process smsbox
>>>May  5 18:22:55 earth kernel: __alloc_pages: 0-order allocation failed
>>>(gfp=0x1d2/0) May  5 18:22:55 earth kernel: VM: killing process modprobe
>>>May  5 18:22:56 earth kernel: __alloc_pages: 0-order allocation failed
>>>(gfp=0x1d2/0) May  5 18:23:14 earth kernel: __alloc_pages: 0-order
>>>allocation failed (gfp=0x1d2/0) May  5 18:23:14 earth kernel: VM: killing
>>>process sh
>>>May  5 18:23:14 earth kernel: __alloc_pages: 0-order allocation failed
>>>(gfp=0x1d2/0) May  5 18:23:14 earth kernel: VM: killing process
>>>dlr_handler_236 May  5 18:23:15 earth kernel: __alloc_pages: 0-order
>>>allocation failed (gfp=0x1d2/0) May  5 18:23:16 earth last message
>>>repeated 2 times
>>>May  5 18:23:16 earth kernel: VM: killing process smsbox
>>>May  5 18:23:16 earth kernel: __alloc_pages: 0-order allocation failed
>>>(gfp=0x1d2/0) May  5 18:23:16 earth kernel: VM: killing process crond
>>>May  5 18:23:16 earth kernel: __alloc_pages: 0-order allocation failed
>>>(gfp=0x1d2/0) May  5 18:23:16 earth kernel: VM: killing process
>>>dlr_handler_236 May  5 18:23:16 earth kernel: __alloc_pages: 0-order
>>>allocation failed (gfp=0x1d2/0) May  5 18:23:16 earth kernel: VM: killing
>>>process smsbox
>>>May  5 18:23:16 earth kernel: __alloc_pages: 0-order allocation failed
>>>(gfp=0x1d2/0) May  5 18:23:16 earth kernel: VM: killing process java
>>>May  5 18:30:38 earth sshd[25651]: Accepted publickey for nsadmin from
>>>192.168.6.70 port 59842 ssh2 May  5 18:36:10 earth sshd[2006]: Accepted
>>>publickey for nsadmin from 192.168.6.70 port 59877 ssh2 May  5 18:38:17
>>>earth sshd[5160]: Accepted password for nsadmin from 192.168.6.35 port
>>>1964 ssh2 --
>>>The Toronto Linux Users Group.      Meetings: http://tlug.ss.org
>>>TLUG requests: Linux topics, No HTML, wrap text below 80 columns
>>>How to UNSUBSCRIBE: http://tlug.ss.org/subscribe.shtml
>>>      
>>>
>  
>
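
To see which processes the kernel killed most often in a quoted log like the one above, a short script can tally them. This is only a sketch: the function name count_killed is made up for illustration, and it assumes the log is pasted as it appears in this thread, with '>' quote markers and lines rewrapped by the mail client.

```python
import re
from collections import Counter

def count_killed(quoted_log):
    """Count 'VM: killing process NAME' entries in a quoted, rewrapped log."""
    # Drop leading '>' quote markers and spaces from each line.
    lines = [line.lstrip("> ") for line in quoted_log.splitlines()]
    # Undo the rewrapping: collapse all whitespace, including the line
    # breaks that split "killing" from "process", into single spaces.
    flat = " ".join(" ".join(lines).split())
    return Counter(re.findall(r"VM: killing process (\S+)", flat))

if __name__ == "__main__":
    import sys
    for name, hits in count_killed(sys.stdin.read()).most_common():
        print("%4d  %s" % (hits, name))
```

Saved as, say, killed.py, it reads the pasted log on standard input and prints the victims in descending order, which gives a quick hint about what was running when memory ran out.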


-- 
Let one walk alone,
committing no sin with few wishes,
like an elephant in the forest.
-- ghost in the shell 2: Innocence

