memory overcommitment

Lennart Sorensen lsorense-1wCw9BSqJbv44Nm34jS7GywD8/FfD2ys at public.gmane.org
Mon Jan 30 15:24:35 UTC 2012


On Sun, Jan 29, 2012 at 01:12:27AM -0500, D. Hugh Redelmeier wrote:
> When Jim Mercer spoke at our meeting earlier this month, one Linux feature 
> that had him shaking his head was memory overcommitment.
> 
> Normally, Linux will effectively promise processes more memory than it can 
> deliver.  This is called overcommitment.
> 
> Why would this be a good idea?  Because many processes don't use the worst 
> case of what they can, and usually not all running processes are 
> simultaneously using their maximal amount.
> 
> Why would this be a bad idea?  Because failure (i.e. running out of memory 
> and swap space to support the requirements) is hard to recover from and it 
> is not connected in any obvious way with the requests that caused the 
> problem.  The way Linux recovers is the "OOM killer" -- a mechanism 
> that kills somewhat arbitrarily chosen processes in the hope that 
> memory requirements become small enough to handle.
> 
> The OOM killer's behaviour makes it extremely hard to reason about how a 
> system will perform.
> 
> You can tell Linux not to overcommit.  That sounds easy and reasonable.  I 
> didn't know why you would not want to do this.
> 
> Here's an interesting article that addresses the issue: 
> <http://www.quora.com/What-are-the-disadvantages-of-disabling-memory-overcommit-in-Linux>
> 
> If I read this right, it seems as if Linux has a cute generalization that 
> makes it expensive to avoid overcommitment.
> 
> Copy On Write (COW) is really neat.  Two processes created by forking can 
> share all their memory until one writes to it.  Then the page that is 
> written to must be copied so that each process has its own copy.
> 
> With COW, a fork causes almost no memory use but writes afterwards do.  To 
> properly account for this without overcommitment, the system would have to 
> account for complete copying on fork, even though that is probably way too 
> generous.
> 
> But: in a properly designed system, much of the memory of a process should 
> be read-only and never have to be copied: the program code itself and any 
> constants.  Reading between the lines, it sounds as if Linux doesn't 
> enforce this.

x86 had no way to fully enforce this.  x86_64 added support for marking
a page no-execute (data only).  x86 could enforce read-only on code
pages, but it couldn't prevent a program from executing a data page.
Of course, since some programs thought self-modifying code was a good
idea, I am not sure Linux enforced read-only code pages in the past, or
whether it even does now.

> I wonder if it is true.
> 
> Even if this were fixed, fork needs to account for duplication of all the 
> heap and stack.  If, as is often the case, the next serious thing that the 
> child process does is exec, the allocation of the space for the child's 
> heap and stack is a wicked waste.  Another argument for the abomination 
> vfork, which recognized this idiom.
> 
> Reading between the lines in these documents, I'm guessing that glibc's 
> malloc implementation assumes that allocating memory that isn't used 
> doesn't have a cost.  This is not true if overcommit is forbidden.

Not really glibc's problem when an application requests 100MB and only
needs 1MB.  The cases where glibc chooses to serve small allocations
from a heap rather than bother the kernel for each malloc are
different, but I don't think those are the cases that really matter in
general.

> Still a bit of a puzzle to me.
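For anyone who wants to experiment, the policy being discussed lives in
the vm.overcommit_memory sysctl:

```shell
# 0 = heuristic overcommit (the default), 1 = always overcommit,
# 2 = strict accounting (refuse allocations past the commit limit)
cat /proc/sys/vm/overcommit_memory

# Switch to strict accounting (as root).  The commit limit is then
# swap + overcommit_ratio percent of RAM; see CommitLimit in
# /proc/meminfo.
sysctl -w vm.overcommit_memory=2
```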

-- 
Len Sorensen
--
The Toronto Linux Users Group.      Meetings: http://gtalug.org/
TLUG requests: Linux topics, No HTML, wrap text below 80 columns
How to UNSUBSCRIBE: http://gtalug.org/wiki/Mailing_lists
