Shared Memory

Lennart Sorensen lsorense-1wCw9BSqJbv44Nm34jS7GywD8/FfD2ys at public.gmane.org
Tue Feb 1 15:29:51 UTC 2005


On Tue, Feb 01, 2005 at 11:17:28AM -0500, John Macdonald wrote:
> On Tue, Feb 01, 2005 at 09:09:11AM -0500, Lennart Sorensen wrote:
> This does put a higher urgency to an application's need
> to ensure that critically ordered I/O actually happens in
> order, though.  If multiple I/Os are issued at the same time
> (the next one started before the previous one has completed)
> the large queue means that there is a greater chance of the
> second being completed significantly earlier than the first.
> A power failure at the wrong time, and that can be catastrophic.
> For example, a database program will often have to ensure that
> a log write is fully complete before it can safely proceed
> with committing a transaction - otherwise it could partially
> process the transaction, have a system crash lose the rest
> *and also prevent the completion of the log write*, and then
> on restart not have the log record to let it know that the
> partially completed transaction has to either be completed or
> rolled back.  (Of course, you'd really want a UPS to prevent
> a power failure from causing an immediate failure.)

This is why disks don't do write caching by default (except some
broken IDE drives): they don't queue writes; they simply complete them
as soon as possible.  Besides, what percentage of disk I/O is reads
vs. writes?  Optimizing reads makes much more sense.

As long as the drive doesn't say it has completed the write until it
actually has, you are OK, which is how it works (as far as I have
understood it).  Sorting the writes, if you get a few at a time, would
actually get your data written to disk faster, making it less likely
that a power failure would cause problems.  But writes should certainly
be ordered ahead of reads as far as I am concerned.
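
As a rough illustration of that sorting (a minimal sketch in C, not how
any particular drive or kernel actually implements it), an
elevator-style scheduler just orders the pending writes by block number
so the head can service them in one sweep:

#include <stdlib.h>

/* A pending write request; the struct and its fields are made up
 * purely for illustration. */
struct request {
    unsigned long block;   /* target block on disk */
    const void   *data;    /* payload to write */
};

/* qsort comparator: ascending block number. */
static int by_block(const void *a, const void *b)
{
    const struct request *ra = a, *rb = b;

    if (ra->block < rb->block)
        return -1;
    if (ra->block > rb->block)
        return 1;
    return 0;
}

/* Sort a batch of queued writes so the head sweeps across the
 * platter once instead of seeking back and forth. */
void elevator_sort(struct request *reqs, size_t n)
{
    qsort(reqs, n, sizeof(*reqs), by_block);
}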

> (An application that carelessly uses overlapping I/O is already
> broken, but with small queues it may have been "getting away"
> with its broken behaviour because there is a small window for
> critical failure; the larger window can make the code that is
> broken in theory be more likely to fail in practice.)

Any application that cares about data integrity does a write, then an
fsync, which will not return until the data is committed (unless the
OS is broken, or you have a defective drive design that does write
caching by default).
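
For example (a minimal sketch; the file name and log record here are
made up), a database appending a commit record to its log would do
roughly this:

#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

int main(void)
{
    /* Hypothetical log record for a committed transaction. */
    const char rec[] = "commit txn 42\n";
    int fd = open("wal.log", O_WRONLY | O_CREAT | O_APPEND, 0644);

    if (fd < 0) {
        perror("open");
        return 1;
    }
    /* Write the log record... */
    if (write(fd, rec, strlen(rec)) != (ssize_t)strlen(rec)) {
        perror("write");
        return 1;
    }
    /* ...then fsync().  fsync() does not return until the data has
     * been committed to the disk (assuming neither the OS nor the
     * drive is lying about completion). */
    if (fsync(fd) != 0) {
        perror("fsync");
        return 1;
    }
    /* Only now is it safe to report the transaction as committed. */
    close(fd);
    return 0;
}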

Lennart Sorensen
--
The Toronto Linux Users Group.      Meetings: http://tlug.ss.org
TLUG requests: Linux topics, No HTML, wrap text below 80 columns
How to UNSUBSCRIBE: http://tlug.ss.org/subscribe.shtml




