Shared Memory

Taavi Burns jaaaarel-Re5JQEeQqe8AvxtiuMwx3w at public.gmane.org
Tue Feb 1 15:34:03 UTC 2005


On Tue, 1 Feb 2005 11:17:28 -0500, John Macdonald <john-Z7w/En0MP3xWk0Htik3J/w at public.gmane.org> wrote:
> This does give a higher urgency to an application's need
> to ensure that critically ordered I/O actually happens in
> order, though.  If multiple I/Os are issued at the same time
> (the next one started before the previous one has completed)
> the large queue means that there is a greater chance of the
> second being completed significantly earlier than the first.

For read operations this is never a problem (unless the program has
been written in a silly way): reads don't change any on-disk state, so
their completion order only matters to the program that issued them,
and it can simply wait for each result before using it.

> A power failure at the wrong time, and that can be catastrophic.
> For example, a database program will often have to ensure that
> a log write is fully complete before it can safely proceed
> with committing a transaction - otherwise it could partially
> process the transaction, have a system crash lose the rest
> *and also prevent the completion of the log write*, and then
> on restart not have the log record to let it know that the
> partially completed transaction has to either be completed or
> rolled back.  (Of course, you'd really want a UPS to prevent
> a power failure from causing an immediate failure.)

That is why the database application will request a flush of the disk
cache (including all pending writes) at sanity checkpoints.  This
applies not only to database applications, but also to journalled and
ordered-write filesystems.

> (An application that carelessly uses overlapping I/O is already
> broken, but with small queues it may have been "getting away"
> with its broken behaviour because there is a small window for
> critical failure; the larger window can make the code that is
> broken in theory be more likely to fail in practice.)

If it was simply "getting away" with it under the smaller queues, then
it truly was asking to die at any time from a freak accident.  Games
might be shipped like this (I've heard of savegames and config saves
being corrupted when power fluctuations occur), but any robust piece
of software that handles potentially valuable user data must ensure,
so far as it is able, that things happen in the correct order.
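For the savegame/config case the usual defence is the
write-to-temp-then-rename pattern.  A rough sketch in C (the names
are hypothetical, and it assumes both files live in the current
directory): the new contents are fsync()'d before the rename, and the
directory is fsync()'d after it, so a crash leaves either the
complete old file or the complete new one, never a torn mixture:

    #include <fcntl.h>
    #include <stdio.h>
    #include <unistd.h>

    static int save_file(const char *path, const char *tmp,
                         const char *data, size_t len)
    {
        int fd, dfd, rc;

        fd = open(tmp, O_WRONLY | O_CREAT | O_TRUNC, 0644);
        if (fd < 0)
            return -1;
        if (write(fd, data, len) != (ssize_t)len || fsync(fd) != 0) {
            close(fd);
            return -1;
        }
        close(fd);

        if (rename(tmp, path) != 0)   /* atomic on POSIX systems */
            return -1;

        /* fsync the directory so the rename itself is on disk */
        dfd = open(".", O_RDONLY);
        if (dfd < 0)
            return -1;
        rc = fsync(dfd);
        close(dfd);
        return rc;
    }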

I also suspect that there are certain ordering rules set forth by the hardware
and/or kernel to help ensure data integrity in the general case.  But when in
doubt, fsync().  ;)

-- 
taa
/*eof*/