Q: Mailbox format

Taavi Burns taavi-LbuTpDkqzNzXI80/IeQp7B2eb7JE58TQ at public.gmane.org
Wed Apr 28 01:37:57 UTC 2004


On Tue, Apr 27, 2004 at 03:49:43PM -0400, Henry Spencer wrote:
> On Tue, 27 Apr 2004, Taavi Burns wrote:
> > It could...but then you'd either have to go through huge data
> > transformations in memory and ensure that it syncs with disk all the
> > time...
> 
> Not really.  With careful thought, upward-compatible upgrades are often
> possible.  (Not always, but often.)  Data format often can stay exactly the
> same during improvements.

Cf. the ext3 journal, which is a hidden but otherwise ordinary file on
an ext2 partition.

> For example, maintaining on-disk consistency in a traditional filesystem
> design does *NOT*, repeat *NOT*, require that changes be written out
> immediately -- it only requires that changes be written out in the right
> order.  More precisely, it requires that the order in which changes are
> written out must observe certain constraints.  Tracking the order constraints
> is more complicated than writing things out immediately, but it's not
> prohibitive, and performance is vastly better with no change in on-disk data
> format whatsoever.  "Work smarter, not harder."

I totally agree.  But I ask if it may not be more worthwhile to work
smarter AND less at the same time, by not having to do monstrous data
conversions.
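
To make the ordering point above concrete, here is a minimal sketch in
Python.  It is only an illustration, not how any real filesystem is
implemented: the block names and the dependency rules ("data before the
inode that points at it, inode before the directory entry that names it")
are assumptions picked for the example, and flush_order is a hypothetical
helper.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def flush_order(depends_on):
    """Return one write order in which every block is written only
    after all the blocks it depends on.

    depends_on maps a block name to the set of blocks that must reach
    the disk before it.
    """
    return list(TopologicalSorter(depends_on).static_order())


# Hypothetical pending writes for creating a single file.  Nothing has
# to be written immediately; the writes only have to go out in an order
# that respects these constraints.
pending = {
    "data_block": set(),
    "inode": {"data_block"},   # inode points at the data block
    "dir_entry": {"inode"},    # directory entry points at the inode
}

order = flush_order(pending)
print(order)
```

The on-disk format is untouched here; only the bookkeeping about what
may be flushed when is new.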

> Even when on-disk format changes are needed, cleverness often permits doing
> them in an upward-compatible way.

Clever can easily degenerate into stupid, if not kept under tight reins.

> > Traditional UNIX filesystems just don't handle thousands of files very well,
> > since they store directories as flat files.  Really, that's silly...
> 
> Rather, it's a tradeoff that doesn't scale up well.  But fixing it can
> usually be done without tearing everything up and starting over.  Just takes
> some effort and some intelligence. 

A tradeoff in what form?  Why do we bother storing directories as unsorted
lists of filenames and inode numbers?  It can't even be excused as being as
legible as something like XML.  Granted, it's easier to parse than a
B+ tree, but how often does someone need to extract such low-level
information from a drive that it's worth all the extra work in the common
case?
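
A rough sketch of the difference, in Python.  The sorted list with
binary search is only a stand-in for an indexed directory; it is not a
B+ tree and not any filesystem's actual on-disk layout, and the
function names and sample entries are made up for the example.

```python
import bisect


def lookup_linear(entries, name):
    """Scan an unsorted list of (filename, inode) pairs: O(n) per lookup."""
    for entry_name, inode in entries:
        if entry_name == name:
            return inode
    return None


def build_index(entries):
    """Sort the (filename, inode) pairs once so lookups can bisect."""
    return sorted(entries)


def lookup_sorted(index, name):
    """Binary-search the sorted pairs: O(log n) per lookup."""
    # (name,) sorts just before any (name, inode) pair with that name.
    i = bisect.bisect_left(index, (name,))
    if i < len(index) and index[i][0] == name:
        return index[i][1]
    return None


# Hypothetical directory contents.
entries = [("zeta", 12), ("alpha", 7), ("mbox", 42)]
index = build_index(entries)
print(lookup_linear(entries, "mbox"), lookup_sorted(index, "mbox"))
```

With a few entries the two are indistinguishable; with thousands of
files per directory the linear scan is paid on every lookup.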

> Forcing people to abandon their old filesystems, tools, etc. and start over
> just to get performance gains is stupid, inconsiderate, and usually
> unnecessary. 

Reiser4 proposes far more than performance gains.  Filesystem activities
become atomic operations, somewhat like a row update in a database.  Why
is that something interesting for a filesystem to do?  I'm not personally
sure but, as the saying goes, "If you build it, they will come."

I'm sure that those developing mailservers will like that property a lot.
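
For comparison, this is roughly what an application has to do today to
get an all-or-nothing file update on a POSIX filesystem.  It is a sketch
of the usual write-to-a-temporary-file-then-rename trick, not Reiser4's
interface; atomic_write is a hypothetical helper name.

```python
import os
import tempfile


def atomic_write(path, data):
    """Replace the file at path with data (bytes) so that readers see
    either the old contents or the new contents, never a partial write."""
    directory = os.path.dirname(os.path.abspath(path))
    # The temporary file must live on the same filesystem as the target,
    # otherwise the final rename cannot be atomic.
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()
            os.fsync(tmp.fileno())  # push the data to disk before renaming
        os.replace(tmp_path, path)  # atomic rename over the old file
    except BaseException:
        # Clean up the temporary file if anything failed before the rename.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
```

A mail server delivering a message could call atomic_write(path, message)
and never expose a half-written file.  A filesystem with atomic
operations built in would make this dance unnecessary.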

Nor is anyone being forced to abandon their old filesystems.  They
can keep using them, just as people can quite easily (well, insofar as
you can find an installer that doesn't run out of RAM) install Linux
on a 486 and have a decent web browser or X console.  Really, filesystems
are utilities anyway; users should not be concerned with their operation
or even existence: they should just work fast, and work well.

-- 
taa

    We know today that the brain we are born with is
    not the finished product it was once thought to be.
    The structuring of the brain depends very much on
    experiences gone through in the first hours, days,
    and weeks of a person's life.
          --Alice Miller, Ph.D.
/*eof*/
--
The Toronto Linux Users Group.      Meetings: http://tlug.ss.org
TLUG requests: Linux topics, No HTML, wrap text below 80 columns
How to UNSUBSCRIBE: http://tlug.ss.org/subscribe.shtml




