database vs filesystem performance

Mon Aug 8 01:34:03 UTC 2005

On August 7, 2005 21:27, Francois Ouellette wrote:
> ----- Original Message -----
> From: "Marc Lijour" <marc-bbkyySd1vPWsTnJN9+BGXg at public.gmane.org>
> To: <tlug-lxSQFCZeNF4 at public.gmane.org>
> Sent: Sunday, 07 August, 2005 21:04
> Subject: Re: [TLUG]: database vs filesystem performance
>
> On August 7, 2005 20:52, Francois Ouellette wrote:
> > ----- Original Message -----
> > From: "Marc Lijour" <marc-bbkyySd1vPWsTnJN9+BGXg at public.gmane.org>
> > To: <tlug-lxSQFCZeNF4 at public.gmane.org>
> > Sent: Sunday, August 07, 2005 7:50 PM
> > Subject: [TLUG]: database vs filesystem performance
> >
> > > Does anybody know how the performance of a filesystem compares with
> > > that of an RDBMS?
> > >
> > > Thanks
> >
> > Bonjour Marc,
> >
> > Not an easy thing to compare: an RDBMS is immensely more complex than a
> > filesystem, which it uses as the physical repository of the data.
> > Some products use raw devices and bypass the O/S filesystem; others use
> > plain files which contain the indexing information and/or the data
> > itself, usually organized in "pages", which are large records containing
> > one or more tuples of a table. Most RDBMSs use large data caches to
> > reduce the number of I/O operations to/from the physical files. Plus,
> > with an RDBMS we have the transactional aspect of processing the data,
> > which adds further processing and disk operations every time we want
> > to update some data.
> >
> > My experience with very large applications is that reading and
> > processing lots of data (i.e. hundreds of thousands of records) was
> > thousands of times faster with flat files than with an RDBMS. But it
> > is not really practical to compare the functionality of an RDBMS with
> > simple file operations.
> >
> >   François Ouellette
> > <fouellet-cpI+UMyWUv9BDgjK7y7TUQ at public.gmane.org>
> >
> >
> >Bonjour François!
> >
> >I know my question is quite vague. Let me add a few more details.
> >
> >I am working on an application which pulls data from a cache at a very
> >fast rate (configurable, but it could be up to 3000 times per second),
> >so performance, in terms of speed, is an issue. The data itself is
> >binary. Some people suggested converting it into strings and storing
> >it in the database (MS-SQL). I have heard that databases are not that
> >good with binary data, and also that the filesystem has been designed
> >to handle files efficiently. I wonder which is best: storing this data
> >in the database (possibly as a string or as a BLOB), or dumping it into
> >one big file (or possibly breaking it into a sequence of files). A
> >journaling filesystem would make this resilient to a system crash,
> >right?
>
> You may be able to read 3000+ records per second from an RDBMS once the
> data has been read into its server cache.
> However, as you say, RDBMSs are not very good at storing binary data
> apart from the usual integer and float formats.
> BLOBs are a possibility, but some RDBMSs store them as single flat files
> too! Unless you really have to use an RDBMS because of some requirement
> of the application, my suggestion would be to stick to flat files.
>
> My experience with the journaled file systems available on most UNIX
> platforms and on Linux (reiserfs and ext3) has so far been positive. The
> best I have worked with was the Advanced File System of the declining
> Tru64 system, originally developed for DEC-OSF.

Do you know how it compares with MS NTFS?
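
For what it's worth, the flat-file option discussed above can be sketched
in a few lines of Python. This is only an illustration under assumptions
that are not in the thread: the 64-byte record layout, the file name
cache.bin and the helper names are all made up, and the real data format
may well differ. It shows fixed-size binary records appended to one file,
flushed to disk, and read back by seeking straight to an offset.

```python
import os
import struct
import tempfile

# Hypothetical layout: 8-byte sequence number, 2-byte payload length,
# 54-byte payload slot (NUL-padded). "<" means no alignment padding.
RECORD = struct.Struct("<QH54s")
MAX_PAYLOAD = 54


def append_record(f, seq, payload):
    """Append one fixed-size record to an open binary file."""
    if len(payload) > MAX_PAYLOAD:
        raise ValueError("payload too large for the fixed record size")
    f.write(RECORD.pack(seq, len(payload), payload))


def read_record(f, index):
    """Seek directly to record number `index` and unpack it."""
    f.seek(index * RECORD.size)
    seq, length, slot = RECORD.unpack(f.read(RECORD.size))
    return seq, slot[:length]


def demo(n=3000):
    """Write n records to a temporary file, then read the last one back."""
    path = os.path.join(tempfile.mkdtemp(), "cache.bin")
    with open(path, "wb") as f:
        for i in range(n):
            append_record(f, i, b"blob-%d" % i)
        f.flush()
        os.fsync(f.fileno())  # ask the OS to push the data to disk
    with open(path, "rb") as f:
        return read_record(f, n - 1)
```

Storing the payload length in each record keeps trailing zero bytes in
the binary data intact, and the fixed record size is what makes the
direct seek possible. Whether this beats a database at the rates
mentioned would have to be measured on the real workload.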
--
The Toronto Linux Users Group.      Meetings: http://tlug.ss.org
TLUG requests: Linux topics, No HTML, wrap text below 80 columns
How to UNSUBSCRIBE: http://tlug.ss.org/subscribe.shtml