database vs filesystem performance

Francois Ouellette fouellet-cpI+UMyWUv9BDgjK7y7TUQ at public.gmane.org
Mon Aug 8 01:27:34 UTC 2005


----- Original Message ----- 
From: "Marc Lijour" <marc-bbkyySd1vPWsTnJN9+BGXg at public.gmane.org>
To: <tlug-lxSQFCZeNF4 at public.gmane.org>
Sent: Sunday, 07 August, 2005 21:04
Subject: Re: [TLUG]: database vs filesystem performance


On August 7, 2005 20:52, Francois Ouellette wrote:
> ----- Original Message -----
> From: "Marc Lijour" <marc-bbkyySd1vPWsTnJN9+BGXg at public.gmane.org>
> To: <tlug-lxSQFCZeNF4 at public.gmane.org>
> Sent: Sunday, August 07, 2005 7:50 PM
> Subject: [TLUG]: database vs filesystem performance
>
> > Does somebody know how the performance of the filesystem compares with
> > that of an RDBMS?
> >
> > Thanks
>
> Bonjour Marc,
>
> Not an easy thing to compare: an RDBMS is immensely more complex than a
> filesystem, which it uses as the physical repository of its data. Some
> products use raw devices and bypass the O/S filesystem; others use plain
> files which contain the indexing information and/or the data itself,
> usually organized in "pages", which are large records containing one or
> more tuples of a table. Most RDBMSs use large data caches to reduce the
> number of I/O operations to and from the physical files. Plus, with an
> RDBMS we have the transactional aspect of processing the data, which adds
> further processing and disk operations every time we want to update some
> data.
>
> My experience with very large applications is that reading and processing
> lots of data (i.e. hundreds of thousands of records) was thousands of
> times faster with flat files than with an RDBMS. But it is not really
> practical to compare the functionality of an RDBMS with simple file
> operations.
>
>   François Ouellette
> <fouellet-cpI+UMyWUv9BDgjK7y7TUQ at public.gmane.org>
>
>
>Bonjour François!
>
>I know my question is quite vague. Let me try to give a few more details.
>I am working on an application which pulls data from a cache at a very fast
>rate (configurable, but it could be up to 3000 times per second), so
>performance, in terms of speed, is an issue. The data itself is binary.
>Some people suggested converting it into strings and storing it in the
>database (MS-SQL). I have heard that databases are not that good with
>binary data, and also that the filesystem has been designed to handle files
>efficiently. I wonder which is best: storing this data in the database
>(as a string or as a BLOB), or dumping it into one big file (or possibly
>breaking it into a sequence of files). The journaling filesystem would make
>this resilient to a system crash, right?

You may be able to read 3000+ records per second from an RDBMS once the data
has been read into its server cache.
However, as you say, RDBMSs are not very good at storing binary data apart
from the usual integer and float formats.
BLOBs are a possibility, but some RDBMSs store them as single flat files too!
Unless you really have to use an RDBMS because of some requirement of the
application, my suggestion would be to stick to flat files.
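
As a minimal sketch of the flat-file approach, assuming fixed-size binary
records (the 64-byte record size and the helper names below are hypothetical
and not from this thread), records can be written back to back and then
fetched by index with a single seek:

```python
import os
import struct

RECORD_SIZE = 64  # assumption: every record is exactly 64 bytes


def write_records(path, records):
    """Write fixed-size binary records back to back into one flat file."""
    with open(path, "wb") as f:
        for payload in records:
            if len(payload) != RECORD_SIZE:
                raise ValueError("each record must be RECORD_SIZE bytes")
            f.write(payload)
        # Push the data to disk before returning; a journaling filesystem
        # alone does not guarantee that buffered application data is on disk.
        f.flush()
        os.fsync(f.fileno())


def read_record(f, index):
    """Return record number `index` from an open binary file object."""
    f.seek(index * RECORD_SIZE)
    data = f.read(RECORD_SIZE)
    if len(data) != RECORD_SIZE:
        raise IndexError("record index out of range")
    return data


def make_record(number):
    """Build a sample record: a little-endian uint32, then zero padding."""
    return struct.pack("<I", number) + bytes(RECORD_SIZE - 4)
```

Keeping the file open between calls to read_record() avoids an open/close
per lookup, and repeated reads will normally be served from the O/S page
cache rather than from disk.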

My experience with the journaled file systems available on most UNIX
platforms, and on Linux (reiserfs and ext3), has so far been positive. The
best I have worked with was the Advanced File System of the declining Tru64
system, originally developed for DEC OSF/1.

  François Ouellette
<fouellet-cpI+UMyWUv9BDgjK7y7TUQ at public.gmane.org>




--
The Toronto Linux Users Group.      Meetings: http://tlug.ss.org
TLUG requests: Linux topics, No HTML, wrap text below 80 columns
How to UNSUBSCRIBE: http://tlug.ss.org/subscribe.shtml




