database vs filesystem performance

Ansar Mohammed ansarm-Re5JQEeQqe8AvxtiuMwx3w at public.gmane.org
Mon Aug 8 04:32:00 UTC 2005


It all comes down to the nature of your application and data.
Is your application read-only, or are you modifying data? How large are the
files, and what kind of files are they?

Database servers are really for accessing, searching and manipulating
structured data, not for just storing and retrieving files. If you are storing
searchable metadata *about* your files, then the best solution is probably a
hybrid where the metadata is in the database and the files are on a separate
filesystem.
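
To make the hybrid idea concrete, here is a minimal sketch in Python. It is
an illustration only: SQLite stands in for whatever database you actually use
(MS-SQL in your case), and the table name `files` and the helpers
`open_index`, `store` and `load` are names I made up for the example.

```python
import hashlib
import os
import sqlite3
import tempfile


def open_index(db_path=":memory:"):
    """Open the metadata index (SQLite here, purely as a stand-in)."""
    conn = sqlite3.connect(db_path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS files ("
        "name TEXT PRIMARY KEY, size INTEGER, path TEXT)"
    )
    return conn


def store(conn, root, name, payload):
    """Write the raw bytes to the filesystem, then record metadata in the DB."""
    # Hypothetical naming scheme: the file is named after its content hash.
    digest = hashlib.sha256(payload).hexdigest()
    path = os.path.join(root, digest)
    with open(path, "wb") as f:
        f.write(payload)
    conn.execute(
        "INSERT OR REPLACE INTO files (name, size, path) VALUES (?, ?, ?)",
        (name, len(payload), path),
    )
    conn.commit()
    return path


def load(conn, name):
    """Look the path up in the index, then read the bytes from disk."""
    row = conn.execute(
        "SELECT path FROM files WHERE name = ?", (name,)
    ).fetchone()
    if row is None:
        return None
    with open(row[0], "rb") as f:
        return f.read()


if __name__ == "__main__":
    root = tempfile.mkdtemp()
    conn = open_index()
    store(conn, root, "sample.bin", b"\x00\x01\x02\x03")
    print(load(conn, "sample.bin"))
```

The database only ever sees small, searchable rows; the bulk bytes never pass
through it.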

If you are storing on a filesystem then you will have other disk IO issues,
such as file fragmentation and locking. 

Most caching applications are built on the same principle, a separate index
and storage area: squid, ISA, even Internet Explorer ;)

If your files are larger than the page size of your database, then I would
not consider storing them in a database. There would simply be too much IO
overhead. (Remember that the data inside a database also gets fragmented, and
most modern database systems have to perform cleanup on a regular basis to
keep the data as contiguous as possible.)
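
A rough way to see the page-size point is to count how many pages a single
file would span. The sketch below assumes 8 KB pages (the size MS-SQL uses;
other engines differ) and ignores per-page headers, so treat the numbers as
a lower bound. `pages_needed` is just a name for the example.

```python
import math

PAGE_SIZE = 8192  # assumption: 8 KB pages; check your own database


def pages_needed(file_size, page_size=PAGE_SIZE):
    """Minimum number of pages needed to hold file_size bytes."""
    return math.ceil(file_size / page_size)


if __name__ == "__main__":
    for size in (500, 8192, 64 * 1024, 1024 * 1024):
        print(size, "bytes ->", pages_needed(size), "page(s)")
```

A 1 MB file already spans at least 128 pages, each of which the database has
to allocate, track and read back.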




-----Original Message-----
From: owner-tlug-lxSQFCZeNF4 at public.gmane.org [mailto:owner-tlug-lxSQFCZeNF4 at public.gmane.org] On Behalf Of Marc Lijour
Sent: August 7, 2005 9:05 PM
To: tlug-lxSQFCZeNF4 at public.gmane.org
Subject: Re: [TLUG]: database vs filesystem performance

On August 7, 2005 20:52, Francois Ouellette wrote:
> ----- Original Message -----
> From: "Marc Lijour" <marc-bbkyySd1vPWsTnJN9+BGXg at public.gmane.org>
> To: <tlug-lxSQFCZeNF4 at public.gmane.org>
> Sent: Sunday, August 07, 2005 7:50 PM
> Subject: [TLUG]: database vs filesystem performance
>
> > Does somebody know the compared performance of the filesystem against a
> > RDBMS?
> >
> > Thanks
>
> Bonjour Marc,
>
> Not an easy thing to compare, a RDBMS is immensely more complex than a
> filesystem, which it uses somehow as the physical repository of the data.
> Some products use raw files and bypass the O/S filesystem, some others use
> plain files which contain the indexing information and/or the data itself,
> usually organized in "pages" which are large records containing one or
> more tuples of a table. Most RDBMS use large caches of data to try to
> reduce the amount of I/O operations to/from the physical files. Plus, with
> a RDBMS we have the transactional aspect of processing the data, which adds
> another area of processing and disk operations every time we want to update
> some data.
>
> My experience with very large applications is that reading and processing
> lots of data (i.e. hundreds of thousands of records) was thousands of
> times faster with flat files compared to a RDBMS. But it is not really
> practical to try to compare the functionality of a RDBMS with simple file
> operations.
>
>   François Ouellette
> <fouellet-cpI+UMyWUv9BDgjK7y7TUQ at public.gmane.org>


Bonjour François!

I know my question is quite vague. Let me try to add a few more details.
I am working on an application which pulls data from a cache at a very fast
rate (configurable, but it could be up to 3000 times per second), so
performance, in terms of speed, is an issue. The data itself is binary.
Some people suggested converting it into strings and storing it in the
database (MS-SQL). I have heard that databases are not that good with binary
data and also that the filesystem has been designed to handle files
efficiently. I wonder which is best: storing this data in the database
(possibly as a string or as a BLOB) or dumping it into one big file (or
possibly breaking it into a sequence of files). A journaling filesystem
would make this resilient to a system crash, right?
--
The Toronto Linux Users Group.      Meetings: http://tlug.ss.org
TLUG requests: Linux topics, No HTML, wrap text below 80 columns
How to UNSUBSCRIBE: http://tlug.ss.org/subscribe.shtml
