database vs filesystem performance

Marc Lijour marc-bbkyySd1vPWsTnJN9+BGXg at public.gmane.org
Mon Aug 8 04:37:21 UTC 2005


On August 8, 2005 00:32, Ansar Mohammed wrote:
> It all comes down to the nature of your application and data.
> Is your application read only? Are you modifying data? How large are the
> files and what kind of files are they?

I am just getting a very fast stream of binary data which I have to store
(fast), with the idea of retrieving it later for processing. Hence it must be
indexed in some way, but coarse-grained indexing should work (many files,
maybe).
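
To make the "many files" idea concrete, here is a minimal sketch in Python,
assuming each record carries a timestamp and that one append-only file per
time bucket is a coarse enough index. The names BUCKET_SECONDS, segment_path,
append_record and read_bucket, the 60-second bucket width, and the record
layout are all made up for illustration; nothing here has been measured at
3000 writes per second.

```python
import os
import struct

# Assumed bucket width in seconds; each bucket gets its own segment file.
BUCKET_SECONDS = 60

# Assumed record header: little-endian float64 timestamp + uint32 length.
HEADER = struct.Struct("<dI")


def segment_path(root, timestamp):
    """Map a timestamp to the segment file that covers its time bucket."""
    bucket = int(timestamp) // BUCKET_SECONDS * BUCKET_SECONDS
    return os.path.join(root, "seg-%d.bin" % bucket)


def append_record(root, timestamp, payload):
    """Append one length-prefixed binary record to its bucket's segment."""
    with open(segment_path(root, timestamp), "ab") as f:
        f.write(HEADER.pack(timestamp, len(payload)) + payload)


def read_bucket(root, timestamp):
    """Return all (timestamp, payload) pairs stored in timestamp's bucket."""
    path = segment_path(root, timestamp)
    records = []
    if not os.path.exists(path):
        return records
    with open(path, "rb") as f:
        while True:
            header = f.read(HEADER.size)
            if len(header) < HEADER.size:
                break  # end of file (or a truncated trailing header)
            ts, size = HEADER.unpack(header)
            records.append((ts, f.read(size)))
    return records
```

The file name is the whole index here: finding the data for a given moment
means opening one segment and scanning it. A real writer would probably keep
the current segment open instead of reopening it for every record; reopening
is only done to keep the sketch short.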

> Database servers are really for accessing, searching and manipulating
> structured data, not for just storing and accessing files. If you are
> storing searchable metadata *about* your files, then the best solution is
> probably a hybrid where the metadata is in the database and the files are
> on a separate filesystem.
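
The hybrid layout described above could look roughly like this. It is only a
sketch, assuming SQLite (through Python's standard sqlite3 module) stands in
for the database and a single append-only file holds the payloads; the table
name chunks, its columns, and the helpers open_index, store and fetch are
hypothetical.

```python
import os
import sqlite3


def open_index(db_path):
    """Open the metadata database, creating the (assumed) schema if needed."""
    conn = sqlite3.connect(db_path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS chunks ("
        " id INTEGER PRIMARY KEY,"
        " label TEXT NOT NULL,"
        " path TEXT NOT NULL,"
        " start_byte INTEGER NOT NULL,"
        " num_bytes INTEGER NOT NULL)"
    )
    conn.execute("CREATE INDEX IF NOT EXISTS chunks_label ON chunks (label)")
    return conn


def store(conn, data_path, label, payload):
    """Append payload to the data file and record where it landed."""
    with open(data_path, "ab") as f:
        f.seek(0, os.SEEK_END)  # make tell() report the true end of file
        start = f.tell()
        f.write(payload)
    with conn:  # commits the insert on success
        conn.execute(
            "INSERT INTO chunks (label, path, start_byte, num_bytes)"
            " VALUES (?, ?, ?, ?)",
            (label, data_path, start, len(payload)),
        )


def fetch(conn, label):
    """Return the payloads stored under label, in insertion order."""
    rows = conn.execute(
        "SELECT path, start_byte, num_bytes FROM chunks"
        " WHERE label = ? ORDER BY id",
        (label,),
    ).fetchall()
    payloads = []
    for path, start, size in rows:
        with open(path, "rb") as f:
            f.seek(start)
            payloads.append(f.read(size))
    return payloads
```

Only the small, searchable part (label, file, byte range) goes through the
database; the binary payload itself is never stored as a string or a BLOB.
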
>
> If you are storing on a filesystem then you will have other disk IO issues,
> such as file fragmentation and locking.
>
> Most caching applications are based on similar principles: a separate index
> and storage area. Squid, ISA, even Internet Explorer ;)
>
> If your files are larger than the pagesize of your database, then I would
> not consider storing them on a database. There would simply be too much IO
> overhead. (remember that the data inside a database also gets fragmented
> and most modern database systems have to perform cleanup on a regular basis
> to ensure that the data remain as contiguous as possible).
>
>
>
>
> -----Original Message-----
> From: owner-tlug-lxSQFCZeNF4 at public.gmane.org [mailto:owner-tlug-lxSQFCZeNF4 at public.gmane.org] On Behalf Of Marc Lijour
> Sent: August 7, 2005 9:05 PM
> To: tlug-lxSQFCZeNF4 at public.gmane.org
> Subject: Re: [TLUG]: database vs filesystem performance
>
> On August 7, 2005 20:52, Francois Ouellette wrote:
> > ----- Original Message -----
> > From: "Marc Lijour" <marc-bbkyySd1vPWsTnJN9+BGXg at public.gmane.org>
> > To: <tlug-lxSQFCZeNF4 at public.gmane.org>
> > Sent: Sunday, August 07, 2005 7:50 PM
> > Subject: [TLUG]: database vs filesystem performance
> >
> > > Does somebody know the compared performance of the filesystem against a
> > > RDBMS?
> > >
> > > Thanks
> >
> > Bonjour Marc,
> >
> > Not an easy thing to compare: an RDBMS is immensely more complex than a
> > filesystem, which it uses in some way as the physical repository of the
> > data. Some products use raw files and bypass the O/S filesystem; others
> > use plain files which contain the indexing information and/or the data
> > itself, usually organized in "pages", which are large records containing
> > one or more tuples of a table. Most RDBMSs use large caches of data to
> > try to reduce the number of I/O operations to/from the physical files.
> > Plus, with an RDBMS we have the transactional aspect of processing the
> > data, which adds another area of processing and disk operations every
> > time we want to update some data.
> >
> > My experience with very large applications is that reading and
> > processing lots of data (i.e. hundreds of thousands of records) was
> > thousands of times faster with flat files than with an RDBMS. But it is
> > not really practical to try to compare the functionality of an RDBMS
> > with simple file operations.
> >
> >   François Ouellette
> > <fouellet-cpI+UMyWUv9BDgjK7y7TUQ at public.gmane.org>
>
> Bonjour François!
>
> I know my question is quite vague. Let me try to give a little more detail.
> I am working on an application which pulls data from a cache at a very fast
> rate (configurable, but it could be up to 3000 times per second), so
> performance, in terms of speed, is an issue. The data itself is binary.
> Some people suggested converting it into strings and storing it in the
> database (MS-SQL). I heard that databases are not that good with binary
> data, and also that the filesystem has been designed to handle files
> efficiently. I wonder which is best: storing this data in the database
> (possibly as a string or as a BLOB), or dumping it into one big file (or
> possibly splitting it across a sequence of files). A journaling filesystem
> would make this resilient to a system crash, right?
> --
> The Toronto Linux Users Group.      Meetings: http://tlug.ss.org
> TLUG requests: Linux topics, No HTML, wrap text below 80 columns
> How to UNSUBSCRIBE: http://tlug.ss.org/subscribe.shtml




