database vs filesystem performance

Francois Ouellette fouellet-cpI+UMyWUv9BDgjK7y7TUQ at public.gmane.org
Tue Aug 9 11:54:38 UTC 2005


----- Original Message -----
From: "Marc Lijour" <marc-bbkyySd1vPWsTnJN9+BGXg at public.gmane.org>
To: <tlug-lxSQFCZeNF4 at public.gmane.org>
Sent: Monday, August 08, 2005 8:29 PM
Subject: Re: [TLUG]: database vs filesystem performance


> On August 8, 2005 06:47, Walter Dnes wrote:
> > On Mon, Aug 08, 2005 at 12:37:21AM -0400, Marc Lijour wrote
> >
> > > On August 8, 2005 00:32, Ansar Mohammed wrote:
> > > > It all comes down to the nature of your application and data.
> > > > Is your application read only? Are you modifying data? How large are
> > > > the files and what kind of files are they?
> > >
> > > I am just getting a very fast stream of binary data which I have to
> > > store (fast) with the idea of retrieving it later to process. Hence it
> > > must be indexed in some way, but a coarse-grained indexing should work
> > > (there may be many files).
> >
> >   See http://www.unitedlinux.com/pdfs/whitepaper4.pdf for a discussion
> > on file system limits.  3000 files/second adds up really quickly.
> >
> > [m1800][waltdnes][~] echo $(( 3000 * 3600 * 24 * 365 ))
> > 94608000000
> >
> > [m1800][waltdnes][~] echo $(( 3000 * 3600 * 24 * 366 ))
> > 94867200000
>
> In that case it would make sense to concatenate some of the info...

That was my thought too; you do not want to put every binary stream in its
own file, do you?
How will these things be identified? Does each one have a key or name?
Sometimes we can use a database to store the retrieval information while
leaving the data itself in flat files.
I have seen data warehousing products that work like that.
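
A minimal sketch of that arrangement, assuming Python with SQLite standing
in for the database; the file names and helpers below (streams.dat,
index.db, append_record, fetch_record) are invented for illustration, not
taken from any particular product:

import sqlite3

DATA_FILE = "streams.dat"  # one large flat file holding all binary records
INDEX_DB = "index.db"      # the database keeps only retrieval information

db = sqlite3.connect(INDEX_DB)
db.execute("""CREATE TABLE IF NOT EXISTS records
              (key TEXT PRIMARY KEY, offset INTEGER, length INTEGER)""")

def append_record(key, payload):
    # Append the binary payload to the flat file and record where it went.
    with open(DATA_FILE, "ab") as f:
        offset = f.tell()  # "ab" positions the stream at end-of-file
        f.write(payload)
    db.execute("INSERT INTO records VALUES (?, ?, ?)",
               (key, offset, len(payload)))
    db.commit()

def fetch_record(key):
    # One index lookup, then one seek into the flat file.
    row = db.execute("SELECT offset, length FROM records WHERE key = ?",
                     (key,)).fetchone()
    if row is None:
        raise KeyError(key)
    offset, length = row
    with open(DATA_FILE, "rb") as f:
        f.seek(offset)
        return f.read(length)

Writes stay sequential appends (fast), retrieval is a key lookup plus one
seek, and the filesystem only ever sees a handful of large files instead of
thousands of new files per second.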

  François Ouellette
<fouellet-cpI+UMyWUv9BDgjK7y7TUQ at public.gmane.org>


--
The Toronto Linux Users Group.      Meetings: http://tlug.ss.org
TLUG requests: Linux topics, No HTML, wrap text below 80 columns
How to UNSUBSCRIBE: http://tlug.ss.org/subscribe.shtml