[GTALUG] On the subject of backups.
Alvin Starr
alvin at netvel.net
Mon May 4 14:12:39 EDT 2020
On 5/4/20 1:26 PM, Lennart Sorensen via talk wrote:
> On Mon, May 04, 2020 at 04:38:28PM +0200, ac via talk wrote:
>> Hi Alvin,
>>
>> On a 2TB dataset, with +-600k files, I have piped tree to less with
>> limited joy, it took a few hours and at least I could search for
>> what I was looking for... - 15TB and 100M files is another animal though
>> and as disk i/o will be your bottleneck, anything will take long, no?
>>
>> now, for my own info/interest, can you tell me which fs is used for this
>> ext3?
> Hmm, sounds awful slow.
>
> Just for fun I ran find on one of my drives:
>
> # time find /data | wc -l
> 1825463
> real 3m57.208s
>
> That is with 5.3T used out of 6.0TB.
>
> Running it a second time when it is cached takes 7.7s. Tree takes 14.7s.
>
> Another volume:
> # time find /mythdata | wc -l
> 54972
>
> real 0m1.924s
>
> That is with 15 TB out of 15 TB in use (yes that one always fills up
> for some reason).
>
> Both of those are lvm volumes with ext4 on top of software raid6 using
> 5400rpm WD red drives.
>
> Seems either XFS is unbelievably bad, or there isn't enough RAM to cache
> the filesystem metadata if you are having a problem with 100M files.
> I only have a measly 32GB in my home machine.
I believe the directory hierarchy has a lot to do with the performance.
The listing time seems to be non-linear, although I do not believe it's
an N^2 kind of problem.
I would have said the same as you before I started having to deal with
tens of millions of files.
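For anyone who wants to poke at this themselves, here is a small sketch
(my own, not anything from the thread) that builds the same number of
files in a flat directory versus a nested tree and times a walk over
each, roughly what `find | wc -l` does. The sizes are tiny so it runs
quickly; scale them up to see how layout starts to matter.

```python
import os
import tempfile
import time

def make_flat(root, n):
    """Create n empty files directly inside root."""
    for i in range(n):
        open(os.path.join(root, f"f{i}"), "w").close()

def make_nested(root, fanout, depth):
    """Create a fanout^depth directory tree with one file per leaf dir."""
    dirs = [root]
    for _ in range(depth):
        nxt = []
        for d in dirs:
            for i in range(fanout):
                sub = os.path.join(d, f"d{i}")
                os.mkdir(sub)
                nxt.append(sub)
        dirs = nxt
    for d in dirs:
        open(os.path.join(d, "f"), "w").close()
    return len(dirs)

def count_files(root):
    """Walk the tree (like `find root -type f | wc -l`) and time it."""
    t0 = time.perf_counter()
    total = sum(len(files) for _, _, files in os.walk(root))
    return total, time.perf_counter() - t0

if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as flat, \
         tempfile.TemporaryDirectory() as nested:
        make_flat(flat, 1000)
        make_nested(nested, 10, 3)  # 10^3 = 1000 leaf dirs, one file each
        n_flat, t_flat = count_files(flat)
        n_nested, t_nested = count_files(nested)
        print(f"flat:   {n_flat} files in {t_flat:.3f}s")
        print(f"nested: {n_nested} files in {t_nested:.3f}s")
```

On a cold cache the nested layout costs many more directory reads for
the same file count, which is one plausible reason listing time does not
scale linearly with the number of files alone.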
--
Alvin Starr || land: (647)478-6285
Netvel Inc. || Cell: (416)806-0133
alvin at netvel.net ||