[GTALUG] On the subject of backups.

Alvin Starr alvin at netvel.net
Mon May 4 11:02:37 EDT 2020


I am not quite sure where the breaking point is, but I think part of the 
problem is that the directories start to get big.
The directory hierarchy is only 5 to 10 levels deep.

It's running on XFS.
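
Just as a rough sketch (the /data path here is made up), checking inode
usage and the entry counts at the top of the tree is one way to see where
it got heavy:

    # overall inode usage on the filesystem
    df -i /data
    # entry counts directly under each top-level directory, biggest last
    for d in /data/*/; do
        echo "$(find "$d" -maxdepth 1 | wc -l) $d"
    done | sort -n | tail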


On 5/4/20 10:38 AM, ac wrote:
> Hi Alvin,
>
> On a 2TB dataset with +-600k files, I have piped tree to less with
> limited joy; it took a few hours, but at least I could search for
> what I was looking for. 15TB and 100M files is another animal, though,
> and since disk I/O will be your bottleneck, anything will take long, no?
>
> Now, for my own info/interest, can you tell me which fs is used for
> this? ext3?
>
> On Mon, 4 May 2020 09:55:51 -0400
> Alvin Starr via talk <talk at gtalug.org> wrote:
>> I am hoping someone has seen this kind of problem before and knows of
>> a solution.
>> I have a client whose file systems are filled with lots of small files,
>> on the order of hundreds of millions of them.
>> Running something like a find on the filesystem takes the better part of
>> a week, so any kind of directory-walking backup tool will take even
>> longer to run.
>> The actual data size for 100M files is on the order of 15TB, so there
>> is a lot of data to back up, but the data only grows on the order of
>> tens to hundreds of MB a day.
>>
>>
>> Even things like xfsdump take a long time.
>> For example, I tried xfsdump on a 50M-file set and it took over 2 days
>> to complete.
>>
>> The only thing that seems workable is Veeam.
>> It will run an incremental volume snapshot in a few hours each night,
>> but I dislike adding proprietary kernel modules to the systems.
>>
>>
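
The non-proprietary version of what Veeam is doing would be roughly a
volume snapshot plus an incremental dump. This is only a sketch, the
volume group, mount point, sizes and labels are invented, and whether it
beats a directory walk at 100M inodes is exactly the question:

    # take a point-in-time snapshot of the data volume (names made up)
    lvcreate --snapshot --size 50G --name data-snap /dev/vg0/data
    # an XFS snapshot carries the same UUID, so mount it nouuid, read-only
    mount -o ro,nouuid /dev/vg0/data-snap /mnt/data-snap
    # level 0 once, then nightly level 1 dumps pick up only changed inodes
    xfsdump -l 1 -L nightly -M backup1 -f /backup/data.level1 /mnt/data-snap
    umount /mnt/data-snap
    lvremove -f /dev/vg0/data-snap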

-- 
Alvin Starr                   ||   land:  (647)478-6285
Netvel Inc.                   ||   Cell:  (416)806-0133
alvin at netvel.net              ||


