[GTALUG] On the subject of backups.

Greg Martyn greg.martyn at gmail.com
Wed May 6 20:16:28 EDT 2020


I haven't used Gluster personally, but have you tried
turning performance.parallel-readdir on?
https://docs.gluster.org/en/latest/release-notes/3.10.0/#implemented-parallel-readdirp-with-distribute-xlator

It seems there's a reason why it's off by default
(https://www.spinics.net/lists/gluster-devel/msg25518.html), but maybe
it'd still be worth it for you?
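
If you want to give it a shot, the release notes linked above show
enabling readdir-ahead first and then parallel-readdir. Something like
this should be roughly right (MYVOL is a placeholder for your volume
name; I haven't tested this myself, so treat it as a sketch, not a
recipe):

  # parallel-readdir needs readdir-ahead enabled to take effect
  gluster volume set MYVOL performance.readdir-ahead on
  gluster volume set MYVOL performance.parallel-readdir on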

On Mon, May 4, 2020 at 9:55 AM Alvin Starr via talk <talk at gtalug.org> wrote:

>
> I am hoping someone has seen this kind of problem before and knows of a
> solution.
> I have a client whose filesystems are filled with lots of small files,
> on the order of hundreds of millions of files.
> Running something like a find on the filesystem takes the better part
> of a week, so any kind of directory-walking backup tool will take even
> longer to run.
> The actual data size for 100M files is on the order of 15 TB, so there
> is a lot of data to back up, but it only grows by tens to hundreds of
> MB a day.
>
>
> Even things like xfsdump take a long time.
> For example, I tried xfsdump on a 50M-file set and it took over two
> days to complete.
>
> The only thing that seems workable is Veeam.
> It will run an incremental volume snapshot in a few hours a night, but
> I dislike adding proprietary kernel modules to the systems.
>
>
> --
> Alvin Starr                   ||   land:  (647)478-6285
> Netvel Inc.                   ||   Cell:  (416)806-0133
> alvin at netvel.net              ||
>
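
Also, on the "find takes the better part of a week" point: if a single
sequential walk is the bottleneck, fanning the walk out across the
top-level directories can sometimes help, since each worker gets its own
stream of readdir/stat calls. A rough sketch (the path, the -P 8
parallelism, and the timestamp file are all placeholders I made up):

  # walk each top-level directory in parallel, listing files changed
  # since the last backup (touch the stamp file after each run)
  find /data -mindepth 1 -maxdepth 1 -type d -print0 \
    | xargs -0 -P 8 -I{} find {} -newer /var/tmp/last-backup-stamp -print

No idea how well that behaves against a Gluster mount, though, so
definitely benchmark before trusting it.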