[GTALUG] On the subject of backups.

Jamon Camisso jamon.camisso at utoronto.ca
Mon May 4 12:31:07 EDT 2020


On 2020-05-04 09:55, Alvin Starr via talk wrote:
> 
> I am hoping someone has seen this kind of problem before and knows of a
> solution.
> I have a client whose file systems are filled with lots of small files,
> on the order of hundreds of millions of them.
> Running something like find on the filesystem takes the better part of a
> week, so any kind of directory-walking backup tool will take even longer
> to run.
> The actual data size for 100M files is on the order of 15TB, so there is
> a lot of data to back up, but the data only grows by tens to hundreds of
> MB a day.
> 
> 
> Even things like xfsdump take a long time.
> For example, I tried xfsdump on a 50M-file set and it took over 2 days
> to complete.
> 
> The only thing that seems workable is Veeam.
> It will run an incremental volume snapshot in a few hours each night,
> but I dislike adding proprietary kernel modules to the systems.

If what you need is a list of the inodes and paths on the filesystem, you
can get it from xfs_db directly:

xfs_db> inode 128
xfs_db> blockget -n
xfs_db> ncheck
        131 dir/.
        132 dir/test2/foo/.
        133 dir/test2/foo/bar
      65664 dir/test1/.
      65665 dir/test1/foo
      65666 dir/test3/foo/.
     142144 dir/test2/.
     142145 dir/test3/foo/bar/.
     142146 dir/test3/foo/bar/baz
     196736 dir/test3/.

I don't know how that will perform relative to something like find though.
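The obvious check would be to time both against the same filesystem,
something like:

time find /mnt/data > /dev/null
time xfs_db -r -c 'blockget -n' -c 'ncheck' /dev/sdb1 > /dev/null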

Cheers, Jamon

