[GTALUG] What Not To Backup

Blaise Alleyne email+libre at blaise.ca
Tue Dec 27 15:41:59 EST 2016


On 23/12/16 12:11 PM, John Moniz via talk wrote:
> Hi everyone,
> 
> I'm backing up my system on a more regular basis and am trying to fine tune the
> files that I backup. I am looking for advice on what NOT to bother to backup on
> the /home directory.
> 
> I am using rsync (took a long time and lots of trials to figure out the man page
> - and still don't know 90% of it) and presently have the following on my
> exclude_list.txt:
> (Note: multiple items shown on one line are just for readability, each line in
> the file only has one item)
> 
> tmp* TMP*
> .cache* cache* Cache* CACHE* *CACHE *Cache *cache
> .cookies* cookies*
> Trash Trash* TRASH*
> Junk* junk*
> .gvfs
> Backups backups
> Crash*
> .xsession-errors*
> .macromedia
> .thumbnails
> .mozilla/firefox/*/thumbnails
> *.corrupt
> minidumps
> .local/share/gvfs*
> 
> I'd love to exclude things that perhaps one would never use from a backup to
> rebuild a system after an accidental clean wipe of all data.
> 

Personally, I wouldn't be comfortable with so many wildcards in an rsync
exclude. I compiled mine through trial and error, running manual backups
frequently at first and finding directories that constantly had new stuff to
backup with names like tmp or cache and excluding those.
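
A rough sketch of that trial-and-error step, assuming a find that supports -mmin (GNU and BSD find both do). The function name churn_dirs is made up for illustration. It counts recently modified files per directory, so the noisiest directories float to the top as exclude candidates:

```shell
#!/bin/sh
# churn_dirs [DIR] [MINUTES]
# Print up to 20 directories under DIR (default: $HOME), ranked by how many
# files in each were modified within the last MINUTES minutes (default: 60).
# Assumption: file names do not contain newlines.
churn_dirs() {
    dir=${1:-$HOME}
    minutes=${2:-60}
    find "$dir" -xdev -type f -mmin "-$minutes" 2>/dev/null |
        sed 's|/[^/]*$||' |      # strip the file name, keep its directory
        sort | uniq -c |         # count files per directory
        sort -rn | head -n 20    # busiest directories first
}
```

Running `churn_dirs "$HOME" 60` a few times over a day should show which directories keep reappearing.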

Here's my rsync exclude file (it might be a little out of date, as I constructed
it a few years back on an Ubuntu machine, and I'm now using Debian):
# These files are not necessary to backup
*.swp
.cache/
.config/banshee/covers/
.evolution/cache/
.gnash/
.gnome2/epiphany/favicon_cache/
.gnome2/epiphany/mozilla/epiphany/Cache/
.gvfs/
.liferea_1.8/cache/
.local/share/Trash/files/
.macromedia/
.mozilla/firefox*/Profiles/*.default/Cache/
.mozilla/firefox*/Profiles/*.default/*.sqlite
.mozilla/firefox*/Profiles/*.default/weave/
Trash.msf
.mythtv/themecache/
.pulse/
.thumbnails/
.icedove/*.default/Cache/
.icedove/*.default/ImapMail/
.wine/
.wapi
.xchat2/scrollback/
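
Before trusting an exclude file like this, it may help to preview what rsync would still copy. A minimal sketch using rsync's dry-run (-n) and itemize-changes (-i) options against an empty scratch directory; preview_backup is a hypothetical helper name, not part of my setup:

```shell
#!/bin/sh
# preview_backup SRC EXCLUDE_FILE
# List every path under SRC that rsync would transfer once EXCLUDE_FILE is
# applied. Nothing is copied: -n makes this a dry run, and the destination
# is an empty temporary directory that is removed afterwards.
preview_backup() {
    src=$1
    excludes=$2
    scratch=$(mktemp -d) || return 1
    rsync -a -n -i --exclude-from="$excludes" "$src"/ "$scratch"/
    status=$?
    rmdir "$scratch"
    return "$status"
}
```

For example, `preview_backup "$HOME" "$HOME/.rsync.exclude" | less` lets you scan for anything that obviously should not be there.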


Now, some things in there are conscious decisions, like I'm excluding my
IceDove/Thunderbird ImapMail folder because I don't want to constantly rsync my
cache of my ImapMail folders -- I have proper backups directly from my mail
server instead.


I also have a bunch of other custom folders excluded on any given machine,
usually ~/Downloads/ or some kind of directory where I may download large files
like ISOs, which I have no need or desire to push through my backup system.


> Similarly, any recommendations of what I should back up outside of /home? I am
> thinking of things like /etc/fstab, files that would make it easier to recover
> from a crash or to upgrade a distro.
> 

Here's the script I put in /usr/local/bin/backup to run every hour or so on all
my laptops/N900:

#!/bin/sh
START=`date +%s`
LOCAL_HOST='192.168.2.160'
REMOTE_HOST='myhome.domain.tld'
DEST_DIR="backups/thinkpad-x60"  # this would be different on each client


# Try to connect to host locally, otherwise use remote connection
if ssh -q $LOCAL_HOST exit;
then
	HOST=$LOCAL_HOST
else
	HOST=$REMOTE_HOST
fi

echo "========== Backup to $HOST =========="
date

DESTINATION="$HOST:$DEST_DIR"

# Backup home directory (Note: this is only for primary user!)
echo "---------- ${HOME} ----------"
rsync -e ssh -avz --delete --delete-excluded \
	--exclude-from="${HOME}/.rsync.exclude" --numeric-ids --relative \
	"${HOME}" "${DESTINATION}"

# Backup /etc (with its own exclude file) and local scripts
rsync -e ssh -avz --relative --delete \
	--exclude-from="${HOME}/.rsync.etc.exclude" /etc/ "${DESTINATION}/"
rsync -e ssh -avz --relative --delete /usr/local/bin/ "${DESTINATION}/"

# Calculate elapsed time
END=`date +%s`
ELAPSEDTIME=`expr $END - $START`
echo Finished at: `date` - "It took $ELAPSEDTIME seconds"



I can run it manually with `backup`, but I have it set to run every hour.
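
For the hourly run, one option is a crontab entry (cron is an assumption here; any scheduler would do). Since a slow upload could still be running when the next hour comes around, this sketch wraps the job in a simple mkdir-based lock. The names run_locked and backup-hourly, and the lock and log paths, are hypothetical:

```shell
#!/bin/sh
# Hypothetical wrapper script, e.g. saved as /usr/local/bin/backup-hourly and
# called from a crontab entry such as:
#   0 * * * * /usr/local/bin/backup-hourly >> "$HOME/.backup.log" 2>&1

# run_locked LOCKDIR COMMAND [ARGS...]
# Run COMMAND only if LOCKDIR can be created; mkdir is atomic, so two
# overlapping runs cannot both get the lock.
run_locked() {
    lockdir=$1
    shift
    if mkdir "$lockdir" 2>/dev/null; then
        "$@"
        status=$?
        rmdir "$lockdir"
        return "$status"
    fi
    echo "previous run still active, skipping" >&2
    return 0
}
```

The wrapper script would then end with something like `run_locked "$HOME/.backup.lock" /usr/local/bin/backup`. If a run is killed hard, the stale lock directory has to be removed by hand.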

Note that I'm backing up most of /etc/ too. I have an exclude file that leaves out
some stuff there (also trial and error by running the backup manually and seeing
what perhaps wasn't necessary).


Some important notes!

Are there sensitive files you don't want backed up?

As other people have mentioned, this just mirrors your home directory to a
backup server somewhere, which is awesome, but not a real backup because you
can't go back to older versions.

So what I do is I use rsnapshot for versioned backups.

I have a server running at my apartment, and a server running at my parents' place.

- laptops and mobile devices do rsync mirrors to the local server
- the remote server at the opposite place does a nightly rsnapshot of the
backup directories (and other stuff on the servers)

This way, my laptops/mobiles mirror ~hourly to my living room server, but if
anything ever went wrong, I could go back to the last 7 days, last 4 weeks or
last 3-6 months in the rsnapshot backup. And with that being at my parents'
place, it's also in a separate physical location in the event of fire, flood,
theft, etc.

(If there was some kind of nuclear bomb or natural disaster or something that
took out physical locations across Toronto, then I wouldn't be covered, but I
also figure I'd have bigger problems to worry about...)
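
To illustrate how versioned backups like this can stay cheap on disk: rsnapshot builds on rsync and hard links, so unchanged files are shared between snapshots. The sketch below is not rsnapshot itself, just the same idea done by hand with rsync's --link-dest option. The snapshot function and the directory layout (one directory per snapshot plus a "latest" symlink) are assumptions for illustration:

```shell
#!/bin/sh
# snapshot SRC ROOT [NAME]
# Copy SRC into ROOT/NAME (default NAME: a timestamp). Files unchanged since
# the previous snapshot are hard-linked to it instead of being stored again.
# ROOT must already exist.
snapshot() {
    src=$1
    root=$(cd "$2" && pwd) || return 1   # absolute path for --link-dest
    name=${3:-$(date +%Y-%m-%d_%H%M%S)}
    new="$root/$name"
    latest="$root/latest"

    if [ -d "$latest" ]; then
        rsync -a --link-dest="$latest" "$src"/ "$new"/ || return 1
    else
        rsync -a "$src"/ "$new"/ || return 1   # first run: plain copy
    fi
    # Point "latest" at the snapshot just taken (-n replaces the old symlink)
    ln -sfn "$new" "$latest"
}
```

Each snapshot directory looks like a full copy, so restoring an older version is just a matter of copying a file back out of it. Rotation (keeping 7 dailies, 4 weeklies and so on) is the part rsnapshot handles for you and is left out here.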


I used to backup /var, but I don't bother from laptops these days. I can rebuild
a machine in a couple hours.

But from servers, I definitely backup many specific subfolders in /var/ -- e.g.
/var/lib/bind # if running bind
/var/lib/mailman # if running mailman
/var/spool/cron/crontabs/
/usr/local/


For MySQL or PostgreSQL or LDAP, I use the tools automysqlbackup,
autopostgresqlbackup and autoldapbackup respectively, which create versioned
local dumps on the filesystem, and then add the daily backup directory to
rsnapshot to have remote backups.
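
The shape of what those tools produce (dated local dumps that a later rsnapshot run can pick up) can be sketched by hand. This is not how automysqlbackup and friends are implemented; dated_dump is a made-up helper, and the mysqldump invocation and path in the usage note are only an example:

```shell
#!/bin/sh
# dated_dump DIR NAME KEEP_DAYS COMMAND [ARGS...]
# Run COMMAND, store its output as DIR/NAME-YYYY-MM-DD.sql.gz, then delete
# dumps for NAME that are older than KEEP_DAYS days.
dated_dump() {
    dir=$1
    name=$2
    keep=$3
    shift 3
    mkdir -p "$dir" || return 1
    tmp="$dir/$name-$(date +%Y-%m-%d).sql"
    # Write uncompressed first so a failing dump command is detected
    if ! "$@" > "$tmp"; then
        rm -f "$tmp"
        return 1
    fi
    gzip -f "$tmp" || return 1
    find "$dir" -name "$name-*.sql.gz" -mtime "+$keep" -exec rm -f {} +
}
```

A call might look like `dated_dump /var/backups/mysql all 7 mysqldump --single-transaction --all-databases`, with the dump directory then listed as a backup point in rsnapshot.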


Maybe I should blog about my setup in more detail... let me know if you have any
other questions! I'm just shy of 10 years using this backup infrastructure, and
it's served me alright.

