Computing devices and online services can fail catastrophically and take our data with them. It is crucial to have a robust system to back up and restore our data to protect against such events. This post details what I wanted from the backup system for my personal data and the tools I use to achieve it. This system has served me well over the last 5 years, across fat-fingerings and disk failures.
I wanted my backup system to meet the following criteria.
I do not care about backing up directly from my mobile. I use Syncthing to sync the directories I care about, like Camera, from my mobile to my desktop.
I use Restic and Rclone to manage my backups. Restic handles chunking, encryption, deduplication and versioning. Rclone handles syncing the Restic repository to cloud storage. Rclone supports a large number of cloud storage providers and Restic can access them through Rclone. Between these, I’m almost completely cloud provider agnostic.
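As a rough illustration of Restic reaching cloud storage through Rclone: Restic accepts a repository of the form rclone:<remote>:<path>. The sketch below assumes an Rclone remote named b2 and the bucket name that appears in the script later in this post. The helper name rclone_repo is made up for illustration.

```shell
# Build a Restic repository string that goes through an Rclone remote.
# "rclone_repo" is a hypothetical helper, not part of Restic or Rclone.
rclone_repo() {
    printf 'rclone:%s:%s\n' "$1" "$2"   # $1 = Rclone remote, $2 = bucket or path
}

# Example (needs restic, rclone and a configured "b2" remote):
#   restic --repo "$(rclone_repo b2 cskr-backup-personal)" snapshots
```

Switching providers then comes down to pointing the same command at a different Rclone remote.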
I created a Restic repository on an external Hard Disk Drive (HDD) and wrote a shell script that does the following.
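It copies the password database (personal.kdbx) to the HDD, backs up each user's data directory, forgets expired snapshots, prunes the repository if the --prune flag is provided, and then syncs the repository to B2 and the password database to Google Drive.
#!/bin/bash -e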
# Refuse to run unless invoked as root.
if [ "$(whoami)" != "root" ]; then
    echo "This command must be run as root." 1>&2
    exit 1
fi
# Prompt for the repository password without echoing it.
read -r -s -p 'Enter Password: ' RESTIC_PASSWORD; export RESTIC_PASSWORD; echo
cskr_home=/home/cskr
user2_home=/home/user2
primary=/media/cskr/backup
# Keep a plain copy of the password database next to the repository.
cp -a "$cskr_home/data/personal.kdbx" "$primary"
# Back up each user's data directory, then expire old snapshots of it.
for i in "$cskr_home" "$user2_home"; do
    echo "Backing up $i/data..."
    restic --quiet --repo "$primary/personal" backup --exclude 'lost+found' "$i/data"
    echo "Forgetting expired snapshots of $i/data..."
    restic --quiet --repo "$primary/personal" forget --path "$i/data" \
        --keep-within-daily 7d --keep-within-weekly 1m --keep-within-monthly 6m
done
# Prune only when explicitly requested.
if [ "$1" = '--prune' ]; then
    echo "Pruning the repository..."
    restic --quiet --repo "$primary/personal" prune
fi
# Mirror the repository on the HDD to B2.
echo 'Syncing to B2...'
rclone -P sync "$primary/personal" b2:cskr-backup-personal
# Copy only the password database to Google Drive.
echo 'Syncing Passwords to Google Drive...'
rclone -P copy "$cskr_home/data" --include personal.kdbx gdrive:
While Restic can back up directly to B2, I found it to be extremely slow. It also used more chargeable B2 requests than syncing with Rclone. These were my observations about 5 years ago. It may be better now. I’m sticking to Restic+Rclone as it is nice to not chunk, encrypt, and deduplicate twice.
As it stands, I need to plug the HDD in and execute the script to take a backup. This has not been a problem as I do it anytime I add large amounts of data, say photos from a vacation. I have also managed to remember to execute the script fairly regularly, even when there is no large change in data. This is not ideal and you may want to automate backups. This requires solving two problems.
Firstly, your automation needs access to the password to your Restic repository. If you use a systemd timer, you can use the EnvironmentFile key to set the RESTIC_PASSWORD environment variable. Ensure that access to this file is appropriately restricted. Alternatively, recent versions of systemd (250+) support reading the credentials from an encrypted file, with the key stored in TPM. This makes securing your Restic password easier.
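A minimal sketch of the EnvironmentFile approach follows. It only stages the files in a scratch directory so they can be reviewed before being installed as root. The unit names, the /etc/restic-backup.env path, the /usr/local/bin/backup.sh path, the daily schedule, and the placeholder password are all assumptions for illustration. The scheduled script would also need the interactive password prompt removed.

```shell
# Sketch: stage an environment file and systemd units for a scheduled backup.
# All names and paths below are hypothetical placeholders.
out_dir="${OUT_DIR:-$(mktemp -d)}"   # review the result, then install it as root

# The environment file holds the repository password, so create it
# under a restrictive umask: only its owner can read or write it.
( umask 077; printf 'RESTIC_PASSWORD=%s\n' 'change-me' > "$out_dir/restic-backup.env" )

cat > "$out_dir/restic-backup.service" <<'EOF'
[Unit]
Description=Restic backup (sketch)

[Service]
Type=oneshot
EnvironmentFile=/etc/restic-backup.env
ExecStart=/usr/local/bin/backup.sh
EOF

# A timer activates the service with the same base name by default.
cat > "$out_dir/restic-backup.timer" <<'EOF'
[Unit]
Description=Run the Restic backup daily (sketch)

[Timer]
OnCalendar=daily
Persistent=true

[Install]
WantedBy=timers.target
EOF

echo "Staged files in $out_dir"
```

After copying the environment file to /etc and the units to /etc/systemd/system, something like systemctl enable --now restic-backup.timer would start the schedule.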
Secondly, you need to work around having to plug the HDD in for backup. You can leave it plugged in, but that’s not convenient. As a compromise, the scheduled execution can perform the backup directly to B2, bypassing the HDD. While this compromises the 3-2-1 rule, it will ensure that you have at least one backup even if you forget to do it manually. You can continue to execute the script manually when you remember. Your HDD and B2 will get synchronized when you do that, but the automated snapshots made directly in B2 will be lost. You can also modify the script to sync the HDD from B2, instead of the other way around. That’ll solve the problem of lost snapshots, but will incur a cost for the download.
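The two options above might look roughly like this. It is a sketch that reuses the paths and bucket name from the script, assumes the scheduled job reaches B2 through Restic's rclone backend, and adds a hypothetical run helper that prints commands instead of executing them when DRY_RUN is set.

```shell
primary=/media/cskr/backup            # HDD mount point, as in the script above
remote=b2:cskr-backup-personal        # Rclone remote and bucket, as in the script above

# Print the command instead of running it when DRY_RUN is set (hypothetical helper).
run() {
    if [ -n "$DRY_RUN" ]; then echo "$*"; else "$@"; fi
}

# Scheduled job: back up straight to B2, bypassing the HDD.
backup_to_b2() {                      # $1 = directory to back up
    run restic --quiet --repo "rclone:$remote" backup --exclude 'lost+found' "$1"
}

# Manual run: pull the snapshots made in B2 onto the HDD first.
sync_hdd_from_b2() {
    run rclone -P sync "$remote" "$primary/personal"
}

# Example:
#   DRY_RUN=1 backup_to_b2 /home/cskr/data
#   DRY_RUN=1 sync_hdd_from_b2
```

Because rclone sync makes the destination match the source, the pull from B2 has to happen before any new snapshot is written to the HDD, or that snapshot would be removed.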
Your backup is only as good as your ability to recover from it. With a system like this, you should test recovery both from HDD and the cloud storage provider. When recovering from B2 for testing, I restore a single file using --include, to avoid downloading all the data. I also run restic check occasionally, against both the HDD and B2.
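A single-file restore test could be sketched as below. The function names, the RESTIC_REPO override, and the scratch directory are assumptions for illustration. Since the repository holds snapshots of more than one directory, the sketch passes --path so that "latest" refers to the right one.

```shell
# Repository to test; override with e.g. RESTIC_REPO=rclone:b2:cskr-backup-personal
repo="${RESTIC_REPO:-/media/cskr/backup/personal}"

# Restore one file from the latest snapshot of a backed-up directory.
restore_one() {          # $1 = absolute file path, $2 = scratch dir, $3 = snapshot path
    restic --repo "$repo" restore latest --path "$3" --target "$2" --include "$1"
}

# Restic recreates the original path under the target, so compare at "$2$1".
matches_original() {     # $1 = absolute file path, $2 = scratch dir
    cmp -s "$1" "$2$1"
}

# Example (prompts for the repository password unless RESTIC_PASSWORD is set):
#   scratch=$(mktemp -d)
#   restore_one /home/cskr/data/personal.kdbx "$scratch" /home/cskr/data
#   matches_original /home/cskr/data/personal.kdbx "$scratch" && echo "restore OK"
#   restic --repo "$repo" check
```

The comparison only makes sense for a file that has not changed since the last backup.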
All opinions are my own. Copyright 2005 Chandra Sekar S.