Rclone is a handy little console utility that lets you work with cloud storage like you're using rsync. It's got a very long list of providers it supports, and it'll let you set up multiple storages of the same or different types. It can layer encrypted storage on top of any other storage you have set up. And you can even mount each storage as a virtual drive on Linux, macOS, or Windows. For most cases, it's a pretty good option for backing up whatever data you have, wherever you'd like to put it.
In my case, I'll be using rclone to back up everything to B2, since I'm also using Backblaze's service to keep my desktops backed up.
If you look in the documentation, there are a couple of different commands that might fit what we're wanting to do: rclone copy and rclone sync. rclone copy will copy all files from the source to the destination. rclone sync does the same, but will also delete files on the destination that have been deleted from the source. It's probably a good idea to be a little cautious about deleting files on our backup storage, so I'm going to set things up using copy, and then handle deletion a bit more strategically later on. You'll want to sit down and take a look at your needs and your comfort level before making a decision for yourself, but copy is probably the safer of the two.
For my specific situation, deleting files on the destination is alright, since the destination is set up to preserve previous versions of files for an amount of time. That might not be the case for you, so make sure you actually want sync before choosing it over copy.
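To make the difference concrete, here's a rough sketch of both commands (the remote name and paths here are placeholders):
# copy: adds and updates files on the remote, never deletes anything there
rclone copy /share/documents b2remote:my-bucket/documents
# sync: makes the remote match the source exactly, deleting remote-only files
rclone sync /share/documents b2remote:my-bucket/documents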
Originally, this section was going to talk about a way to use the check and delete commands together to get a list of files that have been deleted and are older than 30 days, but when I was testing that method out, it was taking a very long time because I'm using an encrypted remote. Thankfully, the B2 storage that I'm using has a nice built-in way to handle this: Lifecycle Rules.
By default, files aren't deleted from B2 when using rclone. Instead, the files are "hidden": the file looks like it's deleted, but if you go to pull up the previous versions, those previous versions will still show up. If you want to delete files explicitly, you need to use the --b2-hard-delete option. However, you can go into your bucket's settings on the Backblaze website and tell it how long to keep previous versions around. By default, B2 will store previous versions forever, but in this use case, I think I can allow it to permanently delete any previous versions that have been around for longer than 30 days.
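For reference, if you'd rather apply that rule through B2's API or the b2 command line tool instead of the website, the lifecycle rule itself looks something like this (a sketch; the empty fileNamePrefix applies the rule to the whole bucket, and the null daysFromUploadingToHiding means files are only hidden when you delete them):
[
    {
        "fileNamePrefix": "",
        "daysFromUploadingToHiding": null,
        "daysFromHidingToDeleting": 30
    }
]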
If all of the folders that you want to copy are in the same place, you can use the --include option to specify each folder inside the source directory that you'd like to copy over. If it's more than one or two, you can instead put that list in a text file and point at it with --include-from. With both options, each pattern is rooted at the directory the transfer is happening from, and needs to match everything inside the folder, which is what the /** at the end is for.
For example, if I have a folder called /share and I want to copy the folders documents, music, and videos, then I'd need to use either rclone copy /share/ [DESTINATION] --include "/documents/**" --include "/music/**" --include "/videos/**" or, I could just use rclone copy /share/ [DESTINATION] --include-from folders.txt and have a folders.txt file with the following lines:
/documents/**
/music/**
/videos/**
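Before kicking off a real transfer, it's worth double-checking that the filters match what you expect. Adding --dry-run (the destination here is a placeholder) will report what would be copied without actually transferring anything:
rclone copy /share/ b2remote:my-bucket --include-from folders.txt --dry-run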
My destination is going to be Backblaze B2, and there are some limits on transactions, so I'd like to keep them to an absolute minimum. There's an option, --fast-list, that will use more memory but won't have to make as many listing calls. Since I've got a ton of RAM in this box now, this is a trade-off I'm more than happy to make.
Because I'm going to be using those modification times to check for when to delete files from the backups, it's important that they get updated any time that we can. The --refresh-times option forces the copy to always update the modification times on the destination.
With all of that, here's the command that we'll need to run:
rclone copy [SOURCE] [DESTINATION] --include-from folders.txt --fast-list --refresh-times
I'm used to cron. I've used cron before. Am I going to use cron?
No, because systemd runs the rest of my server, and it can even do cron's job for you if you install the systemd-cron package. It also lets me control everything just like it's a regular service, and honestly that's kind of nice: all of my management tasks end up unified under a single syntax. I like that.
One thing to keep in mind, though, is that systemd is usually a system service, and tends to require root or sudo access in order to work with. While that might be fine, I want to use the rclone config that I already have in my user folder, and it'll be a lot easier to just have systemd run things as my user. Normally, your user instance only runs while you're actively logged into the machine, but you can have systemd keep it running all the time, which is probably what I want to do anyway. This feature is called "linger", and it can be enabled with the command loginctl enable-linger [USERNAME].
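As a quick sanity check, loginctl can confirm that lingering actually took effect for your user:
# should print "Linger=yes" once lingering is enabled
loginctl show-user [USERNAME] --property=Linger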
In order to use a user service, we'll need two files: a service file and a timer file. Both of these files live in ~/.config/systemd/user/ by default, and we'll name them backup.service and backup.timer.
We'll start with the backup.service file:
[Unit]
Description=Backup
After=network.target
[Service]
Type=oneshot
WorkingDirectory=[SOURCE]
ExecStart=/usr/bin/rclone copy [SOURCE] [DESTINATION] --include-from folders.txt --fast-list --refresh-times
[Install]
WantedBy=default.target
This file defines a service, called "Backup", that will start after networking is up. It's a oneshot service, so it's a command that just runs once and exits; it won't stay running in the background. It uses the source folder as its working directory, and then runs the rclone command we put together: copying the source to the destination, including the list of folders I'd like, using the more memory-intensive file listing, and updating the times on any files that are uploaded.
Now that we're done editing the file, we can go ahead and have systemd reload its config files with systemctl --user daemon-reload.
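If you want to make sure systemd actually picked up the new unit, you can ask it to print the file back out:
systemctl --user cat backup.service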
Now that we have a service, we can try it out. This is the big benefit of separating this into a service and a timer. You can go ahead and test the service with systemctl --user start backup.service. The first time you run this command, it'll probably take a long time to do the initial backup. You can check the output with journalctl -xe --user-unit backup.service. If we run into any errors, we can try to resolve those now; if we don't, we're ready to move on to the timer.
Now, we can go ahead and set up a timer file that will run our service automatically on the schedule we specify. We'll make a file called backup.timer in the ~/.config/systemd/user/ folder:
[Unit]
Description=Run backup every day
[Timer]
OnCalendar=daily
Persistent=true
[Install]
WantedBy=timers.target
The important part of this file is the [Timer] section, which specifies when the service should be run. We actually have a couple of options for how we'd like to run the timer: relative to when the system started (these are called "monotonic" timers), or relative to the actual time (these use "calendar events"). The calendar event is more what I'm used to, and what I'd prefer. The word daily is shorthand for "every day at midnight", which is perfectly fine for my situation. You can find more info about scheduling in the documentation for systemd.timer and systemd.time.
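If midnight doesn't work for you, OnCalendar accepts full calendar expressions as well; a couple of arbitrary examples:
# every day at 3:30 AM
OnCalendar=*-*-* 03:30:00
# every Sunday at 2:00 AM
OnCalendar=Sun *-*-* 02:00:00
You can check an expression with systemd-analyze calendar "Sun *-*-* 02:00:00", which prints the normalized form and the next time it would trigger.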
Another small detail here is that I'm using the Persistent=true option, so that if the computer is off when the backup would normally run, it'll run it at the next startup. You may or may not want this option, depending on what you're using the computer for. Since this is a file server in my house, I'm alright with it running the backup immediately if the machine was down when it normally would have.
After we're done editing this file, we'll need to reload systemd's config files again with systemctl --user daemon-reload.
Finally, we're at the point where we can turn on our timer. All we need to do is run the command systemctl --user enable backup.timer. This will enable the timer and have it run automatically on the schedule we set. If we want to go ahead and run it right now, just to check everything out, we can add the --now option. We can check on the status of our timer with the command systemctl --user status backup.timer.
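We can also get an overview of every active timer, including when ours will fire next:
systemctl --user list-timers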