I feel better, now
Oops! The whole problem started when I wanted to install a new program on the “entertainment computer” that’s connected to the TV. The existing version of Ubuntu was no longer supported, so I started what I thought would be a simple upgrade. Oops, it deleted the entire /var directory, including the contents of the second drive mounted there. That drive contained all of our photos and music. Not a problem, I thought. It’s all backed up nightly onto a USB drive. It’s just a bit of bother to copy all of that over the USB interface.
Oops! There are no photos beyond last March. I felt sick to my stomach. That’s a lot of family memories to evaporated. Whew! The program we’ve been using to download the cameras also makes a backup copy local to the machine used to download. I’ve got all the pictures back on the photo server, and now just have to sort them into directories again.
But first, I need to fix the backup script. Why didn’t it warn me when it started to fail?
I copied a version on a different fileserver to that machine and edited it for the different source and destinations. It’s a simple shell script that mounts the USB drive, runs
rsync to duplicate the data, and then unmounts the USB drive. I’d run into a problem once before where the drive wasn’t getting mounted properly, so I’d added some script to detect that and email me using
msmtp if it failed. That only worked for mount errors, though, not for
As I hacked in detection and notification of
rsync errors, I felt ill at ease. I was duplicating some of the script to send emails. The script was getting harder to read. The script was increasingly inconvenient to test. And I was nervous about introducing new errors.
I started practicing Test Driven Development a decade ago. It has become second-nature to me to test-drive the code I write. Why was this code just hacked together? Mostly because I hadn’t been thinking of it as code. It was just a script. At first, it just replicated the commands I typed on the command-line. I put those into a script to make it easy to run, and then to automate running it via the
cron daemon. When I replicated it to another machine, I extracted the paths to variables to make it easier to change the differences. When it had failed to mount a drive, I’d added a check for that and a line to send me an email. That line surely crossed the border between “a file of commands” to “a program,” but I didn’t notice it at the time. Or, if I did notice it, I didn’t pay attention to it. After all, I don’t generally program in bash scripts and haven’t developed any rigor for doing so. It’s those “little” changes that “can’t possibly” hurt that bite you the hardest.
The script now “seems to be working,” but I’m not satisfied. Wouldn’t it be nice to get advance notice when the disk is getting close to full, rather than waiting until it is?
I now know I’m programming in
sh, not just collecting a bunch of command-lines from my
.history file. Adding the email notification of
rsync errors was harder than I expected due to tiny errors made in an unfamiliar programming environment. In addition to being hard, it just felt “dirty.”
A little googling lead me to
shunit2. After a little difficulty creating a tiny virtual disk under control of the tests, I quickly had a function that extracted the % full output using the
du command and returned it as a number. Great! Now I can use that number to provide an early warning. And working test-driven is teaching me better shell scripting habits.
I’ve still got work to do, pulling the existing script into tested functions, but already I feel better. I’m on my way to a reliable and maintainable script.