I manage over a dozen relatively small Web sites, including a few blogs and wikis, as well as content-managed Web sites hosted in both PHP/MySQL and Zope/MySQL environments. In general, I work in the LAMP architecture and use several different hosts, each one configured differently (especially when it comes to how MySQL is installed). At two of the hosts, MySQL is installed on the same server as the Web site, while another has it installed on a separate server. In neither configuration can I connect to MySQL remotely; connections to the MySQL server have to be made from the machine the Web server is on.
Given this, I wanted a way to make automated backups of these sites that was easy to implement and maintain, easy to understand, and consistent across my hosts and server application architectures. I have explored a number of alternatives, including some rather large server-based approaches, but they all seemed deficient in some way.
So, I ultimately decided to glue together mysqldump and wget with some bash scripts and cron as a lightweight way to make backups that could easily be adapted for my different hosting configurations and different software architectures, now and in the future, while still being more or less consistent. As a bonus, this approach is implemented almost entirely on the machine I use for development, so there is very little that I need to install and configure on the host machine for each site. This post explores this approach; if you have a similar need, you may find it helpful.
I will begin with how I use the familiar mysqldump utility to make backups of remote MySQL databases that I don't have the ability to log into from my development machine. Save the following script on your Web server as mysqldump.php, testing it step by step as indicated; that is the quickest way to get it working:
<?php
// STEP 1: through the host's control panel app, find the absolute path to the
// document root. Or, if this information can't be found this way, add a
// phpinfo.php file that just calls phpinfo() to report the information.

// STEP 2: make sure you have access to the mysqldump command
// echo shell_exec('which mysqldump');

// STEP 3: make sure you are going to get the output you expect
// echo shell_exec('mysqldump -ctn -hlocalhost -uusername -ppassword database');

// STEP 4: execute and save the file
echo shell_exec('mysqldump -ctn -hlocalhost -uusername -ppassword database > /web/server/document/root/backup/database_backup.sql');
?>
This script, which is ultimately implemented as one command, is the only part of this backup approach that lives on the server.
Let's do a quick walk through the code. To implement this, you will uncomment and run the code in each step individually while leaving the other lines commented. This lets you query and test your environment to ensure that the script will work as expected. In step 1, we find where we are on the file system, so we know where to store the generated SQL later on. Step 2 ensures that we have access to the mysqldump utility on the server, and step 3 ensures that it gives us the SQL we want, essentially testing our login. The final command, in step 4, is what we actually want to run: it writes the SQL to a place we can safely retrieve it from.
It is not typically necessary to protect this script, although it wouldn't hurt either, especially if your database is large (repeatedly triggering a dump is an easy way to load down your server). A more important consideration is to save the generated SQL to a location on the filesystem outside the Web root, but still accessible by FTP; if your host doesn't allow that, store it in a protected directory below the Web root.
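For example, a minimal variation of the step 4 command might point the dump at a private directory that sits above the document root but is still reachable by FTP; the path below is just a placeholder for whatever your host gives you:

# sketch: the same dump, written outside the Web root (path is hypothetical)
mysqldump -ctn -hlocalhost -uusername -ppassword database > /web/server/private_backup/database_backup.sql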
Here is a brief explanation of the mysqldump flags I used:
- -c (generates complete insert statements)
- -t (does not output create table info)
- -n (does not output create database statement)
For a complete explanation of mysqldump's options, look here (5.1 version). I chose these options anticipating the scenario in which I would need to reinstall an entire application from scratch and import this backed-up data as the final step. If you are more concerned about a corrupted database, you might want to output CREATE DATABASE and CREATE TABLE statements as well. However, you can usually find the code to create the initial database schema in a project's distribution, and you can keep a backup of that code after your initial install is complete for the purpose of recreating the database structure, so I still prefer this leaner approach.
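If you would rather have a dump you can restore from scratch, a sketch of the fuller invocation might drop -t and -n and add --databases so the CREATE DATABASE and CREATE TABLE statements come along too (same placeholder credentials and path as above):

# sketch: a self-contained dump including CREATE DATABASE and CREATE TABLE statements
mysqldump -c --databases -hlocalhost -uusername -ppassword database > /web/server/document/root/backup/database_full_backup.sql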
Then you can use a script like the following, which uses wget (over HTTP and FTP) to grab this file and any other crucial files that should be backed up. For WordPress, for example, a typical script might look like this:
# tickle the mysqldump.php backup script
wget -q --delete-after http://www.domain.com/backup/mysqldump.php

# retrieve the mysqldump output
wget -m -a /path/to/local/backup/directory/backup_log.txt -P /path/to/local/backup/directory/www.domain.com http://user:[email protected]/backup/database_backup.sql

# get installed themes and plugins as well as user uploads
wget -m -a /path/to/local/backup/directory/backup_log.txt -P /path/to/local/backup/directory/www.domain.com/wp-content ftp://user:[email protected]/httpdocs/blog/wp-content
The first command tickles the mysqldump.php script on the server, just as if you had accessed it from a browser, without keeping anything it downloads. This way, we don't have to rely on synchronizing with a cron job on the server, especially since many PHP/MySQL hosting environments have sketchy cron support, in my experience. It also ensures that the second command has a current backup file to download. Finally, the third command makes an FTP connection to download the entire wp-content directory, grabbing all the files that implement the plugins and themes, along with the user upload directory.
The wget utility is an amazingly powerful tool and is incredibly useful in many situations. The Linux in a Nutshell entry for wget is probably the best reference I have seen for its various options, and will give you plenty of ideas for how to use this valuable tool.
I created a small script like this for each installation I wanted to back up. I kept the scripts in the same place and backed up each installation to its own subdirectory off of a common directory. If you have many installations of the same application, say a number of WordPress blogs, you may benefit from a separate template script that you either copy and customize for each new site, or generalize so that you only need to pass it a few pieces of information to back up a new WordPress installation. You will probably want to store these scripts together and then add each one to your crontab file, indicating when to do the backups. I prefer to stagger my backups, so I have a separate entry for each individual script, but you can also group these scripts under one or more wrapper scripts that cron calls to execute them sequentially at one time. Again, configure this in a way that makes sense for you.
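As a rough sketch of what such a generalized WordPress template might look like, here is one possibility; wordpress_backup.sh and its three arguments are hypothetical names, and the remote paths simply mirror the script above:

#!/bin/bash
# sketch: wordpress_backup.sh <domain> <ftp user> <ftp password>
domain=$1
user=$2
password=$3
backup_root=/path/to/local/backup/directory

# tickle the mysqldump.php backup script
wget -q --delete-after http://$domain/backup/mysqldump.php

# retrieve the mysqldump output
wget -m -a $backup_root/backup_log.txt -P $backup_root/$domain http://$user:$password@$domain/backup/database_backup.sql

# get installed themes and plugins as well as user uploads
wget -m -a $backup_root/backup_log.txt -P $backup_root/$domain/wp-content ftp://$user:$password@$domain/httpdocs/blog/wp-content

You would call it as, for example, ./wordpress_backup.sh www.domain.com user password, keeping in mind that passing passwords on the command line exposes them to other users of the machine, so on a shared development box you may prefer per-site copies with the credentials baked in.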
For my Zope environments, I wrote the following, somewhat generalized, script and saved it as zope_backup.sh:
#!/bin/bash
# usage:
# zope_backup.sh <subdomain to backup>

if [ "$1" = "" ]
then
  echo "you must specify a zope site to backup"
  exit
fi

name=$1
username=username
password=password
host=www.domain.com
path=path/in/zope/to/subdomains/folder/$name
script=manage_exportObject?download:int=1
root=/path/to/local/backup/directory/
directory=$root/$name
output=backup_log.txt

wget -N -c -nH -S -a $root/$output -P $directory --progress=dot:mega --waitretry=120 "http://$username:$password@$host/$path/$script"

cd $directory

if [ -a "$script" ]
then
  mv "$script" $name.zexp
  if [ -a $name.tar.gz ]
  then
    if [ -a previous_$name.tar.gz ]
    then
      rm previous_$name.tar.gz
    fi
    mv $name.tar.gz previous_$name.tar.gz
  fi
  tar -cvzf $name.tar.gz $name.zexp
  rm $name.zexp
else
  echo "doesn't exist"
fi
For example, if a site called www.mysite.com is mapped to a Zope folder called ‘mysite’, you would call the script like this:
./zope_backup.sh mysite
When backing up from Zope, you are exporting data from its default object database using Zope’s built-in manage_exportObject script. For added safety, when I make a backup, I also temporarily archive the most recent backup, so if something goes wrong, I have a chance of at least having a slightly older version of the data. This is probably just extra paranoia on my part because we aren’t talking about normal file transfers over FTP, but then again, if we weren’t paranoid, we’d be relying on our host’s backups, wouldn’t we?
The crontab file, with all of these scripts added, might look like this:
# the following has been shamelessly stolen for convenience from:
# http://www.adminschoice.com/docs/crontab.htm
# *  *  *  *  *  command to be executed
# -  -  -  -  -
# |  |  |  |  |
# |  |  |  |  +----- day of week (0 - 6) (Sunday=0)
# |  |  |  +------- month (1 - 12)
# |  |  +--------- day of month (1 - 31)
# |  +----------- hour (0 - 23)
# +------------- min (0 - 59)

01 05 * * * /home/ajs17/backup.sh

# monthly zope backups
01 01 01 * * /path/to/local/backup/directory/zope_backup.sh mysite
15 01 01 * * /path/to/local/backup/directory/zope_backup.sh yoursite

# weekly zope backups
01 04 * * 3 /path/to/local/backup/directory/zope_backup.sh thissite
30 04 * * 3 /path/to/local/backup/directory/zope_backup.sh thatsite

# my blog
01 03 * * 2 /path/to/local/backup/directory/myblog_backup.sh
So that is essentially it. We are using existing tools we are already familiar with to do exactly what they were written for: mysqldump to output SQL, wget to retrieve it across the network, and a small shell script to glue them together. We use cron to configure when they run, and then we forget about it… well, almost.
It's a good idea to test backups on the command line before relying on cron, to make sure you end up getting what you expect, even if you are using a generalized script. You also want to check back on your cron jobs when you initially configure them, or whenever you modify the time they run, to be sure you edited cron correctly. It's also a good idea to periodically test backups to make sure you are still getting what you expect, in case your host makes a minor change to the server environment that breaks things, or you make an equally minor yet catastrophic local change. For something like the WordPress backup, a visual inspection may be enough: you can open an SQL file and tell whether it's valid, and you can similarly inspect other backed-up files and their timestamps to see if anything looks amiss, or compare them to a view of the corresponding remote folder in an FTP client.
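If you want something a little more automatic than eyeballing, a small sketch of a sanity check might look like the following; the path and the thresholds are arbitrary placeholders:

#!/bin/bash
# sketch: warn if a downloaded dump looks stale or suspiciously small
dump=/path/to/local/backup/directory/www.domain.com/backup/database_backup.sql

# complain if the dump is more than two days old
find "$dump" -mtime +2 -exec echo "WARNING: {} is more than two days old" \;

# complain if the dump is smaller than 10KB, which usually means something failed
if [ "$(stat -c%s "$dump")" -lt 10240 ]
then
  echo "WARNING: $dump looks too small to be a real backup"
fi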
For Zope backups, however, you may want to import the *.zexp files you download into a local Zope installation to be sure that each one imports successfully and looks right.
Obviously, this approach is light on the recovery side of the backup/recovery equation, and if you can’t recover from your backups, those backups are worthless. But by using familiar methods, like mysqldump and FTP, you can easily reverse the automated backup process manually to quickly recover your site. The key is to make sure you are backing up everything you need to get your site back. We all think of our MySQL data, but what about your plugins, themes, etc.?
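For example, a minimal sketch of the database half of a WordPress recovery, assuming the same placeholder credentials as earlier, is just the dump run in reverse; the wp-content files can simply be re-uploaded with your FTP client:

# sketch: after putting database_backup.sql back on the server (by FTP, say),
# reload it from a shell on the server or through your host's database admin tool
mysql -hlocalhost -uusername -ppassword database < database_backup.sql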
But then again, you may not have FTP access to your site, either because your host doesn’t provide it (rare) or because you turned it off for security reasons. In this case, I think this approach can be adapted to use scp instead of FTP. You may also want to explore using rsync. That, as they say, is an exercise for the reader.
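If you want a head start on that exercise, here is a sketch of what the wp-content step might look like over SSH; it assumes your host allows shell access, and the paths are the same placeholders used above:

# sketch: mirror wp-content over SSH with rsync instead of wget over FTP
rsync -avz -e ssh [email protected]:/httpdocs/blog/wp-content/ /path/to/local/backup/directory/www.domain.com/wp-content/

# or grab a one-shot copy with scp
scp -r [email protected]:/httpdocs/blog/wp-content /path/to/local/backup/directory/www.domain.com/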
And if you have a large MySQL database, mysqldump may not be a desirable way to back up your data either. Again, refer to the MySQL documentation to find a way to adapt this approach to fit your needs. For most needs, though, I think mysqldump is adequate.
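One common adaptation, assuming an InnoDB-backed database, is to ask mysqldump for a consistent snapshot without locking tables and to compress the output as it is written; the flags below are standard mysqldump and gzip options, but the path is still a placeholder:

# sketch: consistent, compressed dump for a larger InnoDB database
mysqldump -c --single-transaction -hlocalhost -uusername -ppassword database | gzip > /web/server/document/root/backup/database_backup.sql.gz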
Finally, just to show how committed I am to this lightweight shell scripting approach to backups, I present one last code block, which I use to back up files on my development machine every night. The files I back up locally include the usual suspects: photos from my digital camera, programming code, office documents, and of course my remote server backups. I compress all the files and back them up to a high-capacity USB drive. But, as with my Zope backups, before overwriting an existing backup I copy it to a separate location on the hard drive as an extra safeguard. Then, usually on a weekly basis, I make a slightly more permanent backup of important files to CD, or, if you have the hardware and need the extra space, to DVD:
#!/bin/bash
# -----------------------------------------------
# (partial) list of backup files
# -----------------------------------------------
files[1]=server_backups
files[2]=digital_camera
files[3]=programming_code
files[5]=.bash_profile
# -----------------------------------------------
from=/path/to/my/documents/
to=/media/usbdisk/
previous_backup=/path/to/my/previous/backups/
report_file=/path/to/my/documents/backup_report.txt

for i in "${files[@]}"
do
  mv -fv ${to}${i}.tar.gz ${previous_backup}
  tar -czvf ${to}${i}.tar.gz ${from}${i}
done

df -B MB ${to} >> ${report_file}
Some things you may want to back up that might not occur to you at first:
- your .bash_profile, .cshrc or whatever applies to you,
- your crontab configuration (see the sketch after this list), and
- your backup scripts, including the very one executing…
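The crontab entry is easy to overlook because it does not live in a regular file you edit directly; a minimal sketch of capturing it into a directory that is already being backed up (the path is a placeholder) is simply:

# sketch: dump the current crontab alongside the other local files that get archived
crontab -l > /path/to/my/documents/crontab_backup.txt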
I am interested in feedback on this approach, including suggested improvements as well as alternatives.