The big problem (which I will address in the next post in this series) is the cost of long-term storage. If you want to maintain backups for seven years, even Amazon Glacier will be absurdly expensive.
On Windows systems there is VSS, so a filesystem backup taken with a VSS snapshot option will represent a reasonably consistent moment in time.
However, most servers run Linux. Many of them are running a database of some sort (PostgreSQL or MySQL). It is possible to arrange a pre-exec to dump the database to disk, but this gets increasingly impractical as the database gets larger.
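For a small database that can look something like the following (a sketch only; the database name, the dump path and the choice of PostgreSQL are all placeholders to adapt to your environment):

#!/bin/sh
# Pre-exec sketch: dump the database to a file that the filesystem
# backup then picks up. "mydb" and the dump path are placeholders.
# Exiting non-zero tells the backup job that the pre-exec failed.
pg_dump mydb | gzip > /var/backups/mydb.sql.gz || exit 1

For a multi-gigabyte database, the time and disk space this consumes is exactly the impracticality mentioned above.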
Last year I wrote a blog article about one method (LVM snapshots) of getting consistent Linux backups (http://blog.ifost.org.au/2014/07/moment-in-time-snapshot-backups-of.html). But on Amazon it is possible to use the snapshot capability they provide.
The scripts on this page assume that your instance has one EBS volume that it boots from, and no other attached storage.
Create a backup job that will back up /mnt (even though the server doesn't have a filesystem mounted there). If necessary, just edit the data list:

FILESYSTEM "/mnt" cloud-server1.data-protector.net:"/" { }

Then put the following two scripts (snapshot-preexec.sh and snapshot-postexec.sh) into /opt/omni/lbin on the cloud-hosted server, and modify the backup job to use these as the pre- and post-exec jobs for the backup.
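In the raw datalist the object options end up looking something like this (a sketch; check the exact option names that your Data Protector version writes when you set the pre- and post-exec in the GUI):

FILESYSTEM "/mnt" cloud-server1.data-protector.net:"/"
{
  -pre_exec "/opt/omni/lbin/snapshot-preexec.sh"
  -post_exec "/opt/omni/lbin/snapshot-postexec.sh"
}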
If the Amazon EC2 tools are not already installed and configured in the environment, you will need to install them. Most of the Amazon-supplied AMIs have them in place already, but the Red Hat-supplied ones don't.
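Installing them by hand goes roughly like this (the tools need Java; the version-numbered directory and the region endpoint are assumptions you will need to adjust):

# Fetch and unpack the legacy EC2 API tools
yum -y install java-1.8.0-openjdk unzip wget
wget http://s3.amazonaws.com/ec2-downloads/ec2-api-tools.zip
unzip ec2-api-tools.zip -d /opt

# The tools pick up their configuration from the environment
export JAVA_HOME=/usr/lib/jvm/jre
export EC2_HOME=/opt/ec2-api-tools-1.7.5.1    # adjust to the unpacked version
export PATH=$PATH:$EC2_HOME/bin
export AWS_ACCESS_KEY=AKIA...                 # your IAM access key
export AWS_SECRET_KEY=...                     # your IAM secret key
export EC2_URL=https://ec2.us-west-2.amazonaws.com   # match your region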
snapshot-preexec.sh
The first couple of commands will only work on the Amazon cloud. The EC2 instance queries a special Amazon address (the metadata service) to find out its own details: its instance ID (e.g. i-121255) and its availability zone (e.g. us-west-2c).

#!/bin/sh
# First, get our instance ID from Amazon
INSTANCE=$(wget -q -O - \
  http://169.254.169.254/latest/dynamic/instance-identity/document \
  | grep instanceId | cut -d'"' -f4)

# Next, find out which zone we are in, for the volume creation later
ZONE=$(wget -q -O - \
  http://169.254.169.254/latest/dynamic/instance-identity/document \
  | grep availabilityZone | cut -d'"' -f4)

# This script only works for single volumes at the moment
VOLUME=$(ec2-describe-instances $INSTANCE \
  | grep BLOCKDEVICE | awk '{print $3}' | head -1)

# Flush everything we can out to disk before we take the snapshot
sync
sync
fsfreeze -f /

# Create a snapshot of our root volume
SNAPSHOT=$(ec2-create-snapshot $VOLUME | awk '{print $2}')
until ec2-describe-snapshots $SNAPSHOT | grep -q completed
do
  sleep 1
done
fsfreeze -u /

# Turn that snapshot into a volume
NEWVOL=$(ec2-create-volume --snapshot $SNAPSHOT -z $ZONE | awk '{print $2}')
until ec2-describe-volumes $NEWVOL | grep -q available
do
  sleep 5
done

# Connect that volume
ec2-attach-volume $NEWVOL -i $INSTANCE -d sdf
until ec2-describe-volumes $NEWVOL | grep -q attached
do
  sleep 5
done

# Mount it
mount /dev/xvdf /mnt

# Now we can back up. Remember what we had, though
echo $SNAPSHOT > .snapshot-to-remove
echo $NEWVOL > .volume-to-remove
Then we flush as much as we can out to disk (with the sync commands) and freeze I/O on the root filesystem. Anything that tries to write to disk will block until after the snapshot is completed; read operations will still work. The loop that polls for the snapshot to complete is busier than the later ones (one-second sleeps rather than five) so that the filesystem is thawed as soon as possible.
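One caveat: if the ec2-create-snapshot call fails while the filesystem is frozen, the script will exit with / still frozen and the machine will effectively hang. A defensive variant (just a sketch; it isn't in the script above) registers the thaw before freezing:

# Guarantee a thaw even if a later command fails. Running fsfreeze -u
# on an already-thawed filesystem only prints a harmless error, so the
# extra unfreeze at exit is safe.
trap 'fsfreeze -u / 2>/dev/null' EXIT
sync
sync
fsfreeze -f /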
After that, we turn the snapshot into a volume, and attach that volume to a device name (sdf) that will probably be free; the kernel exposes it as /dev/xvdf. There will be an error in dmesg about the lack of a partition table on /dev/xvdf, but it doesn't seem to matter.
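If the device node is slow to appear after the attachment completes, the mount can race against the kernel; a short wait loop (an addition of mine, not part of the script above) avoids that:

# Wait for the kernel to create the block device before mounting.
# /dev/xvdf corresponds to the "sdf" attachment point on Xen instances.
until [ -b /dev/xvdf ]
do
  sleep 1
done
mount /dev/xvdf /mnt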
Finally we mount the new volume on /mnt (ready to be backed up) and record the snapshot and volume IDs we just created.
snapshot-postexec.sh
After the backup, the post-exec removes the mount, the volume and the snapshot.

#!/bin/sh
SNAPSHOT=$(cat .snapshot-to-remove)
NEWVOL=$(cat .volume-to-remove)

umount /mnt

# Detach the volume and wait until it is gone
ec2-detach-volume $NEWVOL
while ec2-describe-volumes $NEWVOL | grep -q ATTACHMENT
do
  sleep 5
done
ec2-delete-volume $NEWVOL
ec2-delete-snapshot $SNAPSHOT
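Both scripts read and write the state files with relative paths, so they rely on the pre- and post-exec starting in the same working directory. If that assumption worries you, a fixed location (the /var/run path here is just an example) is safer:

# Keep the state files in a fixed directory rather than whatever
# directory Data Protector starts the scripts in.
STATEDIR=/var/run/dp-snapshot
mkdir -p $STATEDIR
echo $SNAPSHOT > $STATEDIR/snapshot-to-remove
echo $NEWVOL > $STATEDIR/volume-to-remove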
I hope you find this helpful.
Greg Baker is one of the world's leading experts on HP Data Protector. His consulting services are at http://www.ifost.org.au/dataprotector . He has written numerous books (see http://www.ifost.org.au/press ) on it, and on other topics. His other interests are startup management, applications of automated image and text analysis and niche software development.