
Wednesday, 4 February 2015

Moment-in-time (snapshot) backup of Amazon AWS / EC2 instances

This post is the second in my series on backing up servers in the cloud. (The previous post was here: http://blog.ifost.org.au/2015/01/using-data-protector-to-back-up-your.html.) Obviously, there are some servers in the cloud that you simply won't need to back up, but the ones you do need to back up are much harder.

The big problem (which I will address in the next post in this series) is the cost of long-term storage. If you want to maintain backups for seven years, even Amazon Glacier will be absurdly expensive.

The other problem is the challenge of getting a consistent moment-in-time backup. Only the very minor cloud infrastructure service providers are offering VMware or Hyper-V as their main offering, so virtual disk snapshot backups aren't an option.

On Windows systems there is VSS, so a file system backup taken with a VSS snapshot option will capture a reasonably consistent moment in time.

However, most servers run Linux. Many of them are running a database of some sort (PostgreSQL or MySQL). It is possible to arrange a pre-exec to dump the database to disk, but this gets increasingly impractical as the database gets larger.

Last year I wrote a blog article about one method (LVM) to get consistent Linux backups (http://blog.ifost.org.au/2014/07/moment-in-time-snapshot-backups-of.html). But with Amazon it is possible to use the snapshot capability they provide.

The scripts on this page assume that your instance has one EBS volume that it boots from, and no other attached storage.

Create a backup job that will back up /mnt (even though the server doesn't have a mounted filesystem there). If necessary, just edit the data list:

FILESYSTEM "/mnt" cloud-server1.data-protector.net:"/"
{
}
Then put the following two scripts (snapshot-preexec.sh and snapshot-postexec.sh) into /opt/omni/lbin on the cloud hosted server, and modify the backup job to use these as the pre- and post-exec jobs for the backup.

If the Amazon EC2 tools are not already installed and configured in the environment, you will need to install them. Most of the Amazon-supplied AMIs have these already in place, but the Red Hat-supplied ones don't.
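A quick way to find out whether the tools are present is to check for each command before the backup runs. The following is only a sketch: the function name missing_tools is made up for this example, and the list of commands in the usage comment is simply the set that the two scripts on this page call.

```shell
#!/bin/sh

# missing_tools COMMAND...
# Prints the name of every command that cannot be found on the PATH
# (one per line) and returns non-zero if any were missing.
missing_tools() {
    rc=0
    for tool in "$@"
    do
        if ! command -v "$tool" >/dev/null 2>&1
        then
            echo "$tool"
            rc=1
        fi
    done
    return $rc
}

# Possible use at the top of the pre-exec script:
#   missing_tools wget fsfreeze ec2-describe-instances \
#       ec2-create-snapshot ec2-create-volume ec2-attach-volume || exit 1
```

This only confirms that the commands exist. It does not check that the EC2 tools have working credentials.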

snapshot-preexec.sh

#!/bin/sh

# First, get our instance ID from Amazon
INSTANCE=$(wget -q -O - \
   http://169.254.169.254/latest/dynamic/instance-identity/document \
    | grep instanceId | cut -d'"' -f4)

# Next, find out which zone we are in, for the volume creation later
ZONE=$(wget -q -O - \
   http://169.254.169.254/latest/dynamic/instance-identity/document \
    | grep availabilityZone | cut -d'"' -f4)

# This script only works for single volumes at the moment
VOLUME=$(ec2-describe-instances $INSTANCE \
   | grep BLOCKDEVICE | awk '{print $3}' | head -1)

# Flush everything we can out to disk before we take the snapshot
sync
sync
fsfreeze -f /

# Create a snapshot of our root volume
SNAPSHOT=$(ec2-create-snapshot $VOLUME | awk '{print $2}')
until ec2-describe-snapshots $SNAPSHOT | grep -q completed
do
    sleep 1
done

fsfreeze -u /

# Turn that snapshot into a volume
NEWVOL=$(ec2-create-volume --snapshot $SNAPSHOT -z $ZONE| awk '{print $2}')
until ec2-describe-volumes $NEWVOL | grep -q available
do
    sleep 5
done

# Connect that volume
ec2-attach-volume $NEWVOL -i $INSTANCE -d sdf
until ec2-describe-volumes $NEWVOL | grep -q attached
do
    sleep 5
done

# Mount it
mount /dev/xvdf /mnt

# Now we can back up. Remember what we had though
echo $SNAPSHOT > .snapshot-to-remove
echo $NEWVOL > .volume-to-remove


The first couple of lines will only work on the Amazon cloud. The EC2 instance queries a special Amazon address to find out its own details -- its instance id (e.g. i-121255) and its zone (e.g. us-west-2c).
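To see what the grep and cut pipeline is doing, it can be tried against a saved copy of the document. The sample below is hand-written for illustration (it is not real output, and the real document has more fields), and the helper name identity_field is made up for this sketch. It assumes, as the script does, that each key sits on its own line.

```shell
#!/bin/sh

# identity_field NAME
# Reads an instance identity document on stdin and prints the value of
# the string field NAME. Relies on each key being on its own line.
identity_field() {
    grep "\"$1\"" | cut -d'"' -f4
}

# Illustrative sample only: shaped like the real document, values invented.
SAMPLE='{
  "instanceId" : "i-121255",
  "availabilityZone" : "us-west-2c",
  "region" : "us-west-2"
}'

# On a real instance the document could be fetched once and reused:
#   DOC=$(wget -q -O - \
#       http://169.254.169.254/latest/dynamic/instance-identity/document)
#   INSTANCE=$(printf '%s\n' "$DOC" | identity_field instanceId)
#   ZONE=$(printf '%s\n' "$DOC" | identity_field availabilityZone)
```

Running printf '%s\n' "$SAMPLE" | identity_field availabilityZone against the sample prints us-west-2c.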

Then we flush as much out to disk as we can with the sync commands, and freeze I/O on the root filesystem. Anything that tries to write to disk will block until the snapshot has completed and the filesystem has been unfrozen. Read operations will still work. Because writes are blocked for the whole of that time, the loop that checks whether the snapshot is ready polls every second rather than every five.
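The polling loops in the pre-exec script wait forever, which is awkward if a snapshot never completes while the root filesystem is frozen. One possible variation, not part of the original scripts, is to give up after a fixed number of attempts. The names wait_until and snapshot_completed below are invented for this sketch.

```shell
#!/bin/sh

# wait_until MAX_TRIES DELAY COMMAND [ARGS...]
# Runs COMMAND repeatedly until it succeeds, sleeping DELAY seconds
# between attempts. Returns 0 on success, or 1 once MAX_TRIES attempts
# have all failed.
wait_until() {
    tries=$1
    delay=$2
    shift 2
    while [ "$tries" -gt 0 ]
    do
        if "$@"
        then
            return 0
        fi
        tries=$((tries - 1))
        if [ "$tries" -gt 0 ]
        then
            sleep "$delay"
        fi
    done
    return 1
}

# Possible use in the pre-exec script (a pipeline has to be wrapped in
# a function so that it can be passed as a single command):
#   snapshot_completed() {
#       ec2-describe-snapshots "$1" | grep -q completed
#   }
#   if ! wait_until 600 1 snapshot_completed "$SNAPSHOT"
#   then
#       fsfreeze -u /
#       exit 1
#   fi
```

The limit of 600 attempts is an arbitrary figure; a sensible value depends on how long snapshots of your volume normally take.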

After that, we turn the snapshot into a volume, and attach that volume to a device which will probably be free. There will be an error in dmesg about the lack of a partition table on /dev/xvdf but it doesn't seem to matter.

Finally we mount /mnt (ready to be backed up) and remember what volumes we just created.
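The scripts remember the snapshot and volume IDs in files named relative to the current directory, so they depend on the pre-exec and post-exec being run from the same place. If that can't be relied on, one option is to keep the files under a fixed absolute path. This is a sketch only: the directory /var/tmp/snapshot-backup and the function names save_state and load_state are hypothetical.

```shell
#!/bin/sh

# Hypothetical location for the state files. Any directory that both
# scripts can reach by an absolute path would do.
STATE_DIR=${STATE_DIR:-/var/tmp/snapshot-backup}

# save_state NAME VALUE
# Stores VALUE in the file NAME under STATE_DIR.
save_state() {
    mkdir -p "$STATE_DIR" && printf '%s\n' "$2" > "$STATE_DIR/$1"
}

# load_state NAME
# Prints the stored value. Fails if nothing (or an empty value) was saved.
load_state() {
    [ -s "$STATE_DIR/$1" ] && cat "$STATE_DIR/$1"
}

# Possible use:
#   pre-exec:   save_state snapshot-to-remove "$SNAPSHOT"
#   post-exec:  SNAPSHOT=$(load_state snapshot-to-remove) || exit 1
```

Having load_state fail when nothing was saved lets the post-exec stop early instead of calling the delete commands with an empty ID.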


snapshot-postexec.sh

#!/bin/sh

SNAPSHOT=$(cat .snapshot-to-remove)
NEWVOL=$(cat .volume-to-remove)

umount /mnt

# Detach the volume and wait until it is gone
ec2-detach-volume  $NEWVOL
while ec2-describe-volumes $NEWVOL | grep -q ATTACHMENT
do
    sleep 5
done

ec2-delete-volume $NEWVOL
ec2-delete-snapshot  $SNAPSHOT
After the backup, the post-exec removes the mount, the volume and the snapshot.

I hope you find this helpful.


Greg Baker is one of the world's leading experts on HP Data Protector. His consulting services are at http://www.ifost.org.au/dataprotector . He has written numerous books (see http://www.ifost.org.au/press ) on it, and on other topics. His other interests are startup management, applications of automated image and text analysis and niche software development.