
Tuesday, 24 February 2015

Planning, Deploying and Installing Data Protector 9: For the datacentre, the cloud and remote offices

This is the fourth book I've published on HP Data Protector, and I think it will be a very valuable resource to any sysadmin or consultant working with HP products.

If you are involved in pre-sales or implementation consulting you will find lots of useful material in this book.

  • Why customers choose Data Protector, and what its strengths and weaknesses are.
  • Step-by-step, everything you need in order to back up remote branch offices.
  • Very detailed designs for backing up Amazon-hosted cloud servers.
  • How to back up data centres filled with nearly identical virtual machines.
  • Guidance on handling media pools, tape drives and other physical devices.
  • Doing disk-to-disk-to-tape backups.
  • Very detailed information on the internal database that simply isn't available elsewhere -- the results of my reverse engineering what HP has done. If you have complicated reporting needs, this is the book to have.
If you are a system administrator / backup administrator, you will find this book a welcome alternative to the HP documentation. I talk honestly about bugs that I've found, what makes sense to do, and what doesn't, and more importantly why certain designs make more sense than others.

This book can also be used as a course book if you want to run a one-day training session. All the labs can be run up in the cloud.

To keep the price down on the paperback version, it's a B&W print for all 374 pages. The Kindle version is full-colour if you have a full-colour device.

The paperback version has a comprehensive index, making it an ideal reference to have at your desk. It is enrolled in Kindle MatchBook, so you can have both the paper version and the Kindle version for only slightly more than the paper version alone. Buy both!

Topics covered:

  • Reasons customers choose Data Protector 
  • Areas of comparative weakness 
  • Installing Windows and Linux cell managers 
  • Agent deployment methods 
  • StoreOnce de-duplication 
  • Backing up filesystems 
  • Backing up VMware 
  • Backing up cloud-hosted servers 
  • Configuring tape drives 
  • Managing media 
  • Backups spooled via disk 
  • Remote office backups replicated to a central office 
  • Reporting 

There is also a companion book on Data Protector for operators, with topics covering restoring, supporting and maintaining.

Buy it on Amazon, or through one of the alternate stores.

Greg Baker is one of the world's leading experts on HP Data Protector, and offers consulting services on it. He has written numerous books on it, and on other topics. His other interests are startup management, applications of automated image and text analysis, and niche software development.

Thursday, 19 February 2015

Operating, running and supporting HP Data Protector 9

A new (Chinese) year, and I've written two new books on Data Protector.

The first is a book for operators.

  • If you are a sysadmin with a new staff member who has just joined your organisation and needs to keep an eye on the backups, buy this book for them: it has everything they need to get started.
  • If you are a consultant who has just finished an implementation, you will need to provide some hand-over documentation. Buy this book for your customer, and you might only need to write a page or two of site-specific documentation; everything else they will need is in this operator's book.
There's an ebook version if you need something very cheap in a hurry; and there is also a beautifully presented full-colour on-paper version, which is more expensive, but makes a big impact. 

Topics include:
  • What a cell manager is; what an installation server is
  • How to install a Data Protector console
  • Using Manager.exe and what different connection error messages mean
  • Monitoring backup sessions
  • Reviewing backup history
  • Responding to day-to-day issues (mount requests, errors, tape busy)
  • Formatting tapes
  • Logging a support call with HP and collecting debug logs
  • Restoring filesystems
  • Restoring sessions
  • Restoring individual files
  • Restoring virtual machines
  • Restoring individual files from virtual machines using the VMware granular recovery extension

The paperback version also includes:
  • A page where you can record the name of the cell manager, where the cell console is, and any installation servers.
  • An extensive index
The e-book is available for pre-order today (with delivery next week); the dead-tree version should be available by Monday. As with everything else I have written, you can find it on Amazon.


Wednesday, 4 February 2015

Moment-in-time (snapshot) backup of Amazon AWS / EC2 instances

This post is the second in my series on backing up servers in the cloud. Obviously, there are some servers in the cloud that you simply won't need to back up, but the remainder, which you do need to back up, are much harder.

The big problem (which I will address in the next post in this series) is the cost of long-term storage. If you want to maintain backups for seven years, even Amazon Glacier will be absurdly expensive.
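To put a number on that (my own back-of-envelope figures, not from the original post): assume one 500 GB monthly full, every full retained for the whole seven years, and Glacier at roughly one cent per GB-month (its approximate 2015 price).

```shell
#!/bin/sh
# Back-of-envelope: 84 monthly fulls of 500 GB each, all retained,
# priced at an assumed 1 cent per GB-month.
MONTHS=84
FULL_GB=500
PRICE_CENTS_PER_GB=1
FINAL_GB=$((MONTHS * FULL_GB))                          # 42000 GB held in month 84
FINAL_MONTHLY_CENTS=$((FINAL_GB * PRICE_CENTS_PER_GB))
echo "Storage bill in the final month: \$$((FINAL_MONTHLY_CENTS / 100))"
```

And that is just the storage line item; it ignores Glacier's retrieval charges entirely.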

The other problem is the challenge of getting a consistent moment-in-time backup. Only very minor cloud infrastructure providers offer VMware or Hyper-V as their main platform, so virtual-disk snapshot backups aren't an option.

On Windows systems there is VSS, so a file system backup taken with a VSS snapshot option will be a reasonably consistent moment in time.

However, most servers run Linux. Many of them are running a database of some sort (PostgreSQL or MySQL). It is possible to arrange a pre-exec to dump the database to disk, but this gets increasingly impractical as the database gets larger.
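For a small database, such a pre-exec can be as simple as this sketch (my own illustration; the dump path and the choice of MySQL are assumptions, not from the original post):

```shell
#!/bin/sh
# Hypothetical Data Protector pre-exec: dump a MySQL database to disk so the
# subsequent filesystem backup picks up a consistent copy.
DUMPFILE=/var/tmp/mysql-pre-exec-dump.sql
if command -v mysqldump >/dev/null 2>&1; then
    # --single-transaction gives a consistent InnoDB view without locking tables
    mysqldump --all-databases --single-transaction > "$DUMPFILE" \
        || echo "mysqldump failed" >&2
else
    echo "mysqldump not installed; nothing dumped" >&2
fi
```

The problem is exactly as described above: once the dump takes longer than your backup window, or the dump no longer fits on local disk, this approach falls apart.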

I wrote a blog article last year about one method (LVM snapshots) of getting consistent Linux backups. But with Amazon it is possible to use the snapshot capability they provide.
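For reference, the LVM approach looks roughly like this (a sketch under assumed names -- volume group vg0, logical volume root -- not a drop-in script):

```shell
#!/bin/sh
# Sketch of the LVM method: snapshot the logical volume, back up the frozen
# copy, then throw the snapshot away. VG/LV names are assumptions.
VG=vg0
LV=root
SNAP=${LV}-backup-snap
if lvcreate --size 1G --snapshot --name "$SNAP" "/dev/$VG/$LV" 2>/dev/null; then
    mount -o ro "/dev/$VG/$SNAP" /mnt    # mount the frozen copy read-only
    # ... run the filesystem backup of /mnt here ...
    umount /mnt
    lvremove -f "/dev/$VG/$SNAP"
else
    echo "no LVM volume to snapshot; illustrative only" >&2
fi
```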

The scripts on this page assume that your instance has one EBS volume that it boots from, and no other attached storage.

Create a backup job that will back up /mnt (even though the server doesn't have a filesystem mounted there). If necessary, just edit the datalist:

FILESYSTEM "/mnt""/"
Then put the following two scripts into /opt/omni/lbin on the cloud-hosted server, and modify the backup job to use them as the pre- and post-exec for the backup.

If the Amazon EC2 tools are not already installed and configured in the environment, you will need to install them. Most of the Amazon-supplied AMIs already have them in place, but the Red Hat-supplied ones don't.
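If you are unsure whether the tools are present, a quick check along these lines (my own addition, not part of the original scripts) will tell you before the first backup fails:

```shell
#!/bin/sh
# Confirm the legacy EC2 API tools used by the pre- and post-exec scripts
# are on the PATH; report anything missing.
TOOLS="ec2-describe-instances ec2-create-snapshot ec2-create-volume \
ec2-attach-volume ec2-detach-volume ec2-delete-volume ec2-delete-snapshot"
MISSING=0
for tool in $TOOLS; do
    if ! command -v "$tool" >/dev/null 2>&1; then
        echo "missing: $tool" >&2
        MISSING=$((MISSING + 1))
    fi
done
echo "$MISSING tool(s) missing"
```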


#!/bin/sh
# Pre-exec: snapshot this instance's root EBS volume, turn the snapshot
# into a volume, and mount it on /mnt ready to be backed up.

# First, get our instance ID from Amazon's instance metadata service.
# (The metadata URLs were lost from the original post; the standard
# instance-identity document contains both fields we need, and the
# 169.254.169.254 address is only reachable from inside EC2.)
INSTANCE=$(wget -q -O - \
    http://169.254.169.254/latest/dynamic/instance-identity/document \
    | grep instanceId | cut -d'"' -f4)

# Next, find out which zone we are in, for the volume creation later
ZONE=$(wget -q -O - \
    http://169.254.169.254/latest/dynamic/instance-identity/document \
    | grep availabilityZone | cut -d'"' -f4)

# This script only works for single volumes at the moment
VOLUME=$(ec2-describe-instances $INSTANCE \
    | grep BLOCKDEVICE | awk '{print $3}' | head -1)

# Flush everything we can out to disk before we take the snapshot
sync
fsfreeze -f /

# Create a snapshot of our root volume and wait for it to complete
SNAPSHOT=$(ec2-create-snapshot $VOLUME | awk '{print $2}')
until ec2-describe-snapshots $SNAPSHOT | grep -q completed; do
    sleep 1
done

fsfreeze -u /

# Turn that snapshot into a volume
NEWVOL=$(ec2-create-volume --snapshot $SNAPSHOT -z $ZONE | awk '{print $2}')
until ec2-describe-volumes $NEWVOL | grep -q available; do
    sleep 5
done

# Connect that volume
ec2-attach-volume $NEWVOL -i $INSTANCE -d sdf
until ec2-describe-volumes $NEWVOL | grep -q attached; do
    sleep 5
done

# Mount it
mount /dev/xvdf /mnt

# Now we can back up. Remember what we created, though
echo $SNAPSHOT > .snapshot-to-remove
echo $NEWVOL > .volume-to-remove

The first couple of lines will only work on the Amazon cloud. The EC2 instance queries a special Amazon address to find out its own details -- its instance id (e.g. i-121255) and its zone (e.g. us-west-2c).

Then we flush as much as we can out to disk (with a sync, and then by freezing I/O on the root filesystem). Anything that tries to write to disk will block until the snapshot has completed; read operations will still work. We run a busy-wait loop checking to see if the snapshot is ready.

After that, we turn the snapshot into a volume, and attach that volume to a device which will probably be free. There will be an error in dmesg about the lack of a partition table on /dev/xvdf but it doesn't seem to matter.

Finally we mount /mnt (ready to be backed up) and remember what volumes we just created.


#!/bin/sh
# Post-exec: undo everything the pre-exec set up.
SNAPSHOT=$(cat .snapshot-to-remove)
NEWVOL=$(cat .volume-to-remove)

umount /mnt

# Detach the volume and wait until it is gone
ec2-detach-volume $NEWVOL
while ec2-describe-volumes $NEWVOL | grep -q ATTACHMENT; do
    sleep 5
done

ec2-delete-volume $NEWVOL
ec2-delete-snapshot $SNAPSHOT
After the backup, the post-exec removes the mount, the volume and the snapshot.

I hope you find this helpful.
