This is the 100th blog post, and the counter of page views is about to tick over 50,000. Thank you for your readership!
According to Google's infallible stats counters, here's what most people have been reading on this blog:
POEMS - At the end of a day of being a computer nerd, you need something that will make you laugh (or at least smile) and also make you look cultured among your friends. Nobody else writes poetry about nuclear physics or time travel, so if you want to get that "I'm so hip" feel, you really should buy a copy of When Medusa went on Chatroulette for $3 (more or less, depending on your country of origin).
NAVIGATOR - If you run Data Protector then you will definitely get some value out of the cloud-hosted Navigator trial. You can get reports like "what virtual machines are not getting backed up?" and "how big will my backups be next year?" -- stuff that makes you look like the storage genius guru (which you probably are anyway, but this just makes it easier to prove it).
HOW LONG WILL IT TAKE - If you are tracking your work in Atlassian's fabulous JIRA task tracking system, then try out my free plug-in (x.ifost.org.au/aeed) which can predict how long tasks will take to complete. And if you are not using JIRA, then convince everyone to throw out whatever you are using and switch to JIRA because it's an order of magnitude cheaper, and also easier to support.
TRAINING COURSES - You can now buy training online from store.data-protector.net -- and it appears that it's 10-20% cheaper than buying from HPE directly in most countries. There are options for instructor-led, self-paced, over-the-internet and e-learning modules.
SUPPORT CONTRACTS - Just email your support contract before its renewal to gregb@ifost.org.au and I'll look at it and figure out a way to make it cheaper for you.
BOOKS - If you are just learning Data Protector, then buy one of my books on Data Protector (available in Kindle, PDF and hardback). They are all under $10; you can hide them in an expense report and no-one will ever know.
A blog about technology, running tech companies, data science, religion, translation technology, natural language processing, backups, p-adic linguistics, academic lecturing and many other topics.
Showing posts with label backup. Show all posts
Thursday, 10 December 2015
Wednesday, 9 December 2015
Stumped by a customer question today: when to replace a cleaning tape
It was an innocent enough question: "when will I need to replace this cleaning tape?"
And I realised that not only did I not know the answer, but that I'd never replaced a cleaning tape in 20+ years of backup work. Sure, I've put one in when I've been deploying a system, but I tend to forget about it after that.
Estimates from drive manufacturers suggest that a tape drive should be cleaned every month. Backup logs, on the other hand, show tape drives requesting cleaning about every 6 months.
But that data is mostly from tape drives that are inside a tape library, so the amount of dust getting in and out will be less than for a standalone tape drive.
The spec sheet on HPE's universal LTO ultrium cleaning kit suggests that it should be good for between 15 and 50 cleans.
Put together, that means that a cleaning tape should be replaced somewhere between once every year or so and every quarter century, which is not very helpful!
I believe the data from the tape drives themselves reporting "I'm dirty" rather than the vendor suggestions, so even taking the low end of the HPE spec sheet, a cleaning tape in a tape library should be good for 7 years. Since that's enough for at least two generations of tape technology to come and go, it's probably safe to assume that you will have bought a new tape library in that time.
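The arithmetic behind those ranges is simple enough to sketch. The function name and the scenario labels below are mine, purely for illustration; the inputs are the figures quoted above (15 to 50 cleans per tape, a clean every month or every 6 months).

```python
def cleaning_tape_lifetime_years(cleans_per_tape, months_between_cleans):
    """Years one cleaning tape lasts for a single drive at the given cleaning interval."""
    return cleans_per_tape * months_between_cleans / 12


# Figures taken from the post: spec sheet says 15-50 cleans per tape;
# vendors suggest monthly cleaning, backup logs suggest every 6 months.
scenarios = {
    "worst case (15 cleans, monthly)": (15, 1),
    "best case (50 cleans, every 6 months)": (50, 6),
    "low end of spec, observed interval": (15, 6),
}

for label, (cleans, interval) in scenarios.items():
    years = cleaning_tape_lifetime_years(cleans, interval)
    print(f"{label}: {years:.2f} years")
```

This is per drive: if several drives share one cleaning tape, divide the result by the number of drives, which is why a multi-drive library is the case where replacement is worth thinking about.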
But if you have multiple tape drives, and it has been a couple of years since you last replaced the cleaning tape, errm, maybe it's worth buying one. I'm not selling tapes at store.data-protector.net yet, so my best suggestion is this vendor on Amazon: LTO ultrium cleaning kit.
Incidentally, if you do have a tape library and you are running HPE Data Protector, then you will almost definitely want to sign up for the free cloud-hosted Backup Navigator trial here: Free Backup Navigator Trial at HPE so that you can see which are your most unreliable tape drives -- perhaps they need cleaning!
Greg Baker is an independent consultant who happens to do a lot of work on HP DataProtector. He is the author of the only published books on HP Data Protector (http://www.ifost.org.au/books/#dp). He works with HP and HP partner companies to solve the hardest big-data problems (especially around backup). See more at IFOST's DataProtector pages at http://www.ifost.org.au/dataprotector, or visit the online store for Data Protector products, licenses and renewals at http://store.data-protector.net/
Wednesday, 25 November 2015
VMware ESX 6.0 bug with CBT
VMware has announced another CBT problem. Just a reminder, this is not a problem that HPE can do anything about in Data Protector -- it's a problem with the APIs that VMware have supplied for HPE to use.
If you are doing VEAgent backups of your VMware environment (which is quite common) and you have any incrementals scheduled (also quite common), and you are running ESX 6 (which is lots of people) and you are using CBT (which you really, really would want to do normally).... then you might want to be aware that (yet again) VMware have announced that your backups could well be painfully broken.
Here's VMware's KB article:
http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=2136854
There are several solutions:
- Only do full backups. Hmm, that's a lot of data. Probably OK if you are going to a StoreOnce dedupe, but that's going to turn into a lot more tape.
- Turn off CBT. Ouch, that's going to hurt performance.
- Downgrade to ESX 5.5. I don't see anyone doing that.
- Use the DataProtector disk agent and automated disaster recovery module. This is actually cheaper (no extension licenses required!) and gets you both a file-level backup and an ability to restore a virtual machine from nothing. I recommend this as a better approach generally, but particularly now when we can't trust our VM-level backups.
- Apply the patch that VMware has now released.
Less easy solutions, but things to think about:
- Migrate all your virtual machines to Amazon machine images. (Or Google, or Azure. Pity it can't be HP any more). It's inevitable -- eventually -- that the economies of scale of the large cloud providers will overtake your ability to run things in your own data centre. So why not start planning for it now?
- Use a different virtualisation solution. This is not the first time that VMware have announced "by the way, all backups are broken". I suspect it won't be the last time either. KVM is very mature now and it's also free. Xen is in good shape too. Virtualisation technology is no longer cutting edge -- it's commoditised now. So why not pay commodity prices?
Thursday, 5 November 2015
Pre-mortem for almost every cloud-hosted backup provider
I was talking to a vendor who wanted me to partner with them on their cloud-hosted backup solution. I looked at their pricing and their offerings (managed storage in the cloud, DR servers in the cloud) and then compared them with what Amazon offers. Since only Google and Microsoft can compete with Amazon's scale (and then, only just), the vendor's prices were way out of line with current market rates.
I suggested that they had three options:
- They could make their product work nicely with Amazon cloud (i.e. backup to S3, manage the migration to and from Glacier). A variation would be to do this with Google Nearline Storage, which is probably a better solution, even if it doesn't have the same name recognition. They will lose a lot of revenue because there used to be margin in online storage -- but there isn't any more.
- They could migrate their entire customer base to an open source option (Bacula or BareOS). Since their customer base is going to be cannibalised anyway, they might as well make some money from the consulting effort migrating the customer somewhere else. Open source backup can still compete against cloud offerings in a couple of different ways.
- They could become roadkill.
Fortunately, they do have other sources of revenue, so hopefully they will be able to carry on. But for other specialist cloud-backup companies? I'm not sure that many of them have a viable future.
Thursday, 29 October 2015
MS-SQL server not backing up
When Data Protector tries to back up a SQL server, the SQL agent contacts the cell manager to get the details of the integration (e.g. what username to use, whether to use SQL authentication or Windows authentication).
Today I was diagnosing an error message that I never would have expected to see on a MS-SQL server:
Cannot obtain Cell Manager host. Check the /etc/opt/omni/client/cell_server file and permissions of /etc/resolv.conf file.
I'm not exactly sure how I'm supposed to check /etc/resolv.conf on a Windows system. Maybe C:\Windows\system32\drivers\etc\resolv.conf?
Needless to say, in the agent's extreme confusion, it then followed this with:
Cannot initialize OB2BAR Services ([12:1602] Cannot access the Cell Manager system. (inet is not responding)
The Cell Manager host is not reachable or is not up and running
It wasn't really that inet was not responding -- the agent didn't even know what cell manager to connect to.
We probably would have searched for hours trying to find the cause of it, but I proposed upgrading the client to match the cell server version (good practice anyway). The upgrade wouldn't proceed after we hit the following error message:
"Already part of another cell: cellmgr.ifost.org.au ." Note the extra space!
Staring very carefully at the following registry key, I confirmed that indeed, there was a space at the end of the name where it had been manually edited. I removed the space.
HKEY_LOCAL_MACHINE\SOFTWARE\Hewlett-Packard\OpenView\OmniBackII\Site\CellServer
Normally the "already part of another cell" error message means exactly what it says (because it won't match up with your cell manager's name); that there's a short name instead of a FQDN; or some sort of problem like that.
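A stray space like this is nearly impossible to spot by eye, so a small check is worth having. The following is a minimal sketch, not anything that ships with Data Protector: the function names are hypothetical, and I am assuming that CellServer is a string value under the Site key shown above (the registry reader only works on Windows, so it is defined here but not called).

```python
def cell_server_name_problems(name):
    """Return a list of human-readable problems with a configured cell manager name."""
    problems = []
    if name != name.strip():
        problems.append("leading or trailing whitespace")
    if "." not in name.strip():
        problems.append("short name instead of a FQDN")
    return problems


def read_cell_server_from_registry():
    """Windows only. Assumes CellServer is a string value under the Site key."""
    import winreg  # imported here so the rest of the sketch runs on any platform

    site_key = r"SOFTWARE\Hewlett-Packard\OpenView\OmniBackII\Site"
    with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, site_key) as key:
        value, _value_type = winreg.QueryValueEx(key, "CellServer")
    return value


# The name from the error message above, including the extra space.
print(cell_server_name_problems("cellmgr.ifost.org.au "))
```

On the affected client you would pass the result of read_cell_server_from_registry() to cell_server_name_problems() instead of a literal string.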
Thursday, 22 October 2015
HP gives up against Amazon
So the HP public cloud is no more. I suspect I might have been one of the larger users of it (for a few weeks back in 2012) so let me try to give a serious analysis of what this means. (HP announcement link)
Amazon AWS is currently supply-constrained. They could lower prices to gain more customers, but then they wouldn't be able to service those customers. This is an unusual position to be in, as almost all of us are in industries where the bottleneck to growth is in acquisition, not delivery. So they ease their prices down little-bit-by-little-bit as they resolve their supply constraints.
Eventually, AWS will start to be demand-constrained, and that's when all hell breaks loose, because then AWS can start doing some serious price cutting. I'd peg it for early 2017 at a guess, when suddenly the price cuts start accelerating until the economics of renting from AWS start to look competitive with buying a server and putting it on a desk unsupported, un-networked and unpowered.
Google and Azure can survive Amazongeddon -- they have the money and it's a market that they definitely want to be in. Google App Engine is still a very cost-effective offering -- my total compute and storage budget leading up to launch day (and including it) for the Automated Estimator of Effort and Duration for Jira was $0.22 -- so much for big data analysis being expensive! At that level, price comparisons are utterly meaningless, so if that's profitable now (which it probably is), they can keep doing it.
HP have presumably decided that they don't have enough time to build out a solid customer base on the HP public cloud before Amazongeddon. The HP cloud team is betting that customers will want HP software to manage their clouds, and that an HP-backed public cloud is not worth doing. Operations Orchestration makes sense in a cloudy world, for example.
But there is a problem, because for all the talk of "hybrid public-private clouds", either private is cheaper/better/more secure or public is cheaper/better/more secure.
- If the answer is "private", then we will continue to have internal customer-owned datacentres, and HPE will continue to sell 3PARs, SureStores, Proliants and so on.
- If the answer is "public", then after Amazongeddon, HP won't have a hardware business that anyone cares about.
Unfortunately, I believe the answer is "public", as do many, many other people. To say that "private" clouds are cheaper, better and more secure the majority of the time means not only that there are no economies of scale in a big data centre, but that there are diseconomies of scale that are going to appear any moment now from out of nowhere.
This puts HP in the same position as Unisys was in the 1980s-1990s. Customers stopped buying Unisys mainframes, so Unisys had to turn into a services, software and support business. They had a bit of an edge in government and defence at the time, and they worked hard to keep it. I know plenty of people who have had good careers at Unisys, and presumably it's a nice place to work where there is innovation happening. But Unisys in 2015 is not the hallowed place that it was after the Burroughs / Sperry merger.
Without that core of hardware sales on which to stack software sales, Unisys struggled. So too will HP. (And so will Dell, unless Dell decides to take on Amazon... which they could and should.)
I feel sorry for Bill Hilf though, as he has had to lead teams through the collapse of high-end Itanium hardware and now through the failure of the only viable hardware future that HP had.
That said, I'm optimistic about HP Data Protector in particular. There will still be important data to back up and archive. Storing it efficiently for fast recovery will always matter. You can't discard a backup solution until the last of your 7-year-old backups has expired.
I'm hoping that HP will now do three things:
- Convert the HP cloud object storage device to something that works with S3. Since this feature will be irrelevant in January 2016 if they don't do this, it seems like a no-brainer in order to preserve the R&D investment done so far.
- Interface into lifecycle management of S3 -- if the "location" of a piece of media is "Glacier", then Data Protector should be able to initiate its re-activation as step 1 of a restore job. Again, this seems a no-brainer if you already are dealing with S3.
- I'd like to see the Virtual Storage Appliance delivered as an AMI (Amazon machine image). This isn't very difficult. Maybe there could be some fiddling around with licensing where the VSA reported its usage and customers paid by capacity per month, but even that's not really necessary.
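To make the second wish concrete, here is roughly what "re-activate the media as step 1 of a restore" looks like against S3. This is a sketch of the idea only, not Data Protector functionality: the function name, bucket and key are hypothetical, and it assumes a boto3 S3 client is passed in (head_object and restore_object are the boto3 calls used).

```python
def ensure_restorable(s3_client, bucket, key, days=7, tier="Standard"):
    """Step 1 of a restore: make sure an archived object is readable.

    Returns "ready", "restoring" or "requested".
    """
    head = s3_client.head_object(Bucket=bucket, Key=key)

    # S3 omits StorageClass for objects in the standard class.
    if head.get("StorageClass", "STANDARD") != "GLACIER":
        return "ready"

    # For archived objects, the Restore field reports any restore in progress.
    restore_status = head.get("Restore", "")
    if 'ongoing-request="true"' in restore_status:
        return "restoring"
    if 'ongoing-request="false"' in restore_status:
        return "ready"  # a temporary copy is already available

    # Nothing requested yet: ask for a temporary copy to be made available.
    s3_client.restore_object(
        Bucket=bucket,
        Key=key,
        RestoreRequest={"Days": days, "GlacierJobParameters": {"Tier": tier}},
    )
    return "requested"


# Usage (requires boto3 and AWS credentials; names are made up):
#   import boto3
#   state = ensure_restorable(boto3.client("s3"), "my-backup-bucket", "media/0001")
```

A restore job would call this for every piece of media it needs, then wait until each one reports "ready" before reading any data.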
If all this happens, then I suspect we'll continue to see HP selling Data Protector for another 30 years. If Data Protector is as useful for customers post-Amazongeddon as it is pre-Amazongeddon, then there is no particular reason why Data Protector couldn't pass through this critical tipping point. In fact, since I doubt that BackupExec will handle the transition, Data Protector will probably pick up some market share.
Anyway, what are some immediate scenarios that this would support?
- Customer A has a small Amazon presence and a large data centre with a StoreOnce system and some tape drives. They would like to deploy a VSA in the same region as their Amazon servers and replicate their data through low-bandwidth links back to their data centre.
- Customer B has a somewhat larger Amazon presence. They have Data Protector in their office, and they want to back up their Amazon content to Glacier.
- Customer C is closing down their data centre in house and moving their servers into the cloud. They want to take backups of their servers in their data centre and use StoreOnce replication to get them into their cloud where the data is rehydrated.
So if you are a customer like A, B or C, feel free to contact your account manager, suggest that you'd really like Data Protector to support you and see how you go. (Or get in touch with me and I'll collate some answers back to the product team.)
Monday, 19 October 2015
Data Protector reporting with Navigator through a firewall
Data Protector has a number of built-in reports, which you can email, put on an intranet, pipe through some other command and various things like that. I wrote up the complete list of built-in reports here:
http://blog.ifost.org.au/2015/04/data-protector-built-in-reports.html
HP's strategic direction for reporting appears to be Backup Navigator. It is licensed purely on the capacity of the cells it is reporting on. (In other words: how big is a full backup of all the data being backed up: that's the capacity you license on.)
It produces some nice reports:
I was working with a customer who had had some problems with connectivity on Navigator 9.1.
Their cell managers had an omnirc file which limited the number of ports open for connections.
OB2PORTRANGESPEC=CRS:20495-20499
Having only five ports open was enough for them. We organised to have ports 20495 - 20499 opened from their Navigator server to their cell manager.
As it turns out, this is not enough. You need to have connectivity open from the cell manager back to the Navigator server as well. This isn't documented anywhere, and there's no error report from Navigator about this.
This problem goes away somewhat in Navigator 9.21 and 9.3 because you can do agent-based push. This is where you run a program on your cell manager which connects to Navigator on port 443 (HTTPS) and uploads the information that Navigator needs.
This solves the problem in two ways:
- With the new agent model, there's no need to open anything from the Navigator server to the cell manager, so you can put the Navigator server in a quite isolated network.
- It's entirely possible to run the Navigator server in the cloud (HPE offer a three month trial) and have your Data Protector reporting handled by a third party.
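Since Navigator gives no error when a port is blocked, it is worth testing connectivity yourself, in both directions. Here is a minimal sketch using plain TCP connects. The function names are mine, and the parser only handles the single-entry omnirc form shown above; real omnirc files may be more elaborate.

```python
import socket


def parse_port_range(omnirc_line):
    """Parse a line like 'OB2PORTRANGESPEC=CRS:20495-20499' into (service, ports)."""
    _name, _, spec = omnirc_line.strip().partition("=")
    service, _, ports = spec.partition(":")
    low, _, high = ports.partition("-")
    return service, range(int(low), int(high) + 1)


def port_is_open(host, port, timeout=3.0):
    """True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def check_ports(host, ports, timeout=3.0):
    """Map each port to whether a TCP connection to it succeeded."""
    return {port: port_is_open(host, port, timeout) for port in ports}


service, ports = parse_port_range("OB2PORTRANGESPEC=CRS:20495-20499")
print(service, list(ports))

# Usage (hostnames are placeholders):
#   on the Navigator server:  check_ports("cellmgr.example.com", ports)
#   on the cell manager:      port_is_open("navigator.example.com", 443)
```

Bear in mind that a port with nothing listening on it reports as closed even when the firewall allows it, so a failed check for a port in the CRS range is a prompt to investigate rather than proof of a firewall block.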
Sunday, 5 April 2015
Who changed that backup specification?
Backups can fail spontaneously, but generally I've found that the most common reason for a Data Protector backup to fail is that somebody has changed something.
There are two solutions to this, depending on the level of tracking you need.
For the light touch, there is a global parameter "EventLogAudit" which you can set to 1.
It seems like you just need to restart the GUI for this to take effect, even though you will get a message saying that you need to restart all Data Protector services.
After you have made this change, you can then see when a change has been made. So, to demonstrate, I removed a file server from the backup job called "LegacyServers". This is what appeared in the event log:
It doesn't show any more detail than this though. You can't tell what was changed.
If you need tracking of precisely what was changed and when, then you need to implement a version control system. I'll put that in another blog post, but it's not difficult.
The screenshots below are using TortoiseGit (which is a Windows front-end to git). Because Data Protector's job specification format is a collection of plain-text files, this kind of version control works particularly well. The benefits are:
- Each commit shows exactly the change that was performed and why.
- It's very easy to revert changes, even parts of changes.
Let's say that someone has made a change. Apart from the Event Log message, the relevant folder will look like this (note the red exclamation mark, signalling a file that has not been committed):
The administrator fixes this by running a commit:
To commit, the administrator needs to provide a log message, to explain why this was done. You can put in the change request number if this makes sense, or any other comment at all.
Once that has completed, we can now see that all changes that have been made to the Data Protector environment have been documented in the version management system. Green tick marks everywhere!
So if tonight's backup fails, or we discover later that "fileserver" is not being backed up and we don't know why, we can look back through the history of the backup job:
We can even run a "diff" command to show precisely what changed between two versions.
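To illustrate what such a diff reveals, here is a minimal Python sketch using the standard difflib module. The two job-specification snippets are invented, simplified stand-ins (a real Data Protector job specification contains far more detail), and the function name spec_diff is hypothetical. In day-to-day use, git or TortoiseGIT produces this output for you.

```python
import difflib


def spec_diff(before: str, after: str, name: str = "LegacyServers") -> list[str]:
    """Return unified-diff lines between two versions of a job specification."""
    return list(
        difflib.unified_diff(
            before.splitlines(),
            after.splitlines(),
            fromfile=f"{name} (previous commit)",
            tofile=f"{name} (working copy)",
            lineterm="",
        )
    )


# Made-up, simplified contents: one line per client in the backup job.
before = "HOST fileserver\nHOST legacy1\nHOST legacy2\n"
after = "HOST legacy1\nHOST legacy2\n"

if __name__ == "__main__":
    for line in spec_diff(before, after):
        print(line)
```

A removed client shows up as a line starting with "-", which is exactly the detail the event log message does not give you.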
Never get caught out by a changed backup again!
If you found this blog post helpful, you will probably really enjoy my books on Data Protector: http://www.ifost.org.au/books.
Thursday, 29 January 2015
Using Data Protector to back up your AWS (Amazon EC2) instances
Your cloud-hosted infrastructure (whether it is on Azure, AWS, Google App Engine, IBM, Rackspace or HP Cloud) is going to consist of:
- Cattle, which are machines that you have automatically created and which contain no state that can be lost. They might have a replica of some data, but there will be other copies. If these machines fail, you just restart them or create a new instance. Hopefully you have this process automated.
- Pets, which are machines that you administer and are installed manually. When these fail, you want to restore from a backup.
If you are a completely green-field site, then you won't have any backup infrastructure. But if you already have some in-house servers, then you will probably have existing backup infrastructure that you would want to make use of.
For example, the cheapest storage in the cloud at the moment appears to be Amazon Glacier, which costs USD10 per terabyte per month. But if you already have a tape library (or even a single standalone modern tape drive), you can easily have long-term cold storage at $0.50 per TB or less, and you probably already have some tapes.
Likewise, if you already have a Data Protector cell manager license, you might as well keep using it because it will work out cheaper than any dedicated cloud-hosting provider.
Virtual tape library
This option is appropriate if you are very, very constrained by your budget and need to be very conservative in how you do backup changes. If you are currently backing up to a tape library, then this lets you keep the illusion of the same thing but put it into the cloud.
- Create a Linux instance in an availability zone that you are not otherwise using.
- Install mhvtl on to it, and configure a virtual tape library with it.
- Mount a very large block image (persistent storage device) on /opt/mhvtl.
- You can now use this tape library just as if it were a real tape library.
AWS (Elastic Block Store) volumes
The problem with the virtual tape library solution is that you are somewhat constrained by the size of the block storage that you are using. But with an external control device, you can attach and detach Elastic Block Store (EBS) volumes on demand as required. You can add slots to the external device by adding additional block stores.
- Create a Linux instance in an availability zone that you are not otherwise using.
- Write an external control script which takes the DP command arguments and attaches and detaches EBS volumes to the Linux box.
- Create an External device, using that script.
StoreOnce low-bandwidth replication
The previous two options don't offer a way of using in-house tape drives.
If you have a way of breaking up your backups into chunks of less than 20TB, then you can use the software StoreOnce component on an EC2 instance. It works on Windows and Linux; just make sure that you have installed a 64-bit image. The only licensing you will need is some extra Advanced Backup to Disk capacity.
An alternative is to buy a virtual storage appliance (VSA) from HP, and then create an Amazon Machine Image (AMI) out of it. This has the advantage that it can cope with larger volumes, and it also has better bandwidth management (e.g. shaping during the day, and full speed at night).
The steps here are:
- Run up a machine in an availability zone which is different from the one holding whatever you want to back up. Use a Windows, Linux or VSA image as appropriate. Call it InCloudStorage-server.
- Create a StoreOnce device (e.g. "InCloudStorage").
- Create backups writing to "InCloudStorage".
- Create a StoreOnce device in-house. Call it "CloudStorageCopy".
- If you are not using a VSA you will need to create a gateway for CloudStorageCopy on InCloudStorage-server. Remember to check the "server-side deduplication" button.
- Create a post-backup copy job which replicates the backups (which went to InCloudStorage) to CloudStorageCopy using that server-side deduplicated gateway.
- Create a post-copy copy job to copy these out to tape.
The beauty of this scheme is that you can seed the CloudStorageCopy with any relevant data. As most of the virtual machines you are backing up will be very similar, you will achieve very good deduplication ratios. 20:1 is probably reasonable to expect, or possibly higher. So instead of having to transfer 100GB of backup images from the cloud to your office each day, you might only be transferring 5GB, which is quite practical.
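The arithmetic behind that estimate can be sketched in a few lines of Python. The function names are hypothetical, and the 10 Mbit/s link speed in the usage example is purely an assumption for illustration; substitute your own figures.

```python
def daily_transfer_gb(backup_gb: float, dedup_ratio: float) -> float:
    """Data that still has to cross the WAN at a deduplication ratio of dedup_ratio:1."""
    if dedup_ratio <= 0:
        raise ValueError("dedup_ratio must be positive")
    return backup_gb / dedup_ratio


def transfer_hours(gb: float, link_mbit_per_s: float) -> float:
    """Hours needed to move gb (decimal gigabytes) over a link of the given speed."""
    megabits = gb * 8 * 1000  # 1 GB = 8000 megabits
    return megabits / link_mbit_per_s / 3600


if __name__ == "__main__":
    to_send = daily_transfer_gb(100, 20)  # 100GB of backups at 20:1
    print(f"{to_send:.1f} GB to replicate")
    # Assumed link speed of 10 Mbit/s, only as an example.
    print(f"{transfer_hours(to_send, 10):.2f} hours at 10 Mbit/s")
```

At 20:1 the 100GB of nightly backups shrinks to 5GB on the wire, which is what makes replicating back to an in-house store practical.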
HP cloud device
I discussed this in http://blog.ifost.org.au/2015/01/data-protector-cloud-backups-and-end-of.html . If you are using the HP cloud, then this is almost a no-brainer -- you don't even need to provision a server. For the other cloud providers, it depends on the bandwidth you get (and the cost of the bandwidth!) to the HP cloud whether this makes sense or not.
Monday, 28 July 2014
When Exchange 2010 won't back up after a Data Protector upgrade
I have had two customers upgrade Data Protector, and suddenly have their Exchange backups fail.
Here's the characteristic error message:
[Major] From: BSM@exchange.ifost.org.au "Exchange 2010 Databases Backup" Time: 26/07/2014 9:17:02 PM
[61:8000] Client named "MS Exchange 2010 Server" not configured in the backup specification.
[Major] From: BSM@exchange.ifost.org.au "Exchange 2010 Databases Backup" Time: 26/07/2014 9:17:02 PM
Unknown internal error.
What's going on is that in DP 8.x and onwards, HP has added support for Exchange 2013, and so the "Exchange 2010" backups have been renamed to "Exchange 2010+". But the upgrade script doesn't reliably (ever?) update the barlist.
Simply open up the barlist file in a text editor. Look for where it says "2010" and replace it by "2010+".
Confirmed to affect DP 8.1 and 9.0, for upgrades from 6.2 and 7.x.
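If you have many barlists to fix, the edit can be scripted. The following is a minimal Python sketch, not a tested tool: it assumes the string to change appears in the barlist as "Exchange 2010" (as it does in the error message above), and the function name and the sample line in the usage example are hypothetical. Check one barlist by eye first, and keep a copy of each file before overwriting it.

```python
import re

# Match "Exchange 2010" only when it is not already followed by "+",
# so that running the fix twice never produces "2010++".
_OLD_NAME = re.compile(r"Exchange 2010(?!\+)")


def fix_barlist_text(text: str) -> str:
    """Return the barlist text with "Exchange 2010" renamed to "Exchange 2010+"."""
    return _OLD_NAME.sub("Exchange 2010+", text)


if __name__ == "__main__":
    # Invented sample line, only to show the effect of the substitution.
    sample = 'CLIENT "MS Exchange 2010 Server" exchange.ifost.org.au'
    print(fix_barlist_text(sample))
```

The negative lookahead is the design choice that matters here: it makes the fix safe to re-run against barlists that have already been corrected.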
Thursday, 19 June 2014
VMware, Data Protector and virtual machines which won't consolidate
While working on a customer's systems recently, I found a large number of virtual machines showing the following error message:
Configuration Issues
Virtual machine disks consolidation is needed.
But if I tried to right-click in vCenter and select Snapshots -> Consolidate, what I got was "unable to access file <unspecified filename> since it is locked".
This was also causing error messages in the backup log, because HP Data Protector attempts to consolidate disks at the start of a full backup.
The VMware KB articles suggested various things to identify the lock. I ssh'ed in and ran
    tail -f vmware.log | grep lock
to identify what the lock could be. As it turned out, it wasn't quite a lock. The file that couldn't be opened was a .vmdk file - no surprises there. So I ran
    lsof | grep the-vmdk-file
This showed that two different processes had it open.
    ps | grep process-id-from-the-previous-step
showed that the two processes were both /bin/vmx, but it was possible to distinguish them by their child vmx-vthread processes.
One of them was the process running the virtual machine (no surprises there), and the other was a process belonging to the hostname of the computer that runs their HP DataProtector VEPA agent.
This customer has a virtual machine inside their VMware environment which runs their VMware backups. They don't have to worry about correctly presenting LUNs or having an extra device attached to their SAN fabric. They do source-side deduplicated backups from this virtual machine, so it doesn't generate as much network traffic as it otherwise would.
What had happened was that some backup had failed spectacularly leaving the snapshots mounted on the VEPA agent virtual machine. Looking at the settings for the agent virtual machine it proudly said that it had 13 virtual disks - when it should only have had one, its boot disk.
Naturally, VMware couldn't consolidate the snapshots because as far as it was concerned, those snapshots were still in use. VMware also couldn't delete the virtual disks off the agent machine either, because there were snapshots depending on them.
So the solution was:
- Remove the snapshots on the agent machine.
- Remove the extraneous disks from the agent machine.
- Run the snapshot consolidation from the vCenter GUI.
Tuesday, 10 June 2014
When VMware NBD and NBDSSL backups fail
I was working on some VMware backups when I ran into this strange sequence of messages: a backup showing that it is quite possible to back up a VMX file, but not the VMDK files. And the error messages in the session aren't very informative!
[Normal] From: BSM@cell-manager.ifost.org.au "NBD test backup" Time: 19/05/2014 10:30:16 AM
Backup session 2014/05/19-5 started.
[Normal] From: BSM@cell-manager.ifost.org.au "NBD test backup" Time: 19/05/2014 10:30:16 AM
OB2BAR application on "vepa-agent.ifost.org.au" successfully started.
[Normal] From: VEPALIB_VMWARE@vepa-agent.ifost.org.au "/DC" Time: 19/05/2014 10:30:17 AM
Resolving objects for backup on vCenter 'vcenter.ifost.org.au' ...
[Normal] From: VEPALIB_VMWARE@vepa-agent.ifost.org.au "/DC" Time: 19/05/2014 10:30:33 AM
Add Virtual Machine to the backup ...
Name: VM1
Path: /DC/Discovered virtual machine/VM1
InstanceUUID: 52dbf234-252e-c5dd-9df5-51c304bcf312
[Normal] From: VEPALIB_VMWARE@vepa-agent.ifost.org.au "/DC" Time: 19/05/2014 10:30:35 AM
Virtual Machine 'VM1': Locking vMotion ...
[Warning] From: VEPALIB_VMWARE@vepa-agent.ifost.org.au "/DC" Time: 19/05/2014 10:30:35 AM
Virtual Machine 'VM1': vMotion is in Progress.
[Warning] From: VEPALIB_VMWARE@vepa-agent.ifost.org.au "/DC" Time: 19/05/2014 10:31:08 AM
Virtual Machine 'VM1': Could not lock vMotion.
Everything's pretty much fine. There are lots of reasons for a vMotion lock to fail.
[Normal] From: VEPALIB_VMWARE@vepa-agent.ifost.org.au "/DC" Time: 19/05/2014 10:31:13 AM
Creating folder /var/opt/omni/tmp/55fd5af8-f853-403e-bedf-2d1e60e418dd ...
[Normal] From: VEPALIB_VMWARE@vepa-agent.ifost.org.au "/DC" Time: 19/05/2014 10:31:19 AM
Virtual Machine 'VM1': Backing up configuration file VM1.vmx ...
[Normal] From: VEPALIB_VMWARE@vepa-agent.ifost.org.au "/DC" Time: 19/05/2014 10:31:20 AM
Virtual Machine 'VM1': Creating snapshot ...
[Normal] From: VEPALIB_VMWARE@vepa-agent.ifost.org.au "/DC" Time: 19/05/2014 10:31:59 AM
Virtual Machine 'VM1': Optimizing disk scsi0:0 ...
[Normal] From: VEPALIB_VMWARE@vepa-agent.ifost.org.au "/DC" Time: 19/05/2014 10:32:00 AM
Virtual Machine 'VM1': Backing up VSS manifest VM1/VM1-vss_manifests11.zip.
And now for the interesting part:
[Major] From: VEPALIB_VMWARE@vepa-agent.ifost.org.au "/DC" Time: 19/05/2014 10:33:15 AM
Virtual Machine 'VM1': Could not backup disk scsi0:0 ...
[Major] From: VEPALIB_VMWARE@vepa-agent.ifost.org.au "/DC" Time: 19/05/2014 10:33:15 AM
[172:162] Virtual Machine 'VM1': No disk backed up ...
[Critical] From: VEPALIB_VMWARE@vepa-agent.ifost.org.au "/DC" Time: 19/05/2014 10:33:15 AM
Backup of object failed.
Name: VM1
Path: /DC/Test_VMs/VM1
InstanceUUID: 52dbf234-252e-c5dd-9df5-51c304bcf312
[Normal] From: VEPALIB_VMWARE@vepa-agent.ifost.org.au "/DC" Time: 19/05/2014 10:33:16 AM
Virtual Machine 'VM1': Removing snapshot ...
[Normal] From: VEPALIB_VMWARE@vepa-agent.ifost.org.au "/DC" Time: 19/05/2014 10:33:24 AM
Virtual Machine 'VM1': Unlocking vMotion ...
Deleted directory /var/opt/omni/tmp/564d01b7-7910-97a0-d54d-85c11ff8becd-vm-58/nbd
Deleted directory /var/opt/omni/tmp/564d01b7-7910-97a0-d54d-85c11ff8becd-vm-58/nbdssl
Deleted directory /var/opt/omni/tmp/564d01b7-7910-97a0-d54d-85c11ff8becd-vm-58/hotadd
[Normal] From: BSM@cell-manager.ifost.org.au "NBD test backup" Time: 19/05/2014 10:33:53 AM
OB2BAR application on "vepa-agent.ifost.org.au" disconnected.
I've truncated the rest of the messages.
Turning up the debugging level, the debug logs showed this:
[110] [VddkUtil::diskLibLog] NBD_ClientOpen: attempting to create connection to vpxa-nfcssl://[ESXi-MGMT-VMFS-1] VM1/VM1.vmdk@esxi1.ifost.org.au:902
[110] [VddkUtil::diskLibLog] Started up WSA
[110] [VddkUtil::diskLibLog] CnxOpenTCPSocket: Cannot connect to server esxi1.ifost.org.au:902: A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond
[110] [VddkUtil::diskLibLog] CnxAuthdConnect: Returning false because CnxAuthdConnectTCP failed
[110] [VddkUtil::diskLibLog] CnxConnectAuthd: Returning false because CnxAuthdConnect failed
[110] [VddkUtil::diskLibLog] Cnx_Connect: Returning false because CnxConnectAuthd failed
[110] [VddkUtil::diskLibLog] Cnx_Connect: Error message: Failed to connect to server esxi1.ifost.org.au:902
[ 20] [VddkUtil::diskLibWarning] [NFC ERROR] NfcNewAuthdConnectionEx: Failed to connect to peer. Error: Failed to connect to server esxi1.ifost.org.au:902
[110] [VddkUtil::diskLibLog] NBD_ClientOpen: Couldn't connect to esxi1.ifost.org.au:902 Failed to connect to server esxi1.ifost.org.au:902
The clue is the failed connection to esxi1.ifost.org.au. The VEPA backup agent obviously has to connect to the vCenter server in order to start a backup, but because there was no SAN connectivity between the VEPA agent and the LUNs supporting the VM1 virtual machine's VMDK files, the VEPA agent ends up having to talk to the ESX server directly as well.
There can be many reasons for this connection to fail: a firewall could be blocking the connection between the VEPA agent and the ESX server. Or, as in this case, there was no DNS entry for esxi1.ifost.org.au.
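A quick way to tell those two causes apart from the VEPA agent is to test name resolution and the TCP connection separately. Here is a minimal Python sketch using only the standard socket module. The function name check_endpoint and its return values are hypothetical, and port 902 is simply the port that appears in the debug log; this does not speak the NBD protocol, it only checks that the host resolves and accepts a connection.

```python
import socket


def check_endpoint(host: str, port: int = 902, timeout: float = 5.0) -> str:
    """Return "dns", "tcp" or "ok" to show how far a connection attempt gets."""
    try:
        # Step 1: does the name resolve at all?
        socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    except socket.gaierror:
        return "dns"  # no usable DNS (or hosts file) entry
    try:
        # Step 2: can we open a TCP connection to the port?
        with socket.create_connection((host, port), timeout=timeout):
            return "ok"
    except OSError:
        return "tcp"  # resolves, but blocked, refused or timed out


if __name__ == "__main__":
    print(check_endpoint("esxi1.ifost.org.au"))
```

A result of "dns" points at a missing DNS entry, while "tcp" suggests a firewall or an unreachable host.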
Tuesday, 13 May 2014
Backing up a single server
Last week I was talking to a reseller (which is not surprising, almost all my clients are resellers or channel partners of some kind) who was asking about cost-effective options he could resell to back up a single stand-alone server for one of his clients.
Obviously, there are the built-in programs and numerous free programs, but here is my quick grab-bag of reseller-friendly options:
- rsync.net. They have outstanding support, and only use open tools. They support all versions of Unix and Linux. You don't end up locked into anything complicated. They are the highest-priced, but offer the best reseller discounts particularly at high volumes.
- LiveVault. This is HP's monthly-fee backup to the cloud. While the pricing looks high, it's a price for seven years of storage. And again, like rsync.net the discount is based on the total across all your customers, so you can either give unbeatable discounts, or get extra margin.
- DataProtector Single Server edition. This is the same as the enterprise version of HP Data Protector, but licensed only for a single server to write to tape. This is not the same as DataProtector Express, which was a tiny free product that used to come with HP tape drives. (Single Server edition is product number B7030BA which you can buy at http://store.data-protector.net/).
Quite often though, customers wanting a higher level of assurance around their long-term backups might well be advised also to investigate:
- Using Google Apps + Spanning backup as this removes a huge number of localised physical threats.
- Storing everything (including business documents) into subversion. While software developers prefer git, auditors prefer subversion and non-technical people find it easier. This automatically meets a lot of the ISO9000 documentation requirements, allows simple retrieval back to older versions, and can be hosted (e.g. by rsync.net or by cloudforge.com).
Thursday, 8 May 2014
DataProtector undocumented features -- running a script against the database
In my (in)famous write-up of how to un-break the DataProtector IDB after an upgrade (http://www.ifost.org.au/dataprotector/documents/dp8db-hack.pdf if you haven't seen it already), I gave a convoluted procedure for getting SQL-level access to the database.
I've since discovered that the password is simply Base64 encoded in the $OMNICONF/config/server/ID/idb.config file.
But today I discovered an undocumented omnidbutil feature:
    omnidbutil -run_script sqlcommands.sql -detail
Just put the commands that you want to run in a text file. If you are on Windows, make sure the SQL file is saved in ASCII format rather than Unicode.
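If you generate the SQL file from a script, it is easy to guarantee the ASCII requirement. The following Python sketch is only an illustration: the helper names are hypothetical, "SELECT 1;" is a placeholder rather than a real IDB query, and the only omnidbutil options used are the two shown above.

```python
from pathlib import Path


def write_sql_script(path: str, statements: list[str]) -> Path:
    """Write the statements to path as plain ASCII, one per line."""
    script = Path(path)
    # encoding="ascii" raises UnicodeEncodeError on any non-ASCII character
    # instead of quietly producing a Unicode file.
    script.write_text("\n".join(statements) + "\n", encoding="ascii")
    return script


def omnidbutil_command(script: Path) -> list[str]:
    """Build the command line shown in the post for the given script file."""
    return ["omnidbutil", "-run_script", str(script), "-detail"]


if __name__ == "__main__":
    script = write_sql_script("sqlcommands.sql", ["SELECT 1;"])  # placeholder SQL
    print(" ".join(omnidbutil_command(script)))
    # On the cell manager you could then run it with, for example:
    #   subprocess.run(omnidbutil_command(script), check=True)
```

Failing loudly on non-ASCII input is deliberate: it is better to find out when writing the file than when omnidbutil rejects it.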
Much easier and I'm going to put that into the next edition of my Data Protector book.
Monday, 5 May 2014
DataProtector + LiveVault
HP have three backup products: DataProtector, LiveVault and Connected. LiveVault and DataProtector overlap somewhat because they are both about backing up servers.
LiveVault is designed to be a cloud-enabled backup, with an on-site in-house cache as an extra. It has no option for backing up to tape.
HP DataProtector is designed for in-house backup to tape or to a deduplication store. You can set up a deduplication store on any cloud-hosted server and do very low bandwidth replication to it.
LiveVault can back up Windows or Linux, and it has integrations with SQL, Exchange, VMware and Hyper-V. HP DataProtector has all of these too, plus other integrations.
DataProtector can control LiveVault jobs too. But it's not obvious when it makes sense to use the LiveVault integration to do a backup instead of DataProtector.
Looking at the Australian pricing, here are the scenarios where a DataProtector customer will do better with LiveVault than with DataProtector.
- You have some small branch offices or cloud-hosted servers with less than 25GB to back up and you don't have an Advanced Backup to Disk license for DataProtector. The smallest Advanced Backup to Disk license is for 1TB and even though the per-GB costs are lower they aren't 40 times cheaper.
- You have some cloud-hosted servers in the HP Public Cloud or Amazon AWS and you don't want to use the network-to-network VPN options that Amazon and HPPC offer. You might even have servers in a DMZ where you can't open up ports 9387 and 9388 to do a StoreOnce backup. It makes sense to use LiveVault because you will have better bandwidth from the LiveVault servers to your cloud servers than you would have from an in-house data centre.
- You have a large number of small SQL server databases spread out over lots of computers. With DataProtector you would be paying for an integration license for each SQL server; with LiveVault you only pay for the volume of data.
- You have a large number of small MS-Exchange, VMware or Hyper-V servers. It's the same situation as for SQL server, but I don't think I've ever seen any site where any of these are small enough.
- A pair of small TurboRestore appliances spread across two data-centres can sometimes work out more cost-effective than StoreOnce software stores, but it only seems to work out for a few particular sizes. As far as I can tell it only works out for almost precisely 8TB or 12TB.
What else are you using the LiveVault integration for?