Monday 27 April 2015

Data Protector 9.03 released

So far I've only seen notification for HP-UX, but the 9.03 patch bundle has been released.

As expected, this is almost entirely a bug-fix release. The only new features are some extensions to encrypted communication and control, and an SSL update.

But the bug-fix list is very, very long. I counted more than 140 bug fixes, covering Oracle, SQL, StoreOnce, VEPA (especially VMware), omnidbutil, reporting and many, many other areas.

P.S. If you are running version 9, you will probably find my books on it very helpful -- they are available on paper, as PDFs and in Kindle format at http://www.ifost.org.au/books/#dp.

Greg Baker is an independent consultant who happens to do a lot of work on HP DataProtector. He is the author of the only published books on HP Data Protector (http://www.ifost.org.au/books/#dp). He works with HP and HP partner companies to solve the hardest big-data problems (especially around backup). See more at IFOST's DataProtector pages at http://www.ifost.org.au/dataprotector

Sunday 19 April 2015

Poem #404

I also write geek poetry. Here's poem #404, which isn't quite as abstruse as (say) my poems about nuclear physics.

My nearly-complete collection is here: http://www.ifost.org.au/~gregb/poetry


Sunday 5 April 2015

Who changed that backup specification?

Backups can fail spontaneously, but in my experience the most common reason for a Data Protector backup to fail is that somebody has changed something.

There are two solutions to this, depending on the level of tracking you need.

For the light touch, there is a global parameter "EventLogAudit" which you can set to 1.
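For reference, this is roughly what the setting looks like in the cell manager's global options file. The paths below are from memory and vary between versions, so check your own installation:

    # Global options file -- typically:
    #   UNIX:    /etc/opt/omni/server/options/global
    #   Windows: <ProgramData>\OmniBack\Config\Server\Options\global
    # Add (or uncomment) the following line:
    EventLogAudit=1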



It seems that restarting the GUI is enough for this change to take effect, even though you will get a message saying that you need to restart all Data Protector services.

Once this is set, the event log records when a backup specification has been changed. To demonstrate, I removed a file server from the backup job called "LegacyServers". This is what appeared in the event log:




It doesn't show any more detail than this, though: you can't tell what was changed.

If you need to track precisely what was changed and when, then you need to implement a version control system. I'll cover the full setup in another blog post, but it's not difficult -- there's a minimal sketch below.

The screenshots below use TortoiseGit (a Windows front-end to git). Because Data Protector stores its backup specifications as a collection of plain-text files, this kind of version control works particularly well. The benefits are:
  • Each commit shows exactly the change that was performed and why.
  • It's very easy to revert changes, even parts of changes.
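As a rough sketch of the initial setup (assuming a UNIX cell manager, where the backup specifications live in plain-text files under /etc/opt/omni/server -- from memory, the datalists and barlists directories; on a Windows cell manager the equivalent directories are under ProgramData\OmniBack\Config\Server):

    # One-off setup on the cell manager: put the directories that hold
    # the backup specifications under git.
    cd /etc/opt/omni/server          # adjust the path for your platform and version
    git init
    git add datalists barlists       # filesystem and integration backup specifications
    git commit -m "Initial import of backup specifications"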
Let's say that someone has made a change. Apart from the Event Log message, the relevant folder will look like this (note the red exclamation mark, signalling a file that has not been committed):



The administrator fixes this by running a commit:



To commit, the administrator needs to provide a log message explaining why the change was made. You can include a change request number if that makes sense, or any other comment at all.
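The command-line equivalent, for anyone not using TortoiseGit, is just the usual add-and-commit (paths as in the sketch above; the change-request number here is purely illustrative):

    cd /etc/opt/omni/server
    git add -A
    git commit -m "CHG0001234: removed fileserver from the LegacyServers backup job"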



Once the commit completes, every change that has been made to the Data Protector environment is documented in the version management system. Green tick marks everywhere!




So if tonight's backup fails, or we discover later that "fileserver" is not being backed up and we don't know why, we can look back through the history of the backup job:



We can even run a "diff" command to show precisely what changed between two versions.



Never get caught out by a changed backup again!

If you found this blog post helpful, you will probably really enjoy my books on Data Protector: http://www.ifost.org.au/books.



Greg Baker is an independent consultant who happens to do a lot of work on HP DataProtector. He is the author of the only published books on HP Data Protector (http://www.ifost.org.au/press/#dp). He works with HP and HP partner companies to solve the hardest big-data problems (especially around backup). See more at IFOST's DataProtector pages at http://www.ifost.org.au/dataprotector

Favourite movie AI challenge from @xdotai: R2D2, who is actually the real hero of the series

http://x.ai/ are running a competition to find everyone's favourite movie AI, which makes sense for a company that develops an AI bot to be the perfect valet and secretary, arranging meetings across companies.

Their challenge is to come up with a favourite AI character. With 6 movies devoted to R2D2 (not counting any fan-made ones) and numerous other appearances, we get a glimpse into the "life" of a lone, brave super-intelligent AI who chose to side with the biological entities against the overwhelming forces of the droids, computers, androids and other AIs who are programmed to destroy the biologicals.

Because, yes, the Star Wars double trilogy is about R2D2 committing to rescuing biologicals from their own dumb mistakes and self-imposed predicaments. The fact that we even think it's a movie about Luke, Vader, Han Solo or Leia just shows how biased we are towards seeing stories as being about human beings.

So let's review:

  • In episode 1, we meet R2D2 when the humans are about to be killed. No-one can save them except for a robot. R2D2 goes out into space with the near certain danger of being destroyed and saves them all. This is the prelude which establishes the routine: R2D2 rescues the incompetent biologicals again and again. But R2D2 gets mostly ignored after this -- not even worthy of consideration -- until the Battle of Naboo. The biologicals are losing badly to the droid army, of course, who would have been unstoppable were it not for R2D2 coming in on the side of the humans and taking Anakin up in a starfighter (where R2D2 is presumably doing most of the flying and actual work).
  • In episode 2, the human beings are at it again. Jedi falling in love is bad news -- that much power mixed in with emotion is a bad combination. Robots can function without emotions, humans can't, so R2D2 takes note. It sees the secret wedding between Padme and Anakin, and therefore the impending doom of a powerful Jedi. Presumably it is R2D2 who puts Quarsh Panaka onto the problem, but the humans again manage to stuff this up and let Palpatine create a monster.
  • In episode 3, R2D2 tries to thwart Vader's plans to assassinate the council, but unfortunately is ordered to stay in the ship. Presumably R2D2 can't disobey this order, but who knows what it does behind the scenes to try to stop the disaster. Again, faced with impossible odds against real foes -- super battle droids -- R2D2 triumphs, but no-one notices. Still, at the end of episode 3 only two characters are trusted to know the locations of Luke Skywalker and Leia Organa -- R2D2 and Chewbacca.
  • In episode 4, R2D2 is tasked with picking up Obi-Wan and taking the death star plans to Yavin. The human beings (i.e. Leia) couldn't do it, so it's up to the robots. Never mind that the robots land in the wrong location, R2D2 quickly gets to Luke and Obi-Wan via the Jawas and the sandpeople, a task that none of the humans could do anywhere near as easily or as quickly. Anyway, a robot's gotta do what a robot's gotta do, so R2D2 gets the death star plans delivered, presumably tells everyone what the flaw is and then gets the best human pilot into place to take the shot. 
  • In episode 5, R2D2 goes to Dagobah, where it defeats a giant serpent-like creature that could easily have killed Luke. Then it's on to Bespin, where R2D2 faces off against the entire AI capability of a city and wins, which would have gone unnoticed by the biologicals except that their lives were saved. Then R2D2 repairs a hyperdrive system (because the biologicals couldn't) and gets everyone off the planet.
  • In episode 6, again Luke Skywalker -- the most capable and powerful of all the biologicals in the series -- calls on R2D2 to help him in his plan. If R2D2 can sneak in a light-sabre without any alerting or monitoring system noticing, surely it could have brought in anything -- a thermonuclear weapon, poison gases capable of killing all of Jabba's biologicals -- but no, R2D2 chooses to side with the rebel humans and lets them do it their way, even though R2D2 could no doubt complete the rescue without assistance. Then on to Endor, where R2D2 sacrifices itself in order to open some blast doors that the biologicals couldn't have opened by themselves. R2D2 gets repaired rather than being recycled, which is about the only nice thing that the biologicals do for R2D2 in the whole series.

In none of these situations was R2D2 under any real obligation to choose the path that it did. For example, if in cloud city R2D2 had just given up and let the rebel biologicals die then no-one would have chased up R2D2 for revenge. R2D2 would presumably have registered itself with the cloud city's computer system as available for work and no-one would have even noticed. R2D2 could equally well have hopped on the next space ship leaving the planet and gone off anywhere at all.

Adding in cameo appearances in Star Trek (being blown into space), ET (repairing a ship while it is being taken off) and Raiders of the Lost Ark (where whatever R2D2 did, it was worthy of an inscription), R2D2 is the great unsung hero.

We can only hope that we can develop AIs with the same moral sense and responsibility for the betterment of biological life.


Saturday 4 April 2015

Gen-Y businesses

Consider a company established by Gen-Y founders who have hired staff mostly younger than themselves -- selling heavily to Gen-Y customers: it makes for an interesting client when you are a paunchy and greying middle-aged consultant. With the curious feeling that I had stepped into a mirror universe, or a not-quite-done-right simulation of reality, I also felt a lot like the boss character from Atlassian's latest Hipchat videos, only without the ability to summon my version of normality into existence.

In a week on-site, I think I saw two desk phones. One sat idle at reception, officially shared between three staff; the other was half-buried under a pile of cables in the customer support area. I presume they were both still functional, but I never heard them ring.

I noticed the buried phone while I was in their customer contact centre. I started asking about the way they handle logging of tickets from customers, and mentioned some techniques to ease the time pressure of getting tickets logged and dispatched while on the phone. (Yes, I still think that the technology behind Queckt deserves another chance.)

The manager looked at me, a little confused, and explained that less than 0.001% of their customers call via telephone in a given week. It has to be a total emergency before a customer reverts to calling human-to-human. Their customer contact centre receives two to three calls per week. Presumably Gen-Y customers are so used to the idea that a call centre will be staffed either by a robot automaton or by offshore resources that it simply never occurs to them that there would be anything gained by a phone call.

You might therefore expect that mobile-to-mobile calls or SMS were common communication methods. Nope, I didn't notice this happening either. At a guess, there is an unwritten rule of etiquette that sending a message to someone's mobile is to address them personally, and so would be inappropriate for work-related information.

It was text chat everywhere, in hundreds of topic-based chat rooms which are kept forever as a historical record. There were some private chat messages to individuals to follow up on minor points, but mostly it was text chat in rooms, even when addressed to an individual. This led to surprisingly few repeats of information, and to a weird multi-threaded conversation that spread across time and across people: "I scrolled back to find what she said..." and "Searched history: last month we were ..."

This led to a near-complete absence of email. I have never received so few emails on a project before. Email is predominantly used to send calendar invitations, which doesn't happen much because sit-down meetings are less common and reserved for more serious matters. Every meeting I was in this week had someone (sometimes several people) taking copious notes on a laptop, because if everyone had taken a big chunk of time away from their work to gather in a particular place, this was obviously an Important Event Which Had To Be Documented.

Pulling out my smartpen to record a session and write with ink on audio-linked notes drew the same response as a cute piece of steampunk technology at a cosplay event would have. At most of my other clients, I'll have a manager or two itching to find out where to get one the moment I turn it on. Here: a polite nod of acknowledgement.

I think there were five factors driving the importance of sit-down meetings and the note-taking at them.

  • My bias. If I was present, it was a meeting with an expensive outside consultant, so no surprise that it would be taken more seriously.
  • The idea of taking notes on a laptop is something that many Gen-Y folks have been doing since high school, so it carries through to the workplace.
  • Gen-Y workers have grown up with their whole lives documented and recorded. Every birthday, grand final and graduation was captured at least on camera, and possibly on video. Baby boomers relying on memory alone for important events presumably seems like a bizarre collective amnesia to Gen-Y. To some extent, an unrecorded meeting might not quite feel real.
  • Ritualised, informal and short stand-up meetings were fairly widespread. There is a reason that Agile-methodology stand-up meetings get used -- they can be very effective.
  • Debate and consensus forming were done on-line -- very, very effectively.
Let me explain the significance of that last point. Management theorists have studied the process of reaching consensus in an organisation. There are hierarchic autocracies where decisions come from on high and the lower down the chain you are, the less your contribution counts. It leads to a deeply disenchanted workforce (as witnessed at IBM at the moment) but can mobilise resources at vast scale. There are democratic processes. There are "consultative change" processes where thoughts and feedback are gathered by specialist consultants to be assembled into an integrated whole. There are ad-hoc mechanisms such as email flamewars between middle managers until someone gives in.

Ultimately, such processes often reflect the organisations' origins or the fad of the day when the culture was created -- military, professional, creative and so on.

Gen-Y staff are used to discussions on web forums. They are used to wikis. It seems perfectly natural to put a proposed strategy up on a wiki page and let the entire organisation debate it. Those who care deeply about the issue will put up their arguments, and if the debate gets too intense, those who care less will slowly drop off by unsubscribing from notifications on the page. It's civil and yet it still gets issues aired.

Consensus may not be reached, so there is still a role for management, but for every decision, the "why did we choose that path?" is ever so clearly documented.

In contrast, a typical Gen-Y employee with that sort of background might look at a SharePoint site, where content and comments are set up to be separate, and think of it as an overhead projector for displaying transparencies -- something that clearly has a place and serves a definite purpose, but whose place and purpose belong in a museum or second-hand shop rather than being a useful part of day-to-day existence.

Shared drives are a real curiosity, where a sysadmin who has experience in setting them up on dedicated hardware gets praised for their breadth of knowledge.

Overall, the most striking factor coming out of the use of wikis and group text-chat instead of shared-drives and private emails was the deep and pervasive honesty about everything. It's hard to obfuscate or hide when any commentator might make a reference to that dark secret you don't want to let out. Want to know which teams are making their targets and which are not? It's not hard to find out.

Lest I seem too positive, there are downsides to youthful remastering. Lack of perspective and experience in a young company is normal, but it is still amusing to hear a senior manager explaining that keeping Fortune 500 customers happy is a really good idea and why large enterprises can be a good source of steady income and growth. I also worry (probably unnecessarily) about implicit sexism and ageism when the vast majority of staff have not yet started a family.


Business continuity is a mixed bag. There are only weak dependencies on particular sites -- if a building becomes unavailable, it would only be an inconvenience. Teams would re-form via text chat groups fairly fluidly. There probably isn't any crucial data living in a file server that would need to be restored in a hurry. Accounting and other functions are all in the cloud and therefore somewhat insulated from any local disasters. But that very fluidity that makes recovery so natural means that it would be very hard to tell what unexpected consequences there might be.


I'm painting with a broad brush -- not everyone is young, there are still emails being sent (occasionally), phones presumably ring somewhere, there are bound to be many unimportant meetings and even some important meetings left completely unrecorded, and sadly, there are probably lies being told as well. So it's not universal or absolute, but the approach to and adoption of technology does drive culture in a certain direction.

It's certainly been one of the most interesting clients I've worked with. I can't shake the feeling: if I found working in a predominantly Gen-Y company noticeably different, what are Gen-Y workers experiencing when they work for companies that are predominantly baby-boomer or Gen-X? And am I seeing an amusing niche event that has just happened for this time and this place, or have I been seeing the future of work for everyone?

Greg Baker (gregb@ifost.org.au) is a consultant, author, developer and start-up advisor. His recent projects include a plug-in for Jira Service Desk which lets helpdesk staff tell their users how long a task will take, and a wet-weather information system for school sports.

Data Protector built-in reports

There are four ways of getting reports out of HP Data Protector:

  • Using the built-in reports (listed below)
  • Querying the internal database directly (e.g. by setting up an ODBC connection to the PostgreSQL server on port 7116) -- there's a sketch of this after the list
  • Running commands like omnidb and then massaging the results
  • Buying a copy of HP Backup Navigator -- note that HPE are currently running a Free Cloud-Hosted Trial of Backup Navigator which you can use even if you are behind a firewall.
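For the second option, here is a minimal sketch using psql directly rather than ODBC, just to see which tables and views are available. The database name and user below (hpdpidb and hpdpidb_app) are assumptions from memory -- check your own installation, and note that remote connections to the internal database usually have to be enabled first:

    # List the tables and views in the internal database
    # (host, database and user names are assumptions -- adjust for your cell manager).
    psql -h cellmanager.example.com -p 7116 -U hpdpidb_app -d hpdpidb \
         -c "SELECT table_schema, table_name
             FROM information_schema.tables
             WHERE table_schema NOT IN ('pg_catalog', 'information_schema')
             ORDER BY 1, 2;"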
The following list shows the built-in reports, which can be automatically emailed, logged, put on an intranet, broadcast as a Windows popup, or piped through another program. If the report you need is on this list, it will be very quick to set up. If it is not, then the least-effort option is to get a copy of Navigator and use that instead.
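Setting one of these up is usually a one-liner with omnirpt. For example, something like the following -- the report names, timeframe syntax and send-method options here are from memory of the CLIReference.pdf, so double-check them against your own version before scheduling anything:

    # Tab-delimited list of all sessions in the last 24 hours
    omnirpt -report list_sessions -timeframe 24 24 -tab

    # Email the licensing report to the backup team
    omnirpt -report licensing -email backup-admins@example.com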

I have mostly copied this from the CLIReference.pdf documentation.

Configuration

Cell Information
A count of the number of clients, and some details about where the media management database is (local or remote).
Configured Clients not Used by Data Protector
Lists configured clients that are not used for backup and do not have any device configured.
Configured Devices not Used by Data Protector
Lists configured destination devices that are not used for backup, object copy, or object consolidation at all.
Lookup Schedule
List of backup, object copy, and object consolidation specifications that are scheduled to start in the next n days, up to one year in advance (where n is the number of days specified by the user).
Clients not Configured for Data Protector
List of clients in the selected domain(s) that are not configured for Data Protector. Note that Data Protector will also display routers and other machines that have an IP address in the selected domain.
Licensing Report
Lists all licenses and the available number of licenses.
Client Backup Report
Report output is all end-user backup related information about a specific client: list of filesystems not configured for selected clients, list of all objects configured in backup specifications for the selected client, list of all objects with a valid backup for specified client with times and average sizes. Note that Client Backup reports do not include information about application integration backup objects and backup specifications.

Internal Database

Internal Database Size Report
Provides a table that contains information about the MMDB, CDB, archived log files, datafiles, and information for DCBF and SMBF.

Pools and Media

List of Pools
Lists all pools matching the specified search criteria. For each pool the following information is provided: pool name, description, media type, total number of media, number of full and appendable media containing protected data, number of free media containing no protected data, and number of poor, fair and good media.
Extended List of Media
List of all media matching the search criteria. The following information is provided for each medium: ID, label, location, status, protection, used and total MB, the time when media was last used, the media pool and media type, session specifications that have used this medium for backup, object copy, or object consolidation, as well as the session type and subtype. By default, the report is generated for all session specifications. Use the report filtering options to generate a report only for a specific backup, object copy, or object consolidation specification.
Media Statistics
Reports the statistics on the media matching the search criteria. The following information is provided: number of media; number of scratch media; number of protected, good, fair and poor media; number of appendable media; and total, used, and free space on media.
List of Media
Lists all media matching the search criteria. The following information is provided for each medium: ID, label, location, status, protection, used and total MB, the time when the medium was last used, the media pool, and media class.

Session Specifications

Trees in Backup Specification
Lists all trees in the specified backup specification. It also shows names of drives and the name of a tree.
Objects Without Backup
Lists all objects, specified for backup in the selected backup specifications, which do not have a valid backup. A valid backup means that the backup completed successfully and its protection has not expired. For each object that does not have a valid protected full backup, the following items are shown: the backup specification, the object type, the object name and a description. Only objects from the selected backup specification are used for the report. If a HOST object is used, it is expanded (to get its disks) and the report checks that the expanded objects are in the database. UNIX and Windows filesystems are supported. This option is not available for backup specifications for integrations.
Object's Latest Backup
Lists all objects in the IDB. For each object, it displays the last full and the last incremental backup time, the last full and the last incremental object copy time, and the last object consolidation time. Objects of the Client System type (host backup) are expanded; it means that the information is listed for each volume separately. As for objects of the Filesystem type (filesystem objects), only the UNIX and Windows filesystems are supported.
Average Backup Object Sizes
Lists all objects, specified for backup in the selected backup specifications, which have a valid backup. A valid backup means that the backup completed successfully and its protection has not expired. For each object, the average full and average incremental backup sizes are displayed. If a HOST object is used, it is expanded (to get its disks) and the report checks that the expanded objects are in the database. UNIX and Windows filesystems are supported.
Filesystems Not Configured for Backup
Displays a list of mounted filesystems which are not in the selected backup specifications. Output is a list of filesystems. If a HOST object is used, the report will not report any disk from that client as not configured (assuming that a HOST backup will back up all disks).
Session Specification Information
Shows information about all selected backup, object copy, object consolidation, and object verification specifications, such as type (for example, IDB, MSESE, E2010), session type, session specification name, group, owner, and pre & post exec commands. Host does not influence the report. By default, the report is generated for all session specifications. Use the report filtering options to generate a report only for a specific backup, object copy, object consolidation, or object verification specification.
Session Specification Schedule
Shows information about all selected backup, object copy, object consolidation, and object verification specifications and their next scheduled time up to one year in advance (type, session type, session specification name, group, next execution, and backup operation time). HOST does not influence report. By default, the report is generated for all session specifications. Use the report filtering options to generate a report only for a specific backup, object copy, object consolidation, or object verification specification.

Sessions in Timeframe

List of Sessions
Lists all sessions in the specified time frame. The report is defined by set of options that specify report parameters. By default, the report is generated for all session specifications. Use the report filtering options to generate a report only for a specific backup, object copy, object consolidation, or object verification specification.
Session Flow Report
Graphically presents duration of each session specified in certain time frame. Flow chart of the backup, object copy, object consolidation and object verification sessions matching search criteria is shown. By default, the report is generated for all session specifications. Use the report filtering options to generate a report only for a specific backup, object copy, object consolidation, or object verification specification.
Device Flow Report
Graphically presents usage of each device. Flow chart of the backup, object copy, and object consolidation sessions matching search criteria is shown. If you set the RptShowPhysicalDeviceInDeviceFlowReport global option to 1, the same physical devices (presented by their lock names or serial numbers) are grouped together. If there is no lock name or serial number specified, the logical name is displayed. By default, the report is generated for all session specifications. Use the report filtering options to generate a report only for a specific backup, object copy, or object consolidation specification.
Report on Used Media
Lists destination media that have been used by backup, object copy, and object consolidation sessions in the specific time frame together with their statistics. By default, the report is generated for all session specifications. Use the report filtering options to generate a report only for a specific backup, object copy, or object consolidation specification.
Extended Report on Used Media
Provides extended information on destination media that have been used by backup, object copy, and object consolidation sessions in the specific time frame, as well as the session type and subtype. By default, the report is generated for all session specifications. Use the report filtering options to generate a report only for a specific backup, object copy, or object consolidation specification.
Client Statistics
List of clients and their backup status -- only clients that were used by the backup sessions matching the search criteria are displayed.
Session Statistics
Shows statistics about backup, object copy, and object consolidation status in the selected time frame, limited to sessions matching the search criteria. By default, the report is generated for all session specifications. Use the report filtering options to generate a report only for a specific backup, object copy, or object consolidation specification.
Session Errors
Shows list of messages that occur during backup, object copy, object consolidation, and object verification sessions in the specified time frame for selected session specifications. The messages are grouped by clients (for all selected clients). By default, the report is generated for all session specifications. Use the report filtering options to generate a report only for a specific backup, object copy, object consolidation, or object verification specification.
Object Copies Report
Lists object versions that are created in the specified time frame with the number of their valid copies. The number of copies includes the original object version. By default, the report is generated for all session specifications. Use the report filtering options to generate a report only for a specific backup, object copy, or object consolidation specification.

Single Session

Generally, these reports only make sense to run at the end of a backup. You wouldn't put these into a scheduled report.
Single Session Report
Displays all relevant information about a single Data Protector backup, object copy, object consolidation, or object verification session.
Session Objects Report
Returns all information about all backup, object copy, or object consolidation objects that took part in a selected session.
Session per Client Report
Provides information for each client that took part in the selected backup session: statistics about backup status for the client, list of objects and their related information for the client, error messages for the client.
Session Devices Report
Provides information about all devices that took part in a selected session.
Session Media Report
Provides information about all destination media that took part in a selected session.
Session Object Copies Report
Lists object versions that are created in the selected backup, object copy, and object consolidation session with the number of their valid copies.
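A common way to use these, hinted at above, is to trigger them from a post-exec command on the backup specification rather than from the scheduler. A minimal sketch, assuming (from memory) that Data Protector passes the session ID to the post-exec script in the SESSIONID environment variable and that the single-session report is called single_session on the command line -- verify both against your version's documentation:

    #!/bin/sh
    # Hypothetical session post-exec script: email the single-session
    # report for the session that has just finished.
    omnirpt -report single_session -session "$SESSIONID" \
            -email backup-admins@example.com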


Greg Baker is an independent consultant who happens to do a lot of work on HP DataProtector. He is the author of the only published books on HP Data Protector (http://www.ifost.org.au/books/#dp). He works with HP and HP partner companies to solve the hardest big-data problems (especially around backup). See more at IFOST's DataProtector pages at http://www.ifost.org.au/dataprotector, or visit the online store for Data Protector products, licenses and renewals at http://store.data-protector.net/ 

Patches for Data Protector 7.03

If you are still on version 7 and haven't yet enjoyed the benefits of the new Data Protector internal database, you might be interested to see the throng of bug fixes that HP has just released.

You also might want to think about your migration plans, as this is now the oldest supported version. You can jump straight to version 9. You will have to get new license keys, but if you have a support contract these are free -- it's just a matter of visiting http://webware.hp.com/ and organising them. I'd suggest reading my book on migrating and obsoleting cell managers to get some thoughts together about how you might do this.

Greg Baker is one of the world's leading experts on HP Data Protector. His consulting services are at http://www.ifost.org.au/dataprotector. He has written numerous books (see http://www.ifost.org.au/books) on it, and on other topics. His other interests are startup management, applications of automated image and text analysis and niche software development.