In the beauty pageant of sexy IT topics, data protection doesn't usually fare well in garnering attention.
Not many IT types get excited about providing good data protection, and all that it entails. Since most business users think data protection is something that happens automagically, there's rarely enough budget and focus to fully address the challenges at hand.
That is, until they lose their data and can't get it back.
That's unfortunate, because this world is moving fast. On the "things that need to be backed up" side, we have ever-more applications, information types and information quantities.
It seems there's never enough performance: no one wants to wait for a backup, let alone an urgent recovery. Soaring data volumes meet compressed budgets, and compromises are inevitably made.
The core technologies are changing: witness the emergence of PBBAs (purpose-built backup appliances) as well as various forms of snaps and replication. And the delivery model is changing like everything else in IT: data protection wants to be delivered as an easy-to-consume service (DPaaS) with full visibility and transparency.
Today, EMC's BRS (Backup and Recovery Systems) announced a slew of product enhancements -- quite an inventory. While the products are quite interesting, the thinking behind them is perhaps more compelling.
The Accidental Architecture?
Ask any good-sized enterprise what products they're using to back up and protect their environment, and you'll sometimes get the joking answer: all of them.
On a more serious note, it's rare when you find a standardized enterprise-class backup architecture with data protection services consistently exposed and consumed.
The reality is more usually a collection of different approaches that made sense at the time: either built by the data protection team, or more likely by well-meaning people who took matters into their own hands.
The technology certainly exists to change this state of affairs. The business justification to do so is almost always there. But, for some reason, this situation stubbornly persists -- perhaps caused by inertia with a healthy dose of unmanaged entropy.
That being said, people do appear to be investing in newer solutions.
Against a rather tepid storage landscape, IDC recently pointed out that the PBBA category is not only quite large, but apparently growing more quickly (16.5%) than any other portion of the storage market. EMC's DataDomain rules the roost in this category, with some 60% of the $679 million sold in Q1 of 2013.
No, I don't know what the tape market looks like, but I bet it's going in the opposite direction.
Towards An Architectural Layering
A part of EMC's announcement included this chart, which does a decent job of creating the layering that's needed here.
At the top is what I would refer to as the "service catalog layer," which is labeled data management services here. It answers questions like: what protection services are available, how are they invoked and made easy to consume, how are they monitored, and how are costs allocated?
At the left, integration with various data sources. The taxonomy here includes four major types: virtual servers, physical servers, primary storage and application-specific.
And at the right, "protection storage" -- a shared, consolidated (and deduplicated!) pool where all protection-oriented data can be landed.
In the EMC portfolio, the service orchestration is provided by Data Protection Advisor -- an extremely popular (and vendor-agnostic) layer over just about any set of protection technologies. The various data sources are integrated by a combination of Avamar, NetWorker or application-specific tools such as Oracle's RMAN.
And the protection storage is DataDomain -- the focal point of this particular announcement.
From Data Protection To Archiving
Just as disk is quickly supplanting tape as the preferred medium for data recovery, it has also long been the preferred medium for extended-retention archiving, especially when you think you might want to use something in a timely fashion.
One of the bigger thoughts that EMC's BRS division has been working towards is creating a single resource platform for both needs: the ability to scale shared backup services for everything in the data center, as well as offering the attractive economics of a long-term retention platform.
While it's true that many long-term backup images are essentially de-facto archives, there's more to the story than that.
Archiving is most useful when integrated as part of an application: a content management system, a database, a file share, email environment, collaboration portal, and so on.
Users should still be able to see and interact consistently with their data -- regardless of age.
Part of what's driving the need for ever-larger DataDomain capacities is this new use case. Popular archiving apps see the DataDomain as a CIFS share; the DataDomain simply provides very cost-effective long-term archival storage.
And Now, The Products
The entire midrange of the DataDomain product line has now been substantially refreshed to be even more competitive.
Thanks to Intel's Sandy Bridge, all of the updated DD boxen are seriously faster and bigger, handle more streams, and are more cost-effective than before. The magic is DD's multi-core software implementation -- one of the reasons EMC was so attracted to the company -- so every time Intel delivers a significantly better processor, the products get significantly better speeds-and-feeds as a result.
Here's the new family portrait -- along with a few hints on decoding what you're seeing here.
The backup speeds vary significantly depending on whether you're using DD Boost -- a technique that moves some of the deduplication grunt-work out to the backup servers and uses their cycles. Usable capacity is best thought of as physical capacity -- the amount of space you have to actually store stuff. More useful are the logical capacity numbers, which are what you can expect to see with normal rates of deduplication.
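DD Boost itself is proprietary, but the general idea behind source-side deduplication -- fingerprint the chunks at the client and ship only the ones the target hasn't seen -- can be sketched in a few lines of Python. Everything here (names, fixed-size chunking, an in-memory index) is illustrative, not EMC's actual implementation:

```python
import hashlib

CHUNK_SIZE = 8 * 1024  # fixed-size chunks for simplicity; real dedup systems use variable-length chunking

def backup(data: bytes, server_index: set) -> tuple:
    """Fingerprint each chunk and 'send' only those the target hasn't seen.

    Returns (chunks_total, chunks_sent), showing how much traffic
    source-side deduplication avoids on repeat backups.
    """
    total = sent = 0
    for i in range(0, len(data), CHUNK_SIZE):
        fp = hashlib.sha256(data[i:i + CHUNK_SIZE]).hexdigest()
        total += 1
        if fp not in server_index:
            server_index.add(fp)  # only new, unique chunks cross the wire
            sent += 1
    return total, sent

# First "full" backup: eight distinct chunks, all of them new.
index = set()
full = b"".join(i.to_bytes(4, "big") + bytes(CHUNK_SIZE - 4) for i in range(8))
print(backup(full, index))   # (8, 8)

# Re-running the same backup: nothing needs to be sent.
print(backup(full, index))   # (8, 0)
```

The same arithmetic explains the logical-vs-physical capacity distinction above: if your data dedupes at, say, 10:1, a box with 100 TB of usable (physical) capacity holds roughly 1 PB of logical backup data.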
Interesting Bits And Pieces
There's quite a lot in the announcement materials, and I'm just going to go with the nuggets I found interesting.
One thing that jumped out was Isilon support for the Avamar NDMP Accelerator -- basically, an appliance running the Avamar client and mounting the NFS export.
While this is interesting in its own right, more and more Isilon is starting to show up for Hadoop HDFS use cases.
Although not featured as a solution in this launch, I believe this might mark the first commercially-available backup approach for HDFS. And, yes, sooner or later people are going to want to back up that data as well.
While we're talking about Avamar, the level of integration between the Avamar client and the back-end DataDomain system has been improved in a number of areas.
One very intriguing feature involves VMware integration: not only can VMware admins do self-service recovery from the vSphere Web Client, they can (if I understand this correctly) actually boot off the image resident on the DataDomain device, do a bit of investigation, and then use something like Storage vMotion to bring it back into the production environment.
There's a whole slew of NetWorker enhancements -- way too many to list here.
Finally, the popular Mozy cloud backup service also gets some love with this announcement.
Several years ago, Mozy's focus shifted from a consumer backup service to something more befitting an enterprise IT user: an enterprise-class backup service for smaller businesses, or in some cases laptop/desktop backup for larger enterprises.
The enhancements here are (1) Active Directory integration for organizing groups, (2) the ability to manage a large, shared pool of backup capacity vs. individual accounts, and (3) a simplified provisioning mechanism.
I can't tell you how many times I've heard a customer tell me "we have a tough time convincing the business to invest in a better backup approach". I'm not that sympathetic. We're not talking about a cool mobile app, or advanced data analytics here -- we're talking about basic IT hygiene.
Data protection is one of those disciplines that's completely owned by IT: it's not really anyone else's problem to solve. Many people make a compelling case for investment on reduced opex alone -- without even delving into the risk-reduction side of things.
And the problem isn't going to go away by itself ...