On an average week, I get to meet with several different customers. It's fun and it's rewarding.
Sometimes they've seen this blog, which is a strange feeling when they know more about you than you know about them ...
More often than not, the purpose of my visit is to paint a very broad picture. We'll talk about all sorts of technologies: virtualization, content infrastructure, model-based management, security -- sometimes we even talk about storage.
Whether it's 30 minutes or a day-long drill-down session, it's always fun.
Other times, there's a specific topic someone wants to drill into. Such was the case last week.
This particular picture was increasingly familiar: an IT organization getting clobbered by backup and recovery woes.
Ostensibly, we were there to pitch our capabilities. The IT guys were doing the round-robin of different vendors, and it was our turn.
But, in a search for root cause, we ended up in a very different place.
The Picture
Simply put, their backup picture was spiralling out of control.
File systems were growing at least 2x per year. The same files were ending up in email, and it too was having its challenges. And most of the business ran on a large SAP instance which was growing substantially.
When we got into the SLA discussion, they shared that the SLAs delivered to the business were based more on what their technology could do than on what people needed to run the business.
Tail wagging dog.
They'd gotten into the habit of using EMC's disk libraries to solve tactical service level problems. As you probably know, an EMC disk library looks just like a tape library, only significantly faster, and the implementation is pretty much a drop-in.
But that easy fix had masked other problems, and now it was time to dig in and understand them.
What They Wanted
Usually, when you start talking to a customer about their backup environment, it's a mixed picture.
On one hand, they'd like to save money and make operations easier.
On the other hand, they know they've got demanding users who won't tolerate poor SLAs on either backup or recovery.
And, of course, to muddy the waters, there's a bunch of new technologies coming into this space: using low-cost disks, single instancing, compression, newer forms of data deduplication, and so on.
Bringing archiving into the discussion can make things even more confusing, as archiving serves multiple purposes in most organizations, and usually ends up being much more than a cheap place to store information.
It would have been easy enough to start positioning different technologies, but I decided to start swimming upstream and looking for root causes.
The resulting discussion was worthwhile, so I thought I'd share it with you.
Do You Archive?
It sounds simple, but most backup environments are overwhelmed by stale and/or low-value data.
Common practice is to periodically do full backups in between a sequence of incrementals, even though only a small amount of information has changed.
Long, long backup windows impact users and frustrate IT operations. And bloated production images cast incredibly large shadows on the backup environment: anecdotally, there's anywhere from a 2.5x to 8x multiplier between the amount of information in production, and the total amount of information in the backup environment.
Yes, tape media is cost-effective, but handling it isn't. Waste enough of it, and even the media gets expensive.
Simple idea: archive before you back up.
Get the low-value stuff out.
And shrink your backup / recovery footprint by an order of magnitude.
It's not like archiving for file systems, or email, or SAP etc. is radically new. The basic technologies have been in the marketplace for years. There are literally thousands of people doing this stuff.
And, not surprisingly, archiving out the gloop makes the backup environment an order-of-magnitude easier to handle.
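To put rough numbers on that claim, here's a back-of-the-envelope sketch. Everything in it is an assumption for illustration: the function name, the 10 TB shop, the 90% stale fraction, the 5% daily change rate, and the retention schedule are all made up, and the model only counts periodic fulls plus incrementals.

```python
def backup_footprint_gb(active_gb, stale_gb, daily_change_rate,
                        fulls_retained, incrementals_per_full):
    """Estimate total GB held in the backup environment.

    Each retained cycle is one full backup (active + stale data) plus a
    run of incrementals. Only active data is assumed to change.
    """
    full_size = active_gb + stale_gb
    incremental_size = active_gb * daily_change_rate
    return fulls_retained * (full_size + incrementals_per_full * incremental_size)


# Hypothetical shop: 10 TB in production, 90% of it stale.
production_gb = 10_000
before = backup_footprint_gb(1_000, 9_000, 0.05,
                             fulls_retained=4, incrementals_per_full=6)
# Same shop after the stale 9 TB has been archived out of production.
after = backup_footprint_gb(1_000, 0, 0.05,
                            fulls_retained=4, incrementals_per_full=6)

print(f"before archiving: {before:,.0f} GB ({before / production_gb:.1f}x production)")
print(f"after archiving:  {after:,.0f} GB")
print(f"footprint shrinks by a factor of {before / after:.1f}")
```

With these invented inputs, the "before" multiplier lands inside the 2.5x to 8x range mentioned above, and the reduction comes out close to an order of magnitude. The sketch ignores whatever protection the archive tier itself needs.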
So, when I asked these guys that question, I got a strange response. Turns out that they had tried archiving with file systems a while back, but the project had stalled.
Wasn't really clear why, so I wanted to dig a little further.
Do You Tier?
Any information archiving strategy is predicated on your having done some basic tiering work.
You've established some basic classes of services (performance, availability, RTO/RPO, retention, etc.), and assigned user-visible costs to the service levels.
You've had the basic discussion with the application users as to what they think is warranted for their information, and then you go build out the technology (and the management) behind it.
And archiving software is what implements your policy against the tiers you've defined.
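A service catalog doesn't need to be fancy. As a minimal sketch, assuming a three-tier catalog with invented names, service levels and prices (none of these figures come from a real customer or product), the "pick the cheapest tier that meets the stated need" conversation looks something like this:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ServiceTier:
    name: str
    rto_hours: int            # recovery time objective
    rpo_hours: int            # recovery point objective
    retention_days: int
    cost_per_gb_month: float  # user-visible cost (made-up figure)


# Hypothetical catalog, most expensive tier first.
CATALOG = [
    ServiceTier("tier1-production", rto_hours=1, rpo_hours=1,
                retention_days=30, cost_per_gb_month=3.00),
    ServiceTier("tier2-nearline", rto_hours=8, rpo_hours=24,
                retention_days=365, cost_per_gb_month=1.00),
    ServiceTier("tier3-archive", rto_hours=48, rpo_hours=168,
                retention_days=2555, cost_per_gb_month=0.25),
]


def cheapest_tier_meeting(required_rto_hours, required_rpo_hours, catalog=CATALOG):
    """Return the lowest-cost tier whose RTO and RPO satisfy the request."""
    candidates = [
        tier for tier in catalog
        if tier.rto_hours <= required_rto_hours
        and tier.rpo_hours <= required_rpo_hours
    ]
    if not candidates:
        raise ValueError("no tier meets the requested service level")
    return min(candidates, key=lambda tier: tier.cost_per_gb_month)


# A user who can live with an 8-hour recovery and a day of data loss:
print(cheapest_tier_meeting(8, 24).name)
```

The point of the structure is the price column: once costs sit next to service levels, users can choose, and the archiving policy has something concrete to implement.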
Over the last four years, I have come to believe that this sort of tiering activity is the rosetta stone between business and IT when it comes to storage and information management issues.
Why? It forces the honest discussion around decisions and costs. Even if you don't explicitly charge back to the business, people should know what it costs to store/protect/manage their information, so they can make the right choices.
Every customer I've met that has done this work has used the results to change the game: driving costs down and efficiency up.
And every customer I've met that hasn't done this work continues to struggle to find workable solutions in an environment where all information is created equal.
And that's the problem. If you don't create some tiers and associated characteristics, you're usually forced to treat most of your data as tier 1 (i.e., very expensive).
That seemed to be the situation here. They hadn't had the basic tiering discussion with their application users. Most everything was top tier. And -- not surprisingly -- not only did that drive additional storage costs, but it was overwhelming their backup and recovery environment.
I think there are literally thousands of IT-owned service level charts out there in the world today that describe the different classes of information, and how they need to be stored/protected/managed.
Some are very complex. Some are very simple. But they all do the same thing -- they force the discussion of choices and associated costs.
So what was holding them back?
Do You Talk To Your Users?
I felt some reluctance when we got into this tiering discussion. Nobody said anything, but I came away with the impression that the IT guys weren't in the habit of sitting down and talking to the folks who were generating all that information.
Well, that's going to be a problem, isn't it?
Unless your strategy is to wait for a major disruption or crisis to force the conversation, the only way to get ahead of this is to get out of your office and go meet a few people.
Sometimes, I see IT organizations waiting for the Official Policy to be handed to them on stone tablets from the higher powers. I am very vocal that this is the new job of IT, and -- although there need to be different organizations at the table -- IT ought to take the lead in driving the activity that results in an implementable policy.
Look, this doesn't have to be some formalized initiative sponsored by the Office of the CEO. All it takes is finding a few business leaders who can speak informally for the major information generators and users.
And if you haven't done this yet, I'd suggest doing it sooner rather than later.
And Then We Talked Technology ...
There's a lot of stuff out there to talk about in the backup/recovery/archiving discussion. And I've gotten into the habit of using a simple framework to position what's there, and what's not.
The first framework is the essential three-way-split: files, email, production databases.
Each one should have a lightweight notion of tiering (performance, availability, recovery, retention, downstream uses). Yes, it'd be nice to treat it all as a single topic, but the business needs for each information type are different, and the candidate technologies certainly are different.
I'm a fan of breaking the problem into pieces. Pick a place to start, and get going.
Another decision that ought to be considered early on (but usually isn't) is whether your archive is just going to be a cheap place to store data, or whether or not you see people actually using it to support newer requirements.
All of these tiering/archiving approaches will need some sort of target technology. This is one place where I argue loudly that you ought to look at common requirements, otherwise you'll end up with a plethora of archiving storage stacks, and it'll be hard to manage and exploit what you've created.
As an example, if your rationale for archiving email is simply to reduce costs, any cheapo storage will do. But if there's a legal retention requirement, or you're struggling with FRCP requirements, you'll be interested in a few nifty features that support that. Ditto for files, and databases, and ... hence the need to have the tiering discussion cover downstream uses for the lower-tier information.
The File System Discussion
EMC's DiskXtender seems to be a great starter approach for basic file classification and archiving. The product has been out there for many years, has thousands of happy users, goes in pretty quickly, works with most everything you're likely to own, and so on.
No drama here, folks.
However, with the growing popularity of file virtualization (e.g. EMC Rainfinity), it's starting to look more attractive to put the policy and movement engine at the virtualization layer, rather than on every file server, or every back-end NAS device.
That being said, DiskXtender can only work with external metadata: name, creator, last access time, frequency of use, etc.
Some people look at it and say "is that all?". Now, ideally, you'd be able to crack open files and do a bit more sophisticated analysis, which is what Infoscape does.
But, if you haven't even started to implement file system archiving, my suggestion would be to do yourself a favor and start with something that's a bit more digestible at the outset, and can get you to a quick win.
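To show how far external metadata alone can take you, here's a minimal sketch that walks a directory tree and flags files nobody has touched in a while. This is not how DiskXtender works internally; the function name and the idle threshold are my own assumptions, and it relies on the file system actually recording access times (some are mounted so that they don't).

```python
import os
import time


def archive_candidates(root, min_idle_days, now=None):
    """Yield (path, size_bytes) for files not accessed in min_idle_days.

    Uses only external metadata (last access time and size); file
    contents are never opened.
    """
    now = time.time() if now is None else now
    cutoff = now - min_idle_days * 86400
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                info = os.stat(path)
            except OSError:
                continue  # file vanished or is unreadable; skip it
            if info.st_atime < cutoff:
                yield path, info.st_size


if __name__ == "__main__":
    # Report how much of the current directory has sat idle for 180+ days.
    idle_bytes = sum(size for _path, size in archive_candidates(".", 180))
    print(f"{idle_bytes / 1e9:.2f} GB idle for 180+ days")
```

Even a crude report like this is usually enough to start the conversation about what belongs on tier 1.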
Single-instancing makes sense for many file systems. Yes, shockingly, you'll see the exact same PowerPoint all over the place. And using something like Centera as the back-end for your archive gets you single-instancing in a pretty slick manner.
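If single-instancing is new to you, the idea fits in a few lines. What follows is a toy in-memory store keyed by a hash of each file's content, so identical files are kept once no matter how many names point at them. The class and method names are hypothetical, and this is an illustration of the concept rather than a description of Centera's internals.

```python
import hashlib


class SingleInstanceStore:
    """Toy content-addressed store: identical content is kept only once."""

    def __init__(self):
        self._blobs = {}  # content digest -> bytes
        self._names = {}  # logical file name -> content digest

    def put(self, name, data):
        digest = hashlib.sha256(data).hexdigest()
        self._blobs.setdefault(digest, data)  # store only if unseen
        self._names[name] = digest
        return digest

    def get(self, name):
        return self._blobs[self._names[name]]

    def stored_bytes(self):
        return sum(len(blob) for blob in self._blobs.values())


store = SingleInstanceStore()
deck = b"quarterly results deck" * 1000
# The same presentation saved under three different users' folders.
for owner in ("alice", "bob", "carol"):
    store.put(f"/home/{owner}/q3.ppt", deck)

print(store.stored_bytes() == len(deck))  # True: three names, one copy
```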
Although there's not a lot of call for managing retention for files the way we have to do with emails today, you can see it coming, so it might make sense to plan ahead.
Simply put, a simple DiskXtender implementation could take a major bite out of overall file system storage costs, as well as hack down the backup requirements by an order-of-magnitude or more.
Now, how to back up what's left of the file system?
Well, it's no surprise that I'm a huge Avamar fan these days.
Its ability to do global dedupe means that if it sees a chunk of data anywhere at anytime that it's seen before, it doesn't get backed up -- only a pointer to the original chunk is sent and stored.
Backups get done much, much faster. Network requirements are slashed to a fraction. The amount of data stored is a minuscule subset of the amount that would be stored with traditional incremental backups. The data is stored in native file system format, which means it can easily be mounted and used directly, which opens up all sorts of interesting possibilities.
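The "only send a pointer for chunks you've seen before" idea can be sketched in a few lines. To be clear about the assumptions: this toy uses fixed-size 4 KB chunks and an in-memory dictionary, both picked for simplicity, and the class name is made up. It illustrates the general technique and makes no claim about how Avamar chunks or stores data.

```python
import hashlib

CHUNK_SIZE = 4096  # arbitrary fixed chunk size, chosen for simplicity


class DedupeBackupTarget:
    """Toy global-dedupe target: each unique chunk is stored exactly once."""

    def __init__(self):
        self.chunks = {}     # chunk digest -> chunk bytes
        self.bytes_sent = 0  # bytes that actually had to be transferred

    def backup(self, data):
        """Back up data; return the list of chunk digests (the pointers)."""
        recipe = []
        for offset in range(0, len(data), CHUNK_SIZE):
            chunk = data[offset:offset + CHUNK_SIZE]
            digest = hashlib.sha256(chunk).hexdigest()
            if digest not in self.chunks:
                # First time this chunk has been seen anywhere: send it.
                self.chunks[digest] = chunk
                self.bytes_sent += len(chunk)
            recipe.append(digest)
        return recipe

    def restore(self, recipe):
        return b"".join(self.chunks[digest] for digest in recipe)


target = DedupeBackupTarget()
image = b"A" * 8192 + b"B" * 4096   # 12 KB, with one repeated chunk
first = target.backup(image)
sent_after_first = target.bytes_sent
second = target.backup(image)        # nothing new, so nothing is sent

print(sent_after_first, target.bytes_sent)
```

The second backup of unchanged data moves no new chunks at all, which is where the speed and network savings come from.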
Got remote sites? Excellent fit.
Building out a VMware ESX environment where everything is a file, and there's a ton of redundant data? Excellent fit.
But the basic recipe for file systems is the same:
1 -- create a services catalog and share with business users
2 -- implement file system archiving to get the gloop out, while keeping users happy
3 -- choose your archiving target with an eye to the future
4 -- back up the remainder with the technology that works for you
I think I got agreement from this customer that file system archiving might be a good place to start -- it was one of their most painful problems, and it wasn't going to solve itself anytime soon.
The Email Discussion
Email issues are a bit different than file system issues. First, everything is stored in a database, so different tools are required. Second, there's a pronounced corporate interest in saving certain emails for a defined length of time, and hopefully retrieving them quickly in the event they're needed. And finally, there's a whopping big opportunity for eliminating redundant data associated with attachments.
But the basic recipe doesn't change.
The services catalog discussion is a bit different, because the legal department needs a seat at the table for this one.
Probably a seat at the head of the table.
They've probably got very specific concerns as to what needs to be saved, and what shouldn't. And they've got a strong vested interest in making sure that it's managed according to policy, and when they need something, they can get it fast and accurately.
More often than not, they're willing to contribute to (i.e., fund) portions of an email management project.
The email archiving software is pretty straightforward. In EMC's portfolio, it's EmailXtender. Again, very mature stuff, thousands of users, works with everything you're likely to have, no surprises, etc.
As far as an archiving target, the case for using something like a Centera is even stronger. Why? It can support the compliance-related aspects of records management. Things like proving in a court of law that the email hasn't been changed since it was archived. Or perhaps automatic deletion at the end of the defined retention period. Or perhaps metadata search.
Oh yes, and you get single-instancing as well.
Anecdotally, I've talked to dozens of customers who were shocked (shocked!) at the amount of duplicate attachments in their email environment, and the resulting capacity savings by simply implementing off-the-shelf single instancing with Centera.
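Two of those compliance features, proving a message hasn't changed and expiring it when retention runs out, are easy to picture in code. This is a minimal sketch with hypothetical names and an invented one-year retention period. A real compliant archive also has to protect the stored digest itself from tampering, which this toy does not attempt.

```python
import hashlib
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)
class ArchivedMessage:
    content: bytes
    digest: str          # fingerprint taken at archive time
    archived_on: date
    retention_days: int


def archive(content, archived_on, retention_days):
    """Record the content together with its SHA-256 fingerprint."""
    digest = hashlib.sha256(content).hexdigest()
    return ArchivedMessage(content, digest, archived_on, retention_days)


def is_unchanged(message):
    """True if the content still matches the fingerprint taken at archive time."""
    return hashlib.sha256(message.content).hexdigest() == message.digest


def is_expired(message, today):
    """True once the retention period has fully elapsed."""
    return today >= message.archived_on + timedelta(days=message.retention_days)


msg = archive(b"Re: contract terms", date(2007, 1, 1), retention_days=365)
print(is_unchanged(msg), is_expired(msg, date(2007, 6, 1)))
```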
So how do you back up the remainder?
Well, if you've archived first, what's left is the very recent email that's "hot" within an organization. So high service levels matter (as you've defined in your service level catalog).
In the EMC portfolio, this translates to either (a) using Networker with disk as a target (think a disk library), or (b) using snaps to get an instantaneous clean copy, and moving to tape (again with Networker).
Either way, you're now dealing with an email environment that's a small fraction of what you used to have.
When I brought this up to this particular customer, it turned out they were planning a lengthy email migration, so they didn't really have an appetite to introduce yet another variable into the mix.
Made sense to me, as there were other places to get started.
The ERP Discussion
In this case, it was SAP, and they used it to run the core of their operational business processes. And, not surprisingly, the production instance kept getting bigger, and bigger, and bigger ...
Not only that, but -- as with most SAP landscapes -- there was a tangled cloud of SAP-related infrastructure: development, test, pre-production, training, decision support -- the list got quite long.
Let's start with the production environment.
SAP has some basic archiving capabilities available, but -- by itself -- it's not a complete solution. The SAP archive object looks like a binary blob -- it's not easily usable by either applications or users. Nor do the SAP archive tools do a good job of helping you identify likely candidates for archiving.
In EMC's portfolio, we offer Viewpoint, which fills these gaps. It has front-end tools for analyzing your environment, and -- more importantly -- it works with the SAP archiving blobs so that applications and users can actually get some value from them. It's an OEM from PBS, but for growing SAP environments, it seems to do the trick very nicely.
If you're a public company, some of this archived information will need to be records-managed, and maybe even managed in a compliant fashion. So, once again, Centera might make more sense than just garden-variety cheap storage.
Back to my customer example, they'd found the PBS product, and had started a project around it, so that was good.
So what about the test / dev / preproduction / training environment?
Two things going on here. First, there's usually a need to rapidly re-set the environment back to a known state. And disk-based replications (snaps, clones, etc.) seem to pay their freight in terms of speeding up projects by a significant factor. In the EMC portfolio, think TimeFinder, SnapView, Replication Manager, et al.
But there's also a need to keep an archive of past test beds, etc. Some organizations may even need to keep these things around for quite a while, due to SOX compliance requirements.
If all you want to do is back up older versions of your test bed (and don't have a compliance requirement), once again Avamar looks pretty attractive, mostly due to the fact that there's a *ton* of redundant data within these environments.
Yes, you'll need enough capacity in the Avamar appliance to make sure things don't get overwritten. But I would expect that the data reduction you'd see would tend towards the high end of the scale, certainly 50:1 or even more. And it doesn't take much data reduction before disk becomes cheaper than tape.
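The disk-versus-tape break-even is simple arithmetic. In this sketch the per-GB prices are placeholders I made up, not quotes, and the model assumes tape stores the data unreduced. Plug in your own numbers.

```python
def break_even_reduction(disk_cost_per_gb, tape_cost_per_gb):
    """Data reduction ratio at which deduplicated disk matches tape on cost."""
    return disk_cost_per_gb / tape_cost_per_gb


def effective_disk_cost(disk_cost_per_gb, reduction_ratio):
    """Cost of disk per GB of logical (pre-reduction) backup data."""
    return disk_cost_per_gb / reduction_ratio


# Placeholder prices: disk at $5.00/GB, tape at $0.50/GB.
DISK, TAPE = 5.00, 0.50

ratio_needed = break_even_reduction(DISK, TAPE)
cost_at_50_to_1 = effective_disk_cost(DISK, 50)

print(f"break-even at {ratio_needed:.0f}:1")
print(f"at 50:1, disk costs ${cost_at_50_to_1:.2f} per logical GB vs ${TAPE:.2f} for tape")
```

With a price gap of ten to one, anything better than 10:1 reduction tips the balance toward disk, and the test-bed case is expected to do far better than that.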
But Avamar doesn't support the long-term retention and compliance requirements that you might need for SOX, so there's probably going to be a need to send some subset of this backup to Centera for long-term management.
And what about decision support?
The whole notion of archiving doesn't really play well in this space, at least by my estimation. The purpose of a decision support environment is to reach back in time to support business analysis, and response times do matter. So archiving as a strategy to reduce storage costs (and backup footprint) isn't a strong fit, at least for the present.
Typically, our solution here is scale-out CLARiiON CX3s, usually with big (cost-effective) drives.
Backup and recovery requirements also seem to be all over the map. Some businesses use their BI/DW environment for near-real-time operational support, so high-performance solutions (e.g. disk as a target) are preferred. Other customers can tolerate a lengthy backup window (or restore) so they're sticking with things like Networker and tape.
Putting It All Together
I think it's worthwhile to point out some broader themes.
One way of casting the situation is that, very often, IT hasn't made the transition from technologist to informationist. They're sitting on the sidelines of a very important discussion around who owns information, who understands it, and who leads the business in defining strategies on how to manage it effectively.
They say that stuff rolls downhill in large organizations.
And I think that the crisis-sized backup and recovery issues that are becoming more frequent are merely a symptom of a much deeper set of issues.
There will be those that decide to swim upstream, and get to the root causes. It will be more work, but it will ultimately produce better results.
And there will be those that simply decide to throw bigger/faster/newer hardware at the problem.
As an example, you probably saw our recent announcement of a DMX-based disk library. Yes, there are people whose size and scale will justify such a device, even after they tier, archive, etc.
But my real concern is that we'll unfortunately find a healthy market of people who just want a bigger hammer.
