As you might know, I like to dig into parts of EMC's portfolio that I think are especially cool, and share with you what I really like about the different pieces.
Somewhere in all the activity last week, there was a major announcement around RecoverPoint, our next-gen replication platform.
Not only is RecoverPoint cool technology, but it's a cool story as well.
The Product That Never Should Have Been
Conventional wisdom on EMC (as well as other large tech companies) is that we protect our product franchises, are slow to move to new technologies, don't eat our young, are ripe targets for disruptors, and so on.
There are even a few bloggers out there who seem to make a living espousing this view -- incorrectly, in EMC's case.
If you're familiar with EMC's history, you probably know that one of our "franchise products" is SRDF, the de facto standard for high-end remote replication. SRDF was soon joined by TimeFinder for local replication (also a franchise product). Together, they established clear market leadership in these categories, and -- to this day -- do things that other products simply can't do.
Both products ran in the array (a Symmetrix in this case), and nowhere else. If you wanted the good replication stuff, you had to buy a Symmetrix to run it on. Although many people thought this was a marketing conspiracy, even a simple tour of the technology will show you that part of its effectiveness was its tight integration with the underlying array architecture, so it couldn't really run anywhere else.
That was its strength -- and its weakness.
A few years back, EMC acquired Kashya, a company that offered a very different approach to replication -- CDP, or continuous data protection. Rather than focus on exact, synchronous copies of data, or 'splits' made at infrequent points in time, CDP offers continuous capture of changes that can be rewound -- across applications -- to any point in time. It could do this locally, or remotely.
And, more radically, it could do this on any array, not just EMC's.
I think a lot of conventional wisdom was confounded when we acquired them, and said that we had great hopes for this technology. A few cynics offered that we bought them just to bury them.
Some people just don't understand us, do they?
Is CDP Getting A Bad Rap?
When we announced the original acquisition, CDP technologies were the darling of the storage industry. However, when we recently announced major enhancements to RecoverPoint (our CDP offering), there were multiple notes of cynicism offered by much of the press, not around our product per se, but around CDP adoption in the marketplace.
There's no helping some people through the disappointment of their inflated expectations, but -- from our perspective -- it's doing very nicely in the marketplace, and -- more importantly -- when you talk to customers who are running it, they tell me they really, really like it.
From my perspective, the rest will happen in time.
What We Announced
If you read the announcement, you probably saw a big deal being made about where the "splitter" was being run. Now, if you don't have a background in this stuff, you're probably wondering what all the fuss was about.
Simply put, the "splitter" is the small piece of logic that picks out which pieces of data might need a copy, makes the copy, and sends the different copies off for processing.
As an example, if I have CDP enabled for my Oracle database, the job of the "splitter" is to intercept every write, make a copy for the CDP log to potentially rewind at some point, and let the original pass through to its destination.
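For the code-minded, here's a rough sketch of that idea in Python. To be clear, this is a toy model and not how RecoverPoint is actually built: the `Splitter` and `Write` names, the in-memory dictionary standing in for the volume, and the list standing in for the CDP log are all assumptions made up for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Write:
    """One intercepted write: a sequence number, a block address, and its data."""
    seq: int
    block: int
    data: bytes


@dataclass
class Splitter:
    """Toy write splitter (hypothetical): journal a copy, then pass the write on."""
    volume: Dict[int, bytes] = field(default_factory=dict)  # stand-in for the real destination
    journal: List[Write] = field(default_factory=list)      # stand-in for the CDP log
    next_seq: int = 0

    def write(self, block: int, data: bytes) -> int:
        # 1. Copy the write into the journal so it can be used for a rewind later.
        entry = Write(self.next_seq, block, data)
        self.journal.append(entry)
        # 2. Let the original write pass through to its destination unchanged.
        self.volume[block] = data
        self.next_seq += 1
        return entry.seq


splitter = Splitter()
splitter.write(0, b"alpha")
splitter.write(1, b"beta")
splitter.write(0, b"gamma")  # overwrites block 0, but the journal still remembers "alpha"
```

After those three writes the volume holds only the latest data for each block, while the journal holds all three writes in order. That extra history is what makes a later rewind possible.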
But, like anything else, small things can have large impacts, especially in larger environments. And, for many customers, it really matters where this "splitter" runs in the environment: server, switch or array.
With RecoverPoint, the splitter can run on the server (the host) itself. This is nice in smaller environments, or less demanding environments, as it keeps the costs down. But if you're looking at a large number of servers participating, or if you're especially performance-conscious, this won't be ideal for you.
Another place to put the splitter is inside an intelligent switch from Cisco or Brocade. In this approach, your servers aren't touched, and the dedicated processing that the intelligent switch brings to the table makes for a very performance-oriented solution.
But, alas, not everyone can justify the cost and complexity of a spiffy new intelligent switch in their environment (although a surprisingly large number of people have).
And that leads us to a third option -- putting the splitter software in the array itself (a CX in this example), which provides advantages over the host-based approach, but doesn't require an intelligent switch.
But What's The Best Approach?
One of the most frustrating conversations in IT is finding the "best" approach. People look at something like this, with three distinct options, and ask "what's best?"
You probably already know the answer -- it all depends.
It's not hard to imagine a customer environment that might be using all three approaches -- in different parts of the landscape -- to achieve different objectives, but still doing it with a common technology base, management interface, and so on.
And, just to be clear, it's not tied to any of our arrays (except the CX splitter approach, of course).
CDP and CRR Merge
A bit of acronym soup here, if you don't mind.
I mentioned before that CDP refers to continuous data protection. For some reason, that particular term has been associated with local replication, e.g. in the same array, or at least co-located copies of data. The term CRR (continuous remote replication) is used to describe the distance-oriented flavor of CDP.
Please don't ask me why this is; I don't know. I tend to use CDP to describe both, since it's basically the same idea, the only difference being the length of the wire involved ;-)
Also in the announcement was that RecoverPoint had integrated CDP and CRR into a single, combined capability.
If you're into protecting critical applications, this is relatively big.
The whole idea here is to protect against data corruption at multiple levels by being able to rewind from a log of all updates against a "known good" state.
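If it helps to see the rewind idea concretely, here's a minimal sketch in Python. It assumes a simple model where you keep a known-good baseline image plus an ordered log of writes, and rebuild any point in time by replaying the log up to the write you trust. The `rewind` function, the tuple layout, and the sample data are all hypothetical; a real product's journal is far more sophisticated than this.

```python
from typing import Dict, List, Tuple

# Each journal entry is (sequence_number, block_address, data). Hypothetical layout.
JournalEntry = Tuple[int, int, bytes]


def rewind(baseline: Dict[int, bytes],
           journal: List[JournalEntry],
           upto_seq: int) -> Dict[int, bytes]:
    """Rebuild the volume image as it stood right after write number `upto_seq`."""
    image = dict(baseline)            # start from the known-good baseline copy
    for seq, block, data in journal:  # the journal is assumed to be in write order
        if seq > upto_seq:
            break                     # stop before the first write we don't trust
        image[block] = data
    return image


baseline = {0: b"good"}
journal = [
    (0, 1, b"order-1"),
    (1, 0, b"CORRUPT"),   # the bad write we want to get back in front of
    (2, 1, b"order-2"),
]

# Roll back to just before the corruption, i.e. the state after write 0.
clean = rewind(baseline, journal, upto_seq=0)
```

Here `clean` contains the baseline data in block 0 and `order-1` in block 1, which is the last state before the corrupting write landed. Pick a different `upto_seq` and you get a different point in time from the same log.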
It's often the case that customers want two copies of this log -- one at the local site, and one at the remote site.
Why? If you're recovering from a data corruption problem, speed matters. And sucking all those recovery bits over a skinny wire might take longer than you want, especially if your business is waiting for you to get things up and running again.
Local copies are far faster for recovery. They also make it more convenient to do things like integrity checking (e.g. making sure your known good copy is really good).
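To put some rough numbers on the "skinny wire" point, here's a back-of-the-envelope calculation in Python. The data size and link speeds are made-up assumptions for illustration only, and the math ignores protocol overhead, latency, and compression, so treat the results as order-of-magnitude at best.

```python
def transfer_hours(gigabytes: float, megabits_per_second: float) -> float:
    """Idealized time to move `gigabytes` of data over a link of the given speed."""
    bits = gigabytes * 8 * 1000**3                    # decimal gigabytes to bits
    seconds = bits / (megabits_per_second * 1000**2)  # link speed in bits per second
    return seconds / 3600


# Hypothetical scenario: 500 GB to recover.
remote = transfer_hours(500, 100)    # assumed 100 Mbit/s WAN link
local = transfer_hours(500, 4000)    # assumed 4 Gbit/s local link

print(f"remote: {remote:.1f} hours, local: {local:.2f} hours")
```

With these assumed figures, the remote pull takes roughly 11 hours against about 17 minutes locally. Your numbers will differ, but the gap is the point.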
But, if you're doing "real" DR, you need a remote copy as well as a local one.
Setting up this local and remote combination in the past was cumbersome, to say the least, but we did it. With the new version, there's a new use case defined that assumes you might want to do that, so it's easier to set up, easier to manage, easier to recover, and so on.
If you watch this space closely, this is unique -- and useful.
And, Of Course, There's VMFS Support
I can't imagine that there will be too many product announcements in the future -- from any vendor -- that don't have the obligatory VMware support statement, and this one has it as well.
I'd like to say more about RecoverPoint and VMware's upcoming SRM (Site Recovery Manager) integration, but I can't, as it's not entirely kosher to talk about publicly.
But, if you're looking at local and remote replication for your VMware environment (and you should be), you're going to want to take a close look at how these pieces come together in a very cool, slick manner.
When it's announced, that is.
CDP Isn't For Everyone
Not all IT shops can justify "TiVo for your data center" -- the ability to rewind an arbitrarily complex set of applications to a known state, and start the movie over again.
I've seen cases where a "bad data corruption day" forces the decision, although that's not the ideal way to realize you need something like this.
And I've also seen environments where multiple applications interrelate (e.g. SOA or similar) and getting a consistent view of everything for recovery purposes might force a new perspective on the matter.
Or, just people who'd like to do their replication in the network, rather than inside the array, and prefer this from an architectural perspective.
However you get there, I think CDP is one of those data protection topics that bears watching, simply because the protection models it offers are better than the ones that came before it.
And, hopefully, maybe like me, you'll come to appreciate just how cool this stuff really can be ...
