Let me know what you think?
Yes, But ...
Here's where I would agree with the thought. At an industry level, it's pretty clear that efficient use of disk is starting to replace (or augment) tape for just about everyone. Disk technology is getting far cheaper and far more efficient, and replacing tape at a surprising rate.
You see evidence of this everywhere. I remember someone showing me how the enterprise tape library market has started its free-fall from a significant product category to an interesting niche over just the last few years.
Second, much of the industry's thinking around data protection (specifically backups) has clear roots in tape-based models. Full vs. incrementals. Tape volumes and rotation schedules. We're talking many decades of process and thinking around tape. And that doesn't change overnight.
But, to be clear, at some point, the fact that we're copying information from one hunk of random-access storage media to another form of random-access storage media begs the question of how much of that legacy thinking do we need to leave by the wayside as we go forward?
This leads to a view that -- over time -- the preferred data protection model will probably be "versioned replicas" -- time-stamped images of what has changed, all stored very efficiently. Whether we call these snaps, clones, remote replicas, continuous data protection, or even the underlying dedupe capabilities of DataDomain and Avamar – you can see that line of thinking across our portfolio.
So far, we're all in agreement.
Now, let's throw some real-world considerations into the mix.
Not Everyone Is Up For Process Change
When any new technology shows up, there are those enterprises that are in a position to change their processes and models around the technology, and those that aren't. EMC learned long ago that for any new technology to be successful, you shouldn't insist on an immediate process change.
In this world, "no immediate process change" means "backup to disk" in all of its various forms. Still use what looks like a traditional backup model, just do it with faster/better media and software that understands the differences.
The net? Customers get a big hunk of benefit without significant process change, which can be deferred to later.
Orchestration Matters
Regardless of how you're protecting your information, you want it to be a well-managed process: jobs are scheduled, workflows automated, media managed, results logged, exceptions noted, trend lines reported on, and so forth.
Ideally, this orchestration level would be relatively agnostic to how the underlying copying was being done: traditional, snaps, replication, etc. The job of assuring "all the data is protected, and I can prove it" should be somewhat independent of the underlying mechanisms.
Besides, there's only so much you can do with scripts :-)
The Continuum Matters
If you think about it, the whole backup discussion -- while reasonably broad -- is actually sandwiched between the topic of business continuity on one side, and long term archiving on the other side.
Not to oversimplify, but shorten your RTOs and RPOs and add distance -- it's a business continuity discussion. Lengthen your RTOs and RPOs, it ends up looking more like an archiving discussion.
In any decent-sized enterprise, there will be clear calls for all three discussions -- business continuity, more traditional backup-oriented data protection, and longer-term archiving.
The implication for me is clear -- if at all possible, have the backup and data protection discussion in the context of the other two -- common platforms, processes, etc. And, fortunately, we're seeing more of that sort of thinking every day.
Choices Matter
I think Storagebod put it best -- the general thinking here is a preference for solutions that are loosely-coupled with storage arrays, rather than ones that are tightly coupled.
This isn't for any real technical issue per se, it's just the unpleasant thought of not being able to change your storage supplier without completely revisiting your overall approach to backup and data protection.
Besides technology coupling (loosely or otherwise), there's the more obvious need to put your recovery copies on something other than the physical device where the primary copies live. Yes, this stuff is very reliable, but things fail: hardware, software, people, data centers, etc. -- and it's generally thought of as bad juju to keep your primary *and* all of your backups on the same physical device.
Let's face it -- you wouldn't consider copies of your files on the same laptop drive a serious "backup", would you? This means you'll usually be acquiring a storage device for the purpose -- ideally, built for purpose as well.
The Bottom Line
No argument -- our primary mechanism for protecting data is rapidly shifting away from tape, and towards random-access storage media. That much we all can agree on!
But, behind that, there are key decisions that everyone will need to make:
How much process change are you up for?
How will you orchestrate and manage the business of protecting information, regardless of the underlying tools?
How will you re-think the boundaries between classic backup-oriented data protection, and the adjacent disciplines of business continuity and archiving?
And where will you prefer to retain choices, and where will you prefer tight integration?
I don't think anyone has pat answers to any of this -- which is why EMC has invested in so many related technologies in this arena -- but it should be an interesting discussion going forward!
It wasn't one of your competitors that started this discussion, Chuck, it was me on my blog:
http://www.backupcentral.com/content/view/299/47/
And I wasn't pushing a competitor's snaps & replication; I was pushing the concept that the two together can be an effective backup and recovery system -- one that is never "backed up" in the traditional form.
Now let's talk about what you wrote.
"Whether we call these snaps, clones, remote replicas, continuous data protection, or even the underlying dedupe capabilities of DataDomain and Avamar – you can see that line of thinking across our portfolio.
So far, we're all in agreement."
I can understand your confusion if you got your information second hand, but I wasn't just talking about backup to disk here. I was specifically talking about snapshots & replication. That does not include Data Domain and Avamar. While both are fine products, they both still think of backups in the traditional sense. That is, that the "backup" needs to be in a different format than the original. The whole point of my post was that I no longer think this is necessary.
Not Everyone Is Up For Process Change
I totally agree with that. I made the post only for those that are considering mass changes to their backup system. I'm saying, "as long as you are going to spend a bunch of money, perhaps you could consider this completely different way." It's what I do.
Orchestration Matters
I couldn't agree more. All these snapshots have to be configured, managed, reported on, etc. I touched on that in my blog.
The Continuum Matters
Not sure I understood the point you were trying to make here. Of course you should consider BC, DR, and archive when making such decisions. But I'm not sure what that has to do with whether or not using snapshots & replication as your backup system is a good idea.
Choices Matter
You said that "the general thinking here is a preference for solutions that are loosely-coupled with storage arrays, rather than ones that are tightly coupled."
I do think that is the Conventional Wisdom, but I'm revisiting that idea. I'm wondering if perhaps a tightly coupled backup and recovery story would actually have the most value and cost the least to operate. For example, the whole point of dedupe goes away if you're not making duplicate copies of your data.
BTW, I'm really surprised it's a point that you're making. Many of the data protection options from EMC are very tightly coupled with your storage.
You also talked about "the more obvious need to put your recovery copies on something other than the physical device where the primary copies live"
Well, of course, and I addressed that in the blog. That's why I didn't just say snaps. I said snaps and replication.
Posted by: W. Curtis Preston | February 10, 2010 at 04:16 PM
Curtis -- I think there's much that we agree on.
As far as your "all in one" vs. "specialized for role" discussion, I think we'll see one preferred for smaller environments, and more specialized kit for larger ones.
Thanks for the thoughtful comment ...
-- Chuck
Posted by: Chuck Hollis | February 10, 2010 at 05:30 PM
Chuck,
A while back I posted a reply to you suggesting that backup operations that pull the data from primary storage to the backup server, to be stored on some other storage media, are the legacy process I for one would LOVE to avoid. Apologies to Curtis, but I think this is an obsolete view (with de-dupe or without) of the best practices for backup. It is one that we are stuck with, because the 'we have always done it this way' mentality is preventing us from doing something better.
Why can't my primary storage container have built-in version control for the data that it is receiving from the application hosts, and have the intelligence to retain the exact number of copies for the exact amount of time that a user would provide via a policy? I think your customers will break down the doors to your sales staff office(s) to get this capability.
Posted by: Gene Piatigorski | February 12, 2010 at 01:50 PM
Hi Gene
The inconvenient truth is that storing recovery copies (of any sort) on the primary storage device increases risk of data loss to unacceptable levels for many use cases.
Hardware fails, software fails, people fail, data centers fail, etc. Distance is good, more distance is better.
As far as your thoughts around "exact number of copies", etc., what you're describing is pretty close to today's CDP. Are you thinking of something different?
-- Chuck
Posted by: Chuck Hollis | February 12, 2010 at 01:57 PM
Chuck,
I think you are confusing BC with DR, and after you posted such a great explanation of the continuum from BC to DR in your original post. Daily backups don't necessarily need to go to a completely separate device. These backups are mainly used for recovery of accidental deletions, corrupt data, etc. Those backups can live on the same device in the form of snaps, clones, etc. just fine. If you also then implement DR, and replicate that data to another site, you are now protected against the issues you describe around array software, hardware, and process/people failure as well as site failure. This leads to some discussion about what a "disaster" really is. Most storage vendors like to use the "smoking hole where your data center used to be" scenario to describe a disaster. But disasters come in much smaller packages. I've had entire arrays go down due to both hardware (disk) problems, as well as software/firmware issues. That's a disaster, and it precipitated a failover to the alternate site for all of the applications using those arrays.
The bottom line is that for BC, or standard backups, snaps work just fine. For a disaster, some form of replication is going to be necessary no matter how you implement the DR plan.
--joerg
Posted by: Joerg Hallbauer | February 24, 2010 at 06:19 PM
Joerg
You're right, I should have been more precise, and -- most of all -- I of all people should know better!
Thanks for the clarification.
-- Chuck
Posted by: Chuck Hollis | February 24, 2010 at 07:03 PM
A lot of businesses that I have been working with lately are looking for off-site incremental solutions that support file versioning. They no longer want to spend hours sorting through tapes/hard drives looking through compressed folders for the right file(s).
As to your comment on "Not everyone is up for process change"... I can't agree with you more! This is especially true in large Fortune 500 companies, as even the slightest change in process can cost millions just in communication efforts to inform employees and to establish new documentation.
Posted by: Jim | May 30, 2010 at 09:54 AM
Not to offer the shameless product plug, but there's been good takeup from many individual business users of the MozyPro offering for just that reason.
I've been running it for over a year now, and -- occasionally -- I need to get a versioned file back. It works as advertised for what it was intended to do.
Thanks for the comment --
-- Chuck
Posted by: Chuck Hollis | May 30, 2010 at 10:46 AM