We all know that -- at some point -- we'll all need to get good at encrypting data-at-rest, at least for parts of our environment.
Yes, we're all familiar with the tape-off-the-truck story, but we all kind of suspect that this is just the beginning. Sooner or later, we'll need to do this for disks as well.
One of the questions that comes up frequently in customer discussions is -- when it comes to storage, where will the best place to do encryption be?
I thought I'd offer up a few thoughts on where I see the interesting activity happening.
Let's start with a few simple observations
The first observation I'd like to offer is that -- today -- there's only one kind of mainstream option for disk storage encryption, and that's an encryption appliance -- think NeoScale or Decru. But no one (except for their marketing departments) is entirely happy with the idea of lots of different, expensive appliances lying around the IT department encrypting data.
This kind of approach (at scale) is inherently expensive, complex and -- of course -- doesn't make things run any faster. But, if you've got a mandate to encrypt disk or tape, this is your option today.
And, of course, we're seeing the first encrypting tape drives come on the market from the usual suspects. I think a lot of us are still figuring out if this is a viable approach, but it's out there.
The other observation that I think is important to make is "compress first, then encrypt". Encryption (if it's good!) randomizes data. That means your tape compression or network compression algorithms are completely and utterly defeated if the data is encrypted first. I don't think the data de-dupe vendors will be bragging about compression ratios for encrypted data either.
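To make the "compress first, then encrypt" point concrete, here's a minimal Python sketch. Everything in it is an assumption for illustration: the sample data is made up, and `toy_stream_cipher` is a stand-in I wrote on the spot (a hash-based keystream XORed with the data), not a real or secure cipher and not what any product mentioned here does. All it needs to do is produce random-looking output, which is the property that matters for compression.

```python
import hashlib
import zlib


def toy_stream_cipher(key: bytes, data: bytes) -> bytes:
    """XOR data with a SHA-256 counter-mode keystream.

    Illustrative stand-in only: NOT a vetted cipher, do not use for real data.
    Applying it twice with the same key returns the original bytes.
    """
    keystream = bytearray()
    counter = 0
    while len(keystream) < len(data):
        keystream += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    # zip() stops at the shorter input, so any extra keystream is ignored.
    return bytes(a ^ b for a, b in zip(data, keystream))


# Hypothetical sample: highly repetitive records, the kind of data that compresses well.
plaintext = b"customer_id,name,balance\n" * 4000
key = b"hypothetical-demo-key"

# Order 1: compress, then encrypt the (already small) result.
compress_then_encrypt = toy_stream_cipher(key, zlib.compress(plaintext))

# Order 2: encrypt, then try to compress the random-looking result.
encrypt_then_compress = zlib.compress(toy_stream_cipher(key, plaintext))

print("original bytes:       ", len(plaintext))
print("compress then encrypt:", len(compress_then_encrypt))
print("encrypt then compress:", len(encrypt_then_compress))
```

On a run like this, the first ordering should shrink the data dramatically, while the second should leave it at roughly its original size (or slightly larger), because the compressor has no patterns left to work with.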
And, most of us know that encryption without some sort of hardware assist is slow. Maybe real slow, depending. Hardware assists (e.g. ASICs) are coming along, but they won't be available everywhere they're needed.
The most important observation, I think, is that the difference between data encryption and data destruction is the loss of a single key. It's a sobering thought.
Lose an encryption key, and you will be having a very bad day. You'd update your resume, but you can't get to it, because, well ...
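If you'd like to see the "lost key equals destroyed data" idea in miniature, here's a small Python sketch. It's only an illustration under stated assumptions: the message is hypothetical, and I'm using a simple one-time-pad style XOR (`xor_bytes` is my own helper) as a stand-in for whatever real cipher a storage product would use.

```python
import secrets


def xor_bytes(data: bytes, pad: bytes) -> bytes:
    """XOR two equal-length byte strings (one-time-pad style, illustration only)."""
    if len(data) != len(pad):
        raise ValueError("pad must be exactly as long as the data")
    return bytes(a ^ b for a, b in zip(data, pad))


message = b"resume.doc: 10 years of storage experience"  # hypothetical data
key = secrets.token_bytes(len(message))                  # the single key
ciphertext = xor_bytes(message, key)

# With the key, the data comes back intact.
recovered = xor_bytes(ciphertext, key)

# Simulate losing the key: all that is left is a guess at it.
guessed_key = secrets.token_bytes(len(message))
garbage = xor_bytes(ciphertext, guessed_key)

print("recovered with the real key:", recovered == message)
print("recovered with a guess:     ", garbage == message)
```

The ciphertext is still sitting there on disk, perfectly intact. Without the key, though, it's indistinguishable from noise, which is exactly why key management deserves more attention than the encryption itself.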
Most of us believe that the real activity in the future will be around key management in the storage domain rather than pure encryption, which will tend to be commoditized over time. I'm going to save that topic for a future post.
The use cases that show up for key management in the storage domain are unique and fascinating. They're also not obvious to most people. I'm looking forward to all the head-scratching presentations we'll all be seeing in the next few years as the industry tries to digest the key management topic.
So what will the choices be?
Architecturally, the choices are pretty simple to understand: encrypt in the application, encrypt in the HBA or I/O stack, encrypt in the network, or encrypt in the device. I'd like to take some time and look at each of these, and the likely use cases that will develop around them.
Before I start, though, I have to give you the classic answer: It Depends. What I mean is that -- like virtualization -- we'll likely see good encryption solutions in all of these areas, each with different strengths and weaknesses.
For those of you looking for The Answer, I'm sorry once again.
Application Encryption
This approach is reasonably available -- it's supported by many databases, some of the newer file systems, backup applications (like EMC's NetWorker), content management repositories (e.g. Documentum) and so on.
Simply put, most people don't like this approach because it's s-l-o-w. There are key management challenges as well, but they're not overly complex.
Unlike other approaches, it won't be a candidate for hardware acceleration anytime soon, unless you're running on an IBM mainframe and using their coprocessor.
Some of the merchant chip vendors (Intel, AMD) are talking about encryption co-processors down the road, but we're talking years before the hardware is available, and many more years until the operating system and the applications have been modified to take advantage of it.
HBA or I/O Stack
The next logical place to do storage encryption is at the SAN endpoint -- where it connects to the server. Many of the HBA (host bus adaptor) vendors are looking at putting coprocessors or ASICs on their products, which -- over time -- offers the potential for accelerated encryption at decent speeds and costs.
I think we'll see this before too long -- maybe the first products in 2007. Of course, they'll all need new drivers, and we'll have to qualify them in all sorts of permutations and combinations. And there will still be an interesting key management issue.
I think this kind of approach will be attractive to someone who has a few servers that are running sensitive applications and wants to encrypt all the data behind them. But, if you're looking at upgrading several hundred servers (remember, HA environments require TWO adaptors per server), you'll kind of gag at the cost and the effort.
Let's also not forget that any sort of compression downstream (replication, tape, data-dedupe, single instancing) will be completely defeated by encrypted data. All of this notwithstanding, I think this will be useful for many people who have just a few servers that need to encrypt their data.
Intelligent Network Encryption
As part of the move to intelligent storage networks (virtualization, replication, et al.), Cisco and Brocade/McData are including powerful ASICs that could -- potentially -- provide high-speed encryption as a SAN service.
I think this sort of approach could be far more popular than the ones above, for some simple reasons.
First, data encryption will be a selectable service. If implemented correctly, you could easily choose which paths are encrypted, and which ones aren't. Nice.
Second, although the initial costs will seem to be higher, at scale they should be cheaper than trying to upgrade every HBA you own.
Third, when we look at key management, I'd think it'd be easier to coordinate at the network level than trying to manage every endpoint on the network.
And, finally, since some of these devices also do compression for remote replication (compress before encrypt), at least you wouldn't have to worry about that piece of it.
Device or Array Level Encryption
Maybe you didn't notice, but Seagate has been talking about on-drive encryption for a while. Intriguing, but I don't think we really understand the cost and performance implications of this yet. And my head hurts just thinking about how you'd coordinate key management across potentially thousands of drives, replicas, and so on. Ouch.
I saw that Fujitsu was starting to talk about their encrypted array. Nice, but the only thing it protects you against is someone walking off with a disk drive -- and I think there are far better ways to solve that problem without that sort of brute-force approach. Interestingly, the materials I saw didn't mention any sort of key management. Maybe it's positioned as a data shredder :-)
One area that array-level encryption could help in is the "compress before encrypt" scenario. If the array is doing data-dedupe, or array-level replication, this is helpful. Tape would require something else, of course.
Tape drives are logical places for encryption. I don't know if they've solved the "compress before encrypt" problem yet, but they have the potential to do so. Key management can still be problematic in this environment, though.
So, what does all of this mean?
My best guess today is that -- of all these approaches -- most customers will evolve to wanting this functionality in the storage network. They'll come to think of it as part of the SAN.
In some ways, it'll probably be the easiest to implement, probably the most cost-effective at scale, and -- eventually -- the easiest to manage.
I believe that the most important forms of data-dedupe will eventually move to the server side (see EMC's acquisition of Avamar), which means that device-level compression (disk array or tape drive) will become less of an issue in this scenario.
Yes, you're getting 3:1 compression on tape, but in a world where you're getting 20:1, 300:1 or something similar, it's just less of an issue. And, besides, how many of us want to stick with tape all that much longer?
But in the meantime, we trudge forward with either application-level encryption, or selectively deploying appliances to meet today's needs.
I hope the industry can get to better solutions before too long. I just wonder how much of my own personal data is out there, unencrypted.
