Two apologies before we get started here.
First, I'm sorry if this post comes across a bit nerdy and focused -- but I think it's illustrative of some powerful forces that are playing out in the IT infrastructure business, so it's a good case study.
Second, I will beg forgiveness if I fall into a bad case of the "I told you so's". My purpose is not to brag about guessing right, but to paint a picture that when things don't add up, there's a reason why.
Today's post is an update on the latest continuing chapter in storage virtualization. For those of you who may be casual visitors to the incredibly arcane World of Storage, this has been a hot topic for at least five years.
I would argue that more time has been spent arguing about it (by customers and vendors alike!) than useful value created, which is why this topic is so fascinating, at least to me!
The evolving virtualization pitch
The First Pitch (way back when) was that you could save money on storage by virtualizing it: mix and match different storage types in a single, efficient pool using virtualization. Your servers would see what looked like ordinary LUNs, but these would be logical volumes spread across whatever physical storage you needed.
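For the non-storage folks, here is roughly what that pooling idea looks like, as a minimal sketch rather than any vendor's actual design. The names (`Extent`, `VirtualLun`, `array_a`, `array_b`) are made up for illustration: a virtual LUN is just an ordered list of extents, and every block address the server uses gets translated to a location on some physical array.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Extent:
    """A contiguous run of blocks on one physical array (hypothetical model)."""
    array: str   # name of the backing array, e.g. "array_a" (illustrative)
    start: int   # first physical block of the run on that array
    length: int  # number of blocks in the run


class VirtualLun:
    """A logical volume stitched together from extents on different arrays."""

    def __init__(self, extents):
        self.extents = list(extents)

    @property
    def size(self):
        # Total number of blocks the server sees.
        return sum(e.length for e in self.extents)

    def resolve(self, block):
        """Translate a virtual block number to (array, physical block)."""
        if block < 0:
            raise IndexError(f"block {block} is negative")
        offset = block
        for extent in self.extents:
            if offset < extent.length:
                return extent.array, extent.start + offset
            offset -= extent.length
        raise IndexError(f"block {block} is beyond the end of the LUN")


# One virtual LUN backed by two different (hypothetical) arrays.
lun = VirtualLun([Extent("array_a", 1000, 100), Extent("array_b", 0, 50)])
first = lun.resolve(0)
crossover = lun.resolve(100)
```

The server only ever deals in virtual block numbers; where the data physically lives is the mapping layer's business.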
If I remember right, Compaq was the first to aggressively propose this idea. The implementation left a bit to be desired: it involved a customized HBA (host bus adapter) and created dependencies in the operating system, drivers, SAN, storage, etc. Not well thought out, but the idea stuck. I never ever heard of anyone who had got it to work, and I think Compaq (now HP) abandoned the idea once people figured out it was made of that special, rare element called Unobtainium.
The Second Pitch (going a few years back) went a bit further, and built on the first. Not only could you save money by consolidating and using cheap storage, but you could manage it more effectively by using the virtualization device as kind of a manager.
Just about everyone got into this game with appliances: small servers that sat in the data path, presenting virtual storage on one side, and supporting physical storage on the other.
IBM did one (SVC), HP did one, NetApp did one, Sun did one, lots of little players, and so on -- everyone but EMC did one. I'll explain why we didn't bite on this one in a bit, but it seemed that everyone but EMC had a Magic Virtualization Appliance (MVA) that saved money, worked with everything and made your storage life easier to manage.
And today, I think it's fair to say that this approach hasn't exactly set the world on fire.
So why was this?
The "save money" argument with appliance-based SAN virtualization
The "save money" argument didn't work out for too many people.
First, storage from all vendors is getting cheaper, and -- at the end of the day -- disks are disks.
Second, if all you want to do is pool your storage, a decent SAN and a volume manager will work well enough.
Third, these boxes aren't cheap: they're beefy processors with lots of memory and IO cards -- so when do I get to start saving money?
And finally, most vendors (including EMC) have basic virtualization capabilities within their respective arrays -- and they don't need a separate widget to do it. And these arrays, as you know, can get pretty large.
Besides, there's only so long you want to use old storage in your environment before it's better to get faster/cheaper/bigger/more reliable stuff on the floor. 18GB drives, anyone?
Some vendors (you know who you are!) appeared to bundle their virtualization products as part of the deal, just to drive up the numbers in an effort to create credibility. It worked, to a certain extent, but -- sooner or later -- customers have to pay for what they use.
The "easier to manage" argument kind of backfired as well
I never understood this argument. To manage storage well, you need to coordinate the storage devices (each of them slightly different), the SAN (again, subtle differences), and -- of course -- the server view. To do storage management right, you need orchestration between all three.
Inserting another layer of abstraction (appliance-based virtualization) actually made the management problem WORSE, not better. Any SRM tool in the market broke when confronted with the storage virtualization device.
Users found themselves managing the storage, then managing the virtualization device, then managing the SAN, then managing the server -- well, you get the idea. None of them were aware of what the other was doing. Yech.
And then there were the side effects
I never knew a network that ran any faster with a bump in the wire. And that's what these appliances were: a bump in the wire. They had to terminate I/O at one end, and reinitiate it at the other end. OK, in some cases this didn't matter, but performance-sensitive applications were a concern.
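To see why a bump in the wire matters more for some workloads than others, here is a back-of-the-envelope sketch. The latency figures are invented purely for illustration, not measurements of any product; the only real relationship used is Little's law (throughput = outstanding I/Os / latency per I/O).

```python
def iops_at_queue_depth(latency_us, queue_depth):
    """Little's law: sustained IOPS with a fixed number of outstanding I/Os."""
    return queue_depth * 1_000_000 / latency_us


# Assumed numbers, for illustration only.
DIRECT_LATENCY_US = 500    # hypothetical array service time per I/O
APPLIANCE_HOP_US = 100     # hypothetical cost to terminate and reinitiate the I/O
QUEUE_DEPTH = 8            # hypothetical outstanding I/Os from one host

direct_iops = iops_at_queue_depth(DIRECT_LATENCY_US, QUEUE_DEPTH)
via_appliance_iops = iops_at_queue_depth(
    DIRECT_LATENCY_US + APPLIANCE_HOP_US, QUEUE_DEPTH
)
lost_fraction = 1 - via_appliance_iops / direct_iops

print(f"direct: {direct_iops:.0f} IOPS")
print(f"via appliance: {via_appliance_iops:.0f} IOPS")
print(f"throughput lost: {lost_fraction:.1%}")
```

With a fixed queue depth, every microsecond added per I/O comes straight out of throughput, which is why the latency-sensitive applications were the ones that noticed.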
And then there was the whole support question: in a multivendor environment, who is responsible for supporting the entire environment? Some customers would expect EMC to provide end-to-end support with an IBM SVC, or a NetApp thingie, or something they downloaded and tried out.
Sorry folks, we really aren't set up to do that kind of thing.
IBM appeared to bite the bullet and spent a pile of money building out a multivendor support capability, but that sort of stuff is expensive and has to be passed on to the consumer in some form, so it isn't making things any cheaper.
Some vendors started to offer replication in the virtualization appliance, but this caused additional problems: cycles spent on replicating data can't be used for passing through I/Os, so that's another performance concern. And the capacity/bandwidth (not to mention stability) of some of these relegated them to modest implementations at best.
So, IBM doesn't talk about SVC a whole lot anymore. NetApp has gone relatively silent on their "v" series. I read somewhere that Sun is trying to sell off its virtualization appliance. And no customer I've talked to is looking at these approaches seriously.
And then there was Hitachi's (somewhat unique) approach -- they used the back-end ports on their storage array to virtualize other storage vendors. Arguably better than an appliance; I definitely give them high points in the "use what you got" category.
They probably have a better chance at the performance/scalability aspect of this, but are still disadvantaged with management, support and -- oh yes -- cost. A big honkin' HDS array is a damn expensive way to get to saving money.
So why did EMC stay clear of all of this?
We thought all of this was the wrong architectural approach. Seriously -- we're not talking marketing hoo-hah here.
We believed -- as an article of faith -- that storage virtualization was a network function, and didn't really make sense in an array (Hitachi), an appliance (long list of folks here) or server I/O (Compaq). We just couldn't understand how -- long term -- it was going to work out in a substantially useful way for our customers any other way.
It would take more time, more money and more effort -- but we believed that this was the right answer, and we weren't really interested in building the wrong answer.
We also thought that the initial use case everyone was promoting was missing the point.
Forget the pooling and the cheap storage and the management for just a second -- we believed that the killer app in this space was non-disruptive data mobility: the ability to move a hammering online database from one array to another without any sort of downtime or significant performance hit.
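Conceptually, non-disruptive mobility comes down to three things: copy in the background, keep the copy in sync with incoming writes, and flip the mapping at the end. The toy model below is only a sketch of that general idea under simplifying assumptions (a single thread, blocks held in Python lists, a hypothetical `MigratingVolume` class); it is not a description of how Invista or any other product implements it.

```python
class MigratingVolume:
    """Toy volume that keeps serving reads and writes while it is migrated."""

    def __init__(self, blocks):
        self.source = list(blocks)  # data on the old array
        self.target = None          # data on the new array, once migration starts
        self.copied = 0             # blocks [0, copied) already exist on the target

    def start_migration(self):
        self.target = [None] * len(self.source)
        self.copied = 0

    def read(self, block):
        # The source stays authoritative until cutover.
        return self.source[block]

    def write(self, block, data):
        self.source[block] = data
        # Mirror writes into the region that has already been copied,
        # so the target never goes stale behind the copy pointer.
        if self.target is not None and block < self.copied:
            self.target[block] = data

    def copy_step(self, count=1):
        """Copy the next `count` blocks; return True once everything is copied."""
        if self.target is None:
            return False
        end = min(self.copied + count, len(self.source))
        self.target[self.copied:end] = self.source[self.copied:end]
        self.copied = end
        return self.copied == len(self.source)

    def cutover(self):
        """Point the volume at the new array; hosts keep the same volume."""
        if self.target is None or self.copied < len(self.source):
            raise RuntimeError("copy not finished")
        self.source, self.target = self.target, None
        self.copied = 0


vol = MigratingVolume(["a", "b", "c", "d"])
old_array = vol.source
vol.start_migration()
vol.copy_step(2)      # blocks 0-1 copied
vol.write(0, "A")     # lands in the copied region, so it is mirrored
vol.write(3, "D")     # not copied yet; the background copy picks it up later
while not vol.copy_step(1):
    pass
vol.cutover()
```

The application never stops issuing I/O; the only thing that changes at cutover is which array sits behind the volume.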
Intelligent SANs
We saw that network ASICs were much faster and more powerful than anything from the server/appliance side of the shopping aisle. People had SANs already, so that addressed some of the cost issues. And, if you really wanted an open environment, a SAN model was the right way to go.
If people really wanted a scalable, cost-effective, open and manageable storage virtualization environment, intelligent SAN technology was really the only game in town.
We built Invista with Cisco as our lead partner. It took a long time to bring to market, and we had to do extra work to make sure it integrated well with our array functionality, and to make sure all the management pieces worked well enough, but we did it.
It's been out for a while. The product works as advertised. It has a modest but growing cadre of customer enthusiasts, and it continues to do more and do it better. It has an outstanding bug count of exactly zero, the last time I checked.
At the same time, the PR spin has been skeptical at best, thanks to the PR departments at our friendly competitors.
So, maybe we were right, but does it matter?
Yes.
Platforms vs Products
So, maybe you're thinking -- gee, is this the only reason I need an intelligent Cisco switch? What else can I do with it?
Don't know if you noticed, but EMC now has another product that exploits this intelligence for replication -- RecoverPoint -- and it's a humdinger. Sync, async, CDP -- does it all. Great, easy-to-use GUI. And, because it uses the power of the intelligent switch, it can be positioned as high performance replication -- no noticeable performance degradation as it does its thing.
How many of you are thinking about buying another bump in the wire for encryption? Decru, or maybe NeoScale? What if the intelligent SAN device did this for you, transparently and at very high performance? You'd be interested, right?
What about high-speed backup and recovery, using the intelligent SAN as a high-speed data mover? Sound good? What about the ability to dynamically tier service levels on-the-fly? Lots of interesting things are possible, using the same basic core technologies.
My point is simple: because we toughed it out and did what we thought was right architecturally, not only did we end up with a better product (Invista), but we ended up with a platform for all sorts of useful functionality that ought to run in the network.
Imitation is flattery
So, I'm hearing rumours that IBM will be using the same approach for a new version of SVC in 2H 2007. I don't know if they're true or not, but I hear they'll be OEMing the control software from Incipient (we developed our own, BTW), and it should be fun to watch IBM storage marketing do a 180 from "intelligent switches are bad" to "intelligent switches are the future".
Whether they end up doing this or not isn't really relevant. God bless 'em if they've seen the light.
What is relevant is a couple of points:
-- time to market is nice, but having products that really solve problems (rather than create new ones!) is better
-- radical, new ideas (e.g. storage virtualization way back when) can be easily marginalized by developments in adjacent areas (e.g. lower storage media costs, better SRM, array-based capabilities) UNLESS you've picked a relevant use case that can't be solved any other way (e.g. non-disruptive storage migration)
-- given the choice between a product and a platform, platforms create more value for customers
So, yes, this is a nerdy discussion, and yes -- I told you so.
