Saw an interesting news blurb from the folks at Incipient the other day, and I got to thinking about where storage virtualization had evolved to in terms of different use cases.
For those of you who haven't been following Incipient, they're out in the marketplace with a storage virtualization solution that bears an architectural resemblance to EMC's Invista, i.e. it uses an intelligent switch as opposed to an appliance (IBM SVC) or the array itself (HDS's USP-V).
A Bit Of Perspective
I have always thought the storage virtualization thing was a bit overhyped in the industry, even from its earliest days, simply because it wasn't always exactly clear what problem the technology was trying to solve.
Is the problem poor utilization? SANs and decent storage management address that for most people with no need for storage virtualization.
Is the goal to reduce the cost of storage itself, by presumably using dumb storage and smart, virtualizing controllers? I was always dubious about the economics of this one (as compared to something with a decent amount of intelligence and capacity, such as a CLARiiON), and it seems that even the most vociferous proponents have backed away from this one.
Is the goal to get a certain level of storage functionality (e.g. snaps, replication, etc.) without being tied to a specific brand of storage? Products like EMC's RecoverPoint and others do that without the need for a separate virtualization layer.
Is the goal to make storage easier to manage? You'll never do as well as simply implementing a popular SRM (storage resource management) suite, such as ControlCenter. Storage virtualization makes a poor choice for this as well, as you don't get the server view of storage, and -- well you're creating yet another abstraction to coordinate.
Which Gets Us To Data Migration
Many of us at EMC have always thought that the best use case for storage virtualization might be large-scale data migration. And, to be sure, we've sold many Invistas to ginormous shops that seem to be always migrating stuff around to optimize placement, refresh assets, and so on.
Even the folks at Incipient seem to agree -- they've split off the data migration facility from the virtualization product itself. As a matter of fact, similar things are available from Cisco and Brocade.
The idea is pretty simple: do a 1 to 1 virtualization of a LUN group, and then use something to move the data from the old to the new source -- over the SAN -- and not really interfere with the application too much.
But, even then, there are alternatives.
PowerPath Migration Enabler -- Unsung Hero?
There's so darn much in the EMC portfolio that sometimes cool bits escape detection.
PowerPath is an OS-neutral MPIO layer that does all sorts of neat things: path redundancy, I/O balancing, and so on. There is a LOT of it out there.
Recently, it picked up the ability to do encryption (working with the RSA Key Manager), and -- for a while -- it's been able to do the same sort of virtual LUN headfake-and-move trick (known as PowerPath Migration Enabler, or PPME), except at the server, rather than the switch.
Since PPME simply plugs into the existing I/O stack (usually running PowerPath already), it's a pretty slick solution if you don't want to think in terms of an intelligent switch, downtime to get it all implemented, etc.
Just like the solutions from Incipient et al., it does a 1:1 virtualization of the LUNs, and uses an external data mover (e.g. SAN Copy, SRDF, et al.) to actually move the data, with PPME being responsible for managing the process, maintaining updates on both sides during the move, allowing a roll-back process, etc.
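For the curious, here's a toy sketch of that headfake-and-move flow in Python. To be clear, this is not how PPME is actually implemented, and every name in it (Lun, MigrationSession, the state names) is made up for illustration. It only shows the general idea described above: a 1:1 virtual LUN sits in front of the old and new LUNs, writes go to both sides while the copy runs, and you can back out any time before the cutover.

```python
class Lun:
    """A toy LUN: a fixed-size list of blocks (hypothetical, for illustration)."""

    def __init__(self, name, size):
        self.name = name
        self.blocks = [b""] * size


class MigrationSession:
    """Hypothetical 1:1 virtual LUN that fronts a source and a target LUN."""

    def __init__(self, source, target):
        if len(target.blocks) < len(source.blocks):
            raise ValueError("target LUN is smaller than source LUN")
        self.source = source
        self.target = target
        self.state = "setup"
        self.copied = 0  # number of blocks copied so far

    def start_sync(self):
        # From here on, application writes are mirrored to both sides.
        self.state = "syncing"

    def copy_step(self, count=1):
        # Stands in for the external data mover copying a few more blocks.
        if self.state != "syncing":
            raise RuntimeError("copy_step is only valid while syncing")
        end = min(self.copied + count, len(self.source.blocks))
        for i in range(self.copied, end):
            self.target.blocks[i] = self.source.blocks[i]
        self.copied = end
        if self.copied == len(self.source.blocks):
            self.state = "in_sync"

    def write(self, index, data):
        # After cutover, only the new LUN is written.
        if self.state == "committed":
            self.target.blocks[index] = data
            return
        # Before cutover, the old LUN stays authoritative.
        self.source.blocks[index] = data
        if self.state in ("syncing", "in_sync"):
            self.target.blocks[index] = data  # keep both sides updated

    def read(self, index):
        active = self.target if self.state == "committed" else self.source
        return active.blocks[index]

    def commit(self):
        # Cut the application over to the new LUN.
        if self.state != "in_sync":
            raise RuntimeError("cannot commit before the copy completes")
        self.state = "committed"

    def rollback(self):
        # Abandon the move; the old LUN never stopped being current.
        if self.state == "committed":
            raise RuntimeError("cannot roll back after commit")
        self.state = "rolled_back"
        self.copied = 0
```

The design choice worth noticing is that the old LUN stays authoritative right up to the commit, which is what makes the roll-back cheap: nothing has to be copied back.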
Move Vs. Improve
To hear people talk about these solutions, it all sounds pretty mechanical -- simply plug it in, and the data trickles across. But -- in the real world -- there's usually a lot of thought given to where the data is going: how the LUNs are laid out for performance, availability and future growth, tiering, and more.
A decent amount of planning work by smart people goes into many of these larger migrations (or should!!!) and no technology -- from EMC or anyone else -- is going to automate that part of the process.
I guess I'm not the only one with this point of view ...
Where Does That Leave Storage Virtualization?
Storage virtualization has the potential to do lots of interesting things -- but what is it the best at? As I see it, the hardcore vendor fanboys will tick off a long list of things it could do, but are challenged to make a case that it's the best alternative, and really worth the trouble.
Want to consolidate and pool? Many ways to do that, and in many cases the alternatives may leave you better off than storage virtualization.
Want cheaper storage? No evidence that you'll do better that way than with purpose-built arrays.
Want to manage your environment more effectively? Storage resource management software is probably a better path.
Want standard replication functionality across multiple storage vendors? Products like RecoverPoint probably do a better job than storage virtualization products.
And now -- when we look at data migration -- storage virtualization will have to compete against alternatives that either run in the server (e.g. PowerPath Migration Enabler) or perhaps the array (DMX Data Migrator, CLARiiON SAN Copy et al.).
And, just to be clear, a storage virtualization device that uses intelligent switch technology (such as Invista) probably brings more scale and flexibility to the game than server-based, storage-based or appliance-based alternatives. But you might not need all of that to solve the problem at hand.
Another Thing To Think About
No matter how any vendor positions their flavor of storage virtualization -- including EMC -- there's simply no arguing that it's another layer in the stack that exacts a cost: equipment, performance, availability, complexity, and so on.
No free lunch here for anyone, even if you got your storage virtualization for "free".
You can minimize these externalities, but you can't completely eliminate them. And every vendor will be challenged to make the case that the benefits outweigh the downside.
Where Do We Go From Here?
I'm not sure.
Storage virtualization products have been in the marketplace in one form or another for over 5 years. That's more than enough time for the technology to mature, for customers to understand the pros and cons, and so on.
And, I have to say, if this storage virtualization stuff was going to take the world by storm, we'd certainly see evidence of that by now.
Now, I'm discounting all the virtualization software that HDS bundles with their arrays, or all the SVCs that get "packaged" as part of another transaction.
My perspective? When I get in front of an IT group (usually an EMC customer, of course), storage virtualization is not usually a big topic with them.
Why? They've got bigger fish to fry ...

Chuck, a counter view.
Utilisation. You still can't get over the fact that you create a volume and assign it to a host. What size do you create and from which controller? Is it already mapped to that host? Does it have free space? In a disjointed (not everyone wants to buy DMX for all their storage) infrastructure you will always have to make these decisions. With virtualization, everything is already mapped and the pooled available resources from a given Tier mean you do get much better utilisation than non-virtual. Our customers tell us they go from about 20% pre-SVC to over 75% after. That's a gain of more than 50 percentage points.
Cost. Well this one I just disagree with completely - and again most of our customers and a lot of comments on my blog are asking for a fast performing, low-function RAID brick to put behind SVC. They are seeing the virtues of a single point of control, and with 4.3.0 now providing SEV and of course Vdisk Mirroring (VDM) you can even make two copies on cheap stuff to protect even further. So a RAID-1-like function over SVC's standard striping gives a pseudo RAID-10, combined with RAID-5 or RAID-10 on the backend...
As for management, products like IBM SSPC provide the complete host through virtual storage image, so just because CC doesn't...
Yes there are alternatives to all of these, but they have downsides you haven't talked about. Like needing other hardware (Kashya), very expensive software (PowerPath - SVC with all its hardware, software, licensing, advanced function features, thin provisioning, built in migration etc etc can be purchased, installed and licensed for less than an average medium sized customer needs to pay for PowerPath, never mind thin provisioning, and I'd guess the migration doesn't come for free either...) and what processor cycles are used to do this - the host?
With any of the flavours of virtualization all these functions are offloaded on hardware away from the host.
As with everything there are two sides to every discussion, and I just wanted to point out the other one!
PS. Care to tell the world any of those very large customers that are using Invista? I've still yet to hear of a single customer other than the one reference quote last year that is using Invista in anger in real life production. I've asked and asked, but nobody seems to want to speak up :)
Cheers, Barry
Posted by: Barry Whyte | June 14, 2008 at 09:44 AM
I knew I'd get at least one fanboy to bite ...
So, point by point.
Your first point is regarding storage utilization and LUN carving.
As we both know, a virtualization device isn't required to do this; many modern arrays (e.g. CLARiiON, EVA, NetApp) support this feature -- and are quite large. Also, many customers elect to LUN carve using a volume manager.
It's not enough to claim a feature; you'd have to demonstrate that yours is a superior approach for a reasonable number of cases.
You claim that implementing SVC raises storage utilization. No doubt it does. As does ordinary storage consolidation, or using SRM tools, or just having someone take a look at your environment and sort it out a bit.
Once again, you claim a generic benefit, and don't share how your approach might be better for customers than others, and justify the additional cost and complexity.
As you well know, many modern arrays support tiering in-the-box. As an example, a large CLARiiON can support a range of drive types, metaLUNs, you name it. And big ones support up to 480 drives, which is considerable.
All of it mapped, tiered, managed, ready-to-go just as you suggest.
And no need for a virtualization dongle, multiple points of control and management, etc.
Again, it's not enough to simply claim a feature, you've got to show that you do it better than other alternatives for a sufficiently broad use case.
Cost? Feel free to disagree. We've run the numbers over, and over, and they come up the same. With your approach, there are redundant enclosures, power supplies, I/O hardware, etc. if you're buying new.
Recycling old storage? You might have an advantage. But I'd put that up against selling the old storage in the aftermarket, and getting a modern storage array that meets your needs.
Perhaps the reason your customers are looking for good, low-cost RAID bricks is that they can't make the numbers work as compared to a purpose-built array? Or they don't like what's in the IBM portfolio?
Why don't we put list prices up?
We'll put up, say, 300TB usable, and you put up 300TB usable (using published list prices), and we'll see how it plays out.
I'll even send you pricing for the new AX4s which are pretty aggressive, if there's nothing in the IBM portfolio that'll do ;-)
You claim a neat feature with RAID 10. No big deal, many arrays (including ours) do the same thing. But do you do it better? Better enough to justify your premium?
Doubtful.
As for your claim about requiring extra software and/or hardware to perform an extra function -- well, duh! -- that's usually the case. That, of course, would include buying an SVC, wouldn't it?
And, until you work the numbers for a given customer, don't be sure that your approach is always cheaper.
As long as we're talking about external costs, maybe you could share some of the notable downsides with SVC on your blog? Like maybe the ones we hear about all the time from some of your customers? Increased complexity? Extended support exercises?
As far as publishing references, you and I both know that's more up to the customer's wishes than it is ours. The larger the customer, the more reluctant they are to speak in public.
Since SVC tends to target smaller installations and smaller deployments, I'd think you'd have no real problem coming up with smaller shops that are willing to see their names in press.
Good for you.
By comparison, most of our Invista deployments are fairly substantial -- and thus they're extremely reluctant to be part of the vendor PR machine.
I'll take it.
Now that we're done sparring, I think you missed the major point of the blog post -- storage virtualization has failed to catch on as a major factor in our industry because (a) its mission was unclear, and (b) many viable alternatives exist for the functions it does.
That's ALL storage virtualization approaches: ours, yours, everyone's!
I think that IBM is in a unique situation because SVC is one of the very few IBM storage products that IBM builds itself, and is a decent product -- thanks to you and your team!
Other storage vendors (such as EMC) are in an entirely different position when it comes to having an expanded toolbox to solve customers' problems -- we don't have to be as religious regarding storage virtualization as our IBM counterparts.
Storage virtualization has had its chance already, hasn't it?
I think the peak has come and gone, and it's time to move on to more productive discussions about how best to solve customer problems.
And -- ultimately -- that's what it's all about -- solving customer problems the best possible way, and not being limited to a single solution.
Cheers!
Posted by: Chuck Hollis | June 14, 2008 at 11:25 AM
So I guess that's where we fundamentally disagree. I think it's just taken longer to get going, as it's a new way of thinking about how you deploy your storage. But what we see is a dramatic increase in interest for SVC year on year. It's not the only solution to all of our mutual customers' problems, but it goes a long way to help.
Anyway, we could go on and on disagreeing - but one thing I think you missed is the ability to do RAID-10 like function across distinct controller hardware, thus a local HA solution, that takes the 'thinking' away from any LVM or the like. RAID in itself is a must have from the backend, but providing that extra level of protection against complete controller loss was a major request from a great number of our largest customers (many of whom were 100% EMC before SVC came along).
I guess the one thing we didn't cover was choice. Choice to buy storage from any vendor you like and not be locked into a single vendor for all your data storage needs. It seems to be a big plus for a lot of customers that want to maybe move away from the model of a DMX for everything.
Anyway, time will tell. I guess what I read you as trying to get across is that EMC won't be investing in Invista much more and you won't be putting USP-like virtualization into DMX.
We do agree on the one thing though: solving our customers' problems is what it's all about.
Fair enough.
Posted by: Barry Whyte | June 14, 2008 at 12:22 PM
Mirroring across multiple (mirrored) controllers, without the need for having that second controller at a distance for DR? OK, I'll grant you that -- it'd be a toss-up between doing it at the host (for free), or installing a dedicated virtualization appliance.
Or perhaps doing real DR at a remote site.
But I'd argue not too many people fall into that category, and would spring for an SVC on that basis. And I'll let you know when I meet someone who really wants that.
Choice? I think that cuts both ways -- choose your poison: standardize on arrays, or standardize on virtualization controllers. Are you telling me that SVC mixes and matches with other appliance-based storage virtualization devices? Does that mean that I can virtualize with USP at one level, and then SVC at another? Or do USP and SVC behave as peers in the same fabric?
I didn't think so.
Your inferences on EMC's future product direction regarding Invista, DMX, CLARiiON, etc. are not entirely correct. I'm not in a position to correct you, but you're just fishing here, aren't you?
Best of luck, Barry!
Posted by: Chuck Hollis | June 14, 2008 at 12:37 PM