Last Thursday, EMC announced the acquisition of ScaleIO, a small Israeli storage startup.
While many news sources reported the facts of the transaction, painfully few went deeper to explore what might really be going on. Yes, Howard Marks wrote a nice piece, as did Chris Mellor -- but that was about it.
I suppose that's understandable. After all, EMC does a lot of acquisitions, and there wasn't a lot of money that changed hands.
But if you live and breathe in the storage world, it's a rather intriguing acquisition for one simple reason: it's a big bet on a very different storage model.
And you might have thought that would raise a few eyebrows.
ScaleIO's website tells the story simply. Their single product (Elastic Converged Storage, or ECS) is somewhat reminiscent of Amazon's EBS -- Elastic Block Store.
Big pools of commodity servers can share their internal storage to create a block storage model that's performant, resilient, manageable, etc. -- with many of the attributes of familiar dedicated storage arrays.
Many of their early customers are IT service providers that need to offer a block service similar to Amazon's, and the ScaleIO software product fit the bill. Service providers could easily stand up a managed, metered service using familiar components, and that was that.
But Wait A Second
If you read the EMC announcement materials, it's clearly all about flash -- server-resident flash, to be clear. The acquisition was sponsored by EMC's Flash Products Division, which sells the Xtrem line of flash products: the XtremIO all-flash array, the XtremSF server cards and the XtremSW cache management software for those cards.
Where's the connection?
There's a strong one, albeit subtle -- one that's worth explaining.
Flash Changes Everything
We've had enterprise-grade flash technology around since 2008, and we've learned a few things since then.
* it's addictively fast stuff -- once you put an app on it, you never want to go back.
* it can be made as resilient as (or more resilient than) spinning drives
* since it is a semiconductor, it evolves along the lines of Moore's Law vs. the more linear evolution of magnetic media.
* as a result, prices continue to drop fast, and capacities continue to soar.
Everyone who works with flash knows that the industry is moving fast in that direction -- it's sort of a given. If performance matters, you'll be using flash in some form -- if you're not already.
Understanding Storage Latency
With flash, the performance levels are so high that, all of a sudden, latency becomes an interesting discussion. A quick explanation: latency is the overhead you pay to communicate with a storage device.
In a world where disk drives had 5 millisecond response times, you didn't worry much about controller overhead, network overhead, etc. -- the mechanical disk drive was the bottleneck.
Now we live in a world where flash storage media can respond in a handful of microseconds -- hundreds of times faster. All of a sudden, that overhead to get to the storage media becomes very meaningful.
If you put that uber-fast flash storage media in a traditional storage array, you'll have to traverse the server's storage controller, the storage network, the array's controller and back again. It ends up being significant, no matter how fast the components.
If you put that same flash storage media in the form of a flash drive inside the server, you still have to deal with the server's storage controller, but that's about it -- much, much faster, as there's no storage network and no array controller in the loop.
Going further, mount that same flash media on a PCIe card, and it's faster yet: no storage controller involved, just the overhead of a bus transfer. Go further still and put that flash media directly on the motherboard in copious quantities, and you've got an even faster solution.
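To make the latency argument concrete, here is a rough back-of-the-envelope sketch. Only the 5 millisecond disk figure comes from the discussion above; the flash response time and every per-hop overhead are assumed placeholders chosen purely for illustration, and the names (`MEDIA_US`, `HOP_US`, `PATHS`) are hypothetical. The point is the shape of the result, not the specific numbers.

```python
# Illustrative latency budget. Only the 5 ms disk figure comes from the text;
# every other number is an assumed placeholder, not a measurement.
MEDIA_US = {
    "disk": 5000.0,   # 5 milliseconds, from the text
    "flash": 25.0,    # assumed: roughly 200x faster than disk
}

# Assumed round-trip overhead per hop, in microseconds (hypothetical values).
HOP_US = {
    "server_storage_controller": 20.0,
    "storage_network": 100.0,
    "array_controller": 100.0,
    "pcie_bus": 5.0,
}

# Which hops each placement has to traverse to reach the media.
PATHS = {
    "array": ["server_storage_controller", "storage_network", "array_controller"],
    "server_drive": ["server_storage_controller"],
    "pcie_card": ["pcie_bus"],
}


def path_overhead_us(path_name):
    """Total assumed overhead, in microseconds, to reach the media."""
    return sum(HOP_US[hop] for hop in PATHS[path_name])


def overhead_share(media, path_name):
    """Fraction of total response time spent on overhead rather than media."""
    overhead = path_overhead_us(path_name)
    return overhead / (overhead + MEDIA_US[media])


for media in MEDIA_US:
    for path_name in PATHS:
        share = overhead_share(media, path_name)
        print(f"{media:5s} via {path_name:12s}: {share:6.1%} of the time is overhead")
```

With these assumed figures, the overhead is a rounding error when the media is a disk drive and the dominant cost when the media is flash, which is the whole argument in a few lines.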
You can see what's going on here: flash very much wants to be inside the server, mostly for performance reasons. But it's not as easy as it looks.
What Makes Server Side Storage Hard
So let's say you took a bunch of servers, loaded them up with flash cards, and told your applications to use them just like disks -- and did so without the benefit of a storage software layer such as ScaleIO's ECS. What kinds of issues would you encounter?
The first issue would be resiliency. Cards fail, servers fail, etc. In this model, there's no way to get to the data if either the flash card or the server that houses it fails for some reason. You need some mechanism to spread the data redundantly across multiple nodes.
The second issue would be pooling and sharing. You'd want an application to get at all the storage, for both capacity and performance reasons -- and not just the tiny pool sitting inside a single server enclosure.
Conceptually, not that hard -- but there's a constraint: the network. Most servers are lashed together with 10Gb Ethernet running TCP/IP, and that introduces all sorts of interesting design concerns: latency, congestion, etc.
So our third concern would be performance: preserving the inherent performance of flash media across multiple server nodes.
If we were building our server-side storage farm using only disk drives, we'd be dealing with millisecond latencies. In a flash world, we're in the microsecond world, so these issues start to really matter. You'd want to make sure your storage software was very, very smart about using the network intelligently: minimizing traffic, etc.
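The first two concerns, resiliency and pooling, can be sketched in a few lines. To be clear, this is not how ScaleIO's ECS works internally (nothing public describes that here); it is a minimal illustration of the general idea, assuming a hypothetical scheme where each chunk of data is hashed to a starting node and mirrored on the next node in the list. The node names and the `place_chunk` and `survives_failure` helpers are made up for the example.

```python
import hashlib
from collections import Counter


def place_chunk(chunk_id, nodes, copies=2):
    """Pick `copies` distinct nodes for a chunk, deterministically.

    Hypothetical scheme: hash the chunk id to a starting node, then take
    the following nodes in list order for the additional copies.
    """
    if copies > len(nodes):
        raise ValueError("need at least as many nodes as copies")
    digest = hashlib.sha256(str(chunk_id).encode()).digest()
    start = int.from_bytes(digest[:8], "big") % len(nodes)
    return [nodes[(start + i) % len(nodes)] for i in range(copies)]


def survives_failure(placement, failed_node):
    """True if every chunk still has a copy on some surviving node."""
    return all(
        any(node != failed_node for node in owners)
        for owners in placement.values()
    )


def chunks_per_node(placement):
    """Count how many chunk copies land on each node (the pooling view)."""
    return Counter(node for owners in placement.values() for node in owners)


nodes = ["node-a", "node-b", "node-c", "node-d"]  # hypothetical servers
placement = {chunk: place_chunk(chunk, nodes) for chunk in range(1000)}

for node in nodes:
    print(node, "holds", chunks_per_node(placement)[node], "copies;",
          "data readable if it fails:", survives_failure(placement, node))
```

Every server contributes capacity to a single pool, and losing any one server leaves a copy of every chunk elsewhere. What the sketch leaves out is the hard part: each mirrored write has to cross the network, which is exactly where the third concern comes in.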
And There's More
We've come to expect much more from storage arrays than simply reliable reads and writes.
Powerful snap and replication capabilities, for example. Drive-level encryption. Tools to manage configuration and QoS delivery. Easy consumption and metering portals. Flexible media choices. The list is quite extensive.
We've all come to simply expect those things as part of our familiar array world, and we also expect them in any new model, like server-side storage.
Back To ScaleIO
There have been no detailed public discussions from EMC around what's intended for the ScaleIO technology. I would expect that to come later.
But there are a couple of assumptions that I believe are pretty safe to make.
First, the ScaleIO technology will likely become an important differentiator in EMC's server-side flash strategy -- going far beyond the capabilities of the current XtremSW and similar.
Second, the technology in an EMC context could get pretty interesting, pretty quickly. There's lots of interesting storage software in the EMC toy box, plus a great ecosystem of technology partners (VMware, Cisco, etc.) coupled with a very strong go-to-market.
Third, there is strong historical evidence that EMC invests heavily in potentially disruptive technologies and surfs the waves as they come. One could make an argument that uber-fast storage in the form of commodity servers, flash cards and intelligent software is potentially *very* disruptive.
Fourth, if we consider software-defined storage, this is a significant component in a broader EMC strategy. ViPR tackles the management and orchestration duties (as well as a few neat data presentation tricks), and now we have yet another software-only storage target: ScaleIO joins Atmos as a pure software-based storage "array" with few dependencies beyond commodity hardware.
Chad Sakac has a similar -- but different -- perspective on all of this. Good reading.
Storage Continues To Fascinate Me
A few years ago, it was fashionable to dismiss storage as boring, stodgy, etc. No longer.
We're quickly moving to an information economy. Storage becomes a very interesting place to invest as a result. If you follow the venture capital scene, you can quickly lose count of all the smaller storage startups that are out there, including some bigger names as well.
EMC continues to do well in the resulting technology continuum. We've got proven, mature storage platforms that enterprises trust day in and day out -- VMAX, VNX, etc. We've got excellent examples of where we've taken a new technology and made it a de facto standard (e.g. DataDomain). And we've got plenty of examples of new, disruptive technology we're working hard to turn into new de facto standards (ViPR, XtremIO and now ScaleIO).
In today's world, there is no simplistic "best storage" for everyone, just as there is no best computer, no best car, etc. It's usually a pragmatic choice, and not a religious or fashion statement. As a result, most enterprises use multiple storage solutions to get the job done. I would argue that storage diversity in the enterprise is increasing over time, as you might expect.
And with EMC's ScaleIO acquisition, I'm looking forward to having more interesting choices to consider.