At one time, almost all storage was directly attached to servers. Whether that was through disk drives in the server enclosure itself, or external racks directly connected, there was a clear 1:1 mapping between server, application and dedicated storage resources.
Over the last twenty years, the pendulum has swung in the opposite direction: intelligent external storage arrays that share resources across multiple applications and servers. I can still remember back in 1995 trying to convince people that shared storage was better than the dedicated kind, and it certainly wasn't easy :)
But now, there are clear signs that the pendulum has started to move back: intelligent shared storage, but without the familiar external storage array.
And I've now been asked more than once -- are servers the new storage?
The idea of using familiar, commodity-based servers to provide shared storage services has repeatedly become popular and then faded. But it's not the same this time: the motivations are very different.
Why Would You Want To Use Servers As Storage?
Take any tour of the storage array marketplace, and you'll find many designs that are achingly familiar: two or more server nodes with low-latency connections and intelligent software that turns commodity hardware into a useful storage array.
For example, at the recent VMworld there were perhaps 20 different storage array vendors, and my informal survey had most offerings using this approach.
But it's important to note that the model hasn't changed: storage array vendors still sell and support a box, it's just made of familiar components.
What about an entirely different model -- one where customers source their own hardware, and simply install software to transform servers into shared storage?
I think there are four powerful forces that virtually guarantee we'll see much more of this before long.
The Performance Argument
Storage performance these days boils down to flash -- pure and simple. You pay the extra money to get great gobs of speed.
Put a small amount of flash in an external hybrid array, and most applications get a significant performance boost, depending on how effective the caching and tiering algorithms are. Go farther and create an all-flash external array, and you get predictably stellar performance for all applications that use it.
But put that same flash on a server bus, and it gets much faster and cheaper than any external array: lower latency, and no need for a dedicated enclosure. Move that same flash technology (or a successor, like PCM) to the motherboard, and it gets faster and cheaper still.
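To make the "closer is faster" intuition concrete, here's a toy comparison. The latency figures are purely illustrative assumptions, not benchmarks of any particular product:

```python
# Toy model: effective read latency by flash placement tier.
# All figures are illustrative assumptions, not measurements.
LATENCY_US = {
    "external_array_flash": 500.0,  # network hop + controller + media
    "server_pcie_flash": 100.0,     # local bus + media
    "motherboard_flash": 20.0,      # near-CPU placement (e.g. a PCM-like successor)
}

def speedup(baseline: str, candidate: str) -> float:
    """How many times faster the candidate tier is versus the baseline."""
    return LATENCY_US[baseline] / LATENCY_US[candidate]

for tier in ("server_pcie_flash", "motherboard_flash"):
    factor = speedup("external_array_flash", tier)
    print(f"{tier}: {factor:.0f}x faster than external array flash")
```

Even with generous assumptions for the external array, the fixed network and controller overheads dominate, which is why each step closer to the CPU pays off so disproportionately.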
Flash wants to be as close to the CPU as possible, for all the right reasons. Like many, I see Cisco's recent acquisition of Whiptail through this lens.
Storage software is needed to make all of this work, of course. Almost all of the software solutions today use server flash for cache, and don't try to deal with the complexities of persistent, resilient shared storage. But that will change before long.
Bottom line: flash economics will strongly favor a server-resident approach over time. And that means that technology decisions and implementations will be mostly owned by server teams, and not storage teams.
The Convergence Argument
When well-established technology categories collapse into a single entity, we call that convergence. Convergence is generally a good thing for IT customers: fewer moving pieces, integrated workflows, etc.
As an example, consider VMware's VSAN, now in beta.
The hardware is most certainly converged -- there's no need for dedicated storage hardware or resources, it's the exact same stuff you use for compute. But -- more importantly -- the operational model is converged as well. Storage is now a simple extension of what VMware administrators do day-in and day-out. No real need for a dedicated storage team.
The Pooling Argument
Today, we generally have one pool of resources for compute, and a separate pool of resources for storage. What if they were the same pool? The bigger the pool, the better you can do at optimizing resources. Look inside all those dedicated storage arrays, and you'll usually find a boatload of dedicated compute, memory and network ports.
The ability to consider both server resources and storage resources as one, shared compatible pool -- well, that's an attractive proposition from both an economic and operational perspective.
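A quick back-of-the-envelope sketch of why bigger pools optimize better: separate pools must each be sized for their own peak, while a shared pool only needs to cover the combined peak, which is smaller whenever the peaks don't coincide. The demand numbers below are made up for illustration:

```python
# Illustrative pooling arithmetic. Demand per time window, arbitrary units;
# the figures are assumptions chosen to show non-coinciding peaks.
compute_demand = [60, 80, 40, 70]
storage_demand = [30, 20, 60, 25]

# Dedicated pools: provision each pool for its own worst case.
dedicated = max(compute_demand) + max(storage_demand)

# Shared pool: provision for the worst *combined* demand in any window.
shared = max(c + s for c, s in zip(compute_demand, storage_demand))

print(f"dedicated capacity needed: {dedicated}")
print(f"shared capacity needed:    {shared}")
```

Here the shared pool needs about 30% less capacity for the same workloads, and the gap generally widens as more independent workloads join the pool.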
The Simplicity Argument
Anyone who's worked in a multi-vendor environment realizes that -- from an operational perspective -- the less diversity, the better.
Being able to use the same set of building blocks for both your server farm as well as your storage farm -- well, all things being equal -- that's a win. Fewer vendors to deal with, greater buying power, standardized tools and operational processes, etc.
The Counter Arguments
Where this line of thinking tends to break down is when considering very large amounts of storage.
Data volumes in the data center continue to outstrip media density advances -- even with efficiency techniques like dedupe -- so it's a safe bet that we'll see many more spinning disks in the future.
Simply put, standard server designs aren't optimized to house very large quantities of spinning disks. They're usually designed for efficient compute, with internal storage as a far lower priority. The density isn't there, the RAS isn't there, the power and cooling aren't there, and so on. As a result, they can be very, very inefficient at scale.
As evidence, consider the latest wave of new mid-tier storage arrays, where it's commonplace to rack 1000 or more drives behind a pair of storage controllers, all in a maximally dense enclosure. Try doing that using commodity servers for storage, and you'll usually end up with a sub-optimal solution.
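The density gap is easy to sketch with rough numbers. The chassis figures below are illustrative assumptions, not any vendor's specs:

```python
# Illustrative density math: drives per rack for a dense storage enclosure
# versus commodity 2U servers. All chassis figures are assumptions.
RACK_UNITS = 42
jbod_drives_per_4u = 60       # assumed dense, purpose-built enclosure
server_drives_per_2u = 12     # assumed commodity server chassis

jbod_rack = (RACK_UNITS // 4) * jbod_drives_per_4u
server_rack = (RACK_UNITS // 2) * server_drives_per_2u

print(f"drives per rack, dense enclosures:  {jbod_rack}")
print(f"drives per rack, commodity servers: {server_rack}")
```

Under these assumptions the purpose-built approach fits more than twice the drives in the same floor space, before even counting the power, cooling, and serviceability advantages of a chassis designed for disks first.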
It seems pretty clear to me that -- at scale -- we'll see future storage architectures fall into two distinct categories, each optimized for purpose.
For performance, we'll see server-resident flash, augmented by low-latency interconnects and smart storage software. That's where you're going to get your best bang-for-the-buck with IOPS. For capacity, we'll see vast, scale-out farms of purpose-built storage controllers (albeit with familiar components) that deliver amazing capacity and bandwidth as needed.
Presumably, the two domains would communicate closely, moving data back and forth using a variety of services and semantics.
Are Servers The New Storage?
At decent scale, the answer appears to be "yes", especially for the performance tier, driven by the need for cost-effective flash very close to the application.
At more modest scale, though, the answer is more "I'm optimistic". The arguments are powerful, but you never know for sure until potential customers check in. Personally, I'm watching to see how people react to the VSAN beta -- as it puts into play all of the core arguments I listed above.
One thing is for sure -- there will always be more data to deal with, which means we'll always be looking for better ways to deal with it.