An excellent example of this "unlearning storage" was buried in an Atmos announcement from EMC World. In particular, Atmos storage now runs nicely in a VM. Not just for eval, for production.
And there are some interesting implications as a result.
I find myself questioning my own storage assumptions these days, and I've been playing with this stuff for over 20 years.
Let's make a short list of default assumptions that are up for grabs now.
Storage is spinning disk, right?
Not when you consider the advent of technologies like flash and FAST. Storage is becoming a semiconductor device, and less a pile of rotating rust.
Lots of spindles means lots of performance, right?
Nope. The old-school performance tricks -- spreading the IOPS load, using the sweet spot of a physical disk -- are all becoming much less relevant given the new technology.
Logical and physical capacity are pretty much the same thing, right?
No, widespread use of compression, deduplication, thin provisioning, spin-down, etc. means that the gap between "what we see" and "what's actually there" will continue to widen.

Storage lives in one location, right?
Not always, not when technologies like distributed cache coherence and global federation (e.g. VPLEX) are fully considered. Storage lives where it needs to, including potentially multiple places at the same time.
Storage administration is done by storage administrators, right?
Well, that's changing too. More and more of what we've always considered "storage administration" is moving to other places in the stack -- the VMware administrator, the application administrator, or the unified infrastructure administrator.
And storage, well, that's about hardware, right?
Maybe yes, maybe no. After reading this next bit, you can be the judge of that.
Storage As A Virtual Machine?
Deconstruct just about any storage array, and you'll find familiar components: storage media, processors, memory, motherboards, power supplies, etc. Indeed, most storage arrays are built out of largely the same parts bin that the server guys use.
Much of the differentiation in storage comes from its software. And anything that can run in software can be virtualized.
Which brings up the question: will the future bring more "storage solutions" that are nothing more than VMs running on a pool of virtualized hardware resources?
Before we get started: when we toss around the word "storage", we're actually talking about an incredibly broad set of use cases. For some of them, the approach of "storage software in a VM is your storage" is very attractive. For others, it's a bit more difficult.
Consider the Celerra VSA
Many of you are familiar with the Celerra Virtual Storage Appliance. It's a virtual machine that turns arbitrary block storage into a fully featured unified storage environment. It's neither HA nor as performant as the physical Celerra, which is why it's an eval-only offering, and not promoted as a production solution.
But what if you had another approach to HA that didn't involve hot-failover of controller blades to shared RAID? Or you had another way of offering the required performance that didn't require dedicated physical hardware?
Or, perhaps, your interest is more along the lines of EMC's FMA (File Management Appliance), which does policy-based tiering, movement, archiving, etc. -- now available as a VM for production. The virtual edition does everything the physical version does, only it does it using virtualized (vs. dedicated) resources.

Now it gets interesting, doesn't it?
Enter Atmos Virtual Edition
Ask people about Atmos storage, and most will envision big pools of commodity hardware that are geographically dispersed. Well, that's not entirely accurate.
Atmos is essentially software. Developed by a crack team of EMCers, it extends the object-oriented paradigm of storage initially found on Centera into the cloud.
If you're not familiar with Atmos, think in terms of a global, distributed repository of content objects accessed using familiar RESTful protocols. Policy associated with the metadata dictates how the information is stored, protected and secured: how many copies, disks spun up or down, compressed or not, multiple locations or not, encrypted or not, audit and compliance trails, and so on.
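To make that concrete, here's a minimal sketch (in Python, using the requests library) of what a policy-driven object write might look like. The host name is made up, and I've left out the per-request signing that a real Atmos deployment requires -- the point is simply that the payload is just bytes, and policy rides along as metadata tags.

```python
import requests

# Hypothetical endpoint; a real deployment also requires a UID header
# and an HMAC signature on every request, omitted here for brevity.
ATMOS_ENDPOINT = "https://atmos.example.com/rest/objects"

def store_object(data, policy_metadata):
    """Write an object; policy hints ride along as user metadata."""
    # Atmos-style user metadata travels as comma-separated name=value
    # pairs in a single header; the policy engine matches on these tags.
    meta = ",".join("%s=%s" % (k, v) for k, v in policy_metadata.items())
    resp = requests.post(
        ATMOS_ENDPOINT,
        data=data,
        headers={
            "Content-Type": "application/octet-stream",
            "x-emc-meta": meta,
        },
    )
    resp.raise_for_status()
    return resp.headers["Location"]  # the new object's ID

# Tag the object so policy can decide what to do with it: archive tier,
# geo-dispersed, retained for seven years. Tag names are illustrative.
object_id = store_object(
    b"archived project files...",
    {"tier": "archive", "geo": "dispersed", "retention": "7years"},
)
print("stored as", object_id)
```

Notice what's missing: no LUNs, no RAID groups, no mount points. The application states intent; the policy engine figures out the rest.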
Early this year, Atmos added the GeoProtect function, essentially bringing parity RAID concepts to geographically dispersed data. As one result, we can argue that the cloud is actually more resilient against multiple failures than traditional approaches.
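A quick back-of-the-envelope sketch shows why. The numbers below are illustrative, not GeoProtect's actual parameters: assume each site is independently unavailable some small fraction of the time, and compare simple mirroring with a k-of-n fragment scheme.

```python
from math import comb

def p_data_survives(n, k, p_site_fail):
    """Probability that at least k of n independently failing sites survive.

    A GeoProtect-style scheme cuts an object into n fragments (with
    parity) such that any k suffice to reconstruct it -- the same idea
    as parity RAID, applied across sites instead of spindles.
    """
    q = 1 - p_site_fail  # probability a given site is up
    return sum(comb(n, i) * q**i * p_site_fail**(n - i)
               for i in range(k, n + 1))

p = 0.01  # assume each site is independently unavailable 1% of the time

# Two full copies at two sites: data is lost only if both sites are down.
print("2 mirrored copies :", 1 - p_data_survives(2, 1, p))

# 9 fragments, any 6 reconstruct (illustrative numbers): tolerates any
# three simultaneous site failures, at 1.5x capacity overhead vs. 2x.
print("6-of-9 fragments  :", 1 - p_data_survives(9, 6, p))
```

Even with multiple entire sites down, a 6-of-9 object is still fully reconstructable -- something two mirrored copies can't claim, and at lower capacity overhead to boot.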
Atmos Virtual Edition is nothing more than the Atmos software running in a virtual machine. As a result, it can use existing server and storage resources (including EMC storage arrays) to do its thing.
It overcomes the HA challenge mentioned above partly by using VMware's HA model and externally reachable storage. And, as mentioned above, it doesn't have to use traditional RAID to protect against storage media failures; it can use a geographically dispersed approach instead.
So, how does that sort of thinking change the equation?
Cost Of Entry vs. Cost At Scale
If we were simply looking at cost-to-serve at significant scale, I could make a strong argument that a purpose-built Atmos storage cloud could be more cost-effective than using Atmos Virtual Edition.
But that presumes that you've got an Atmos environment that demands scale. And, when new technologies are entering the market, cost of entry can be more important than cost at scale. Things have a way of wanting to start small, and then get big if they're popular.
With Atmos VE, we now have an attractive cost-of-entry proposition. Use available virtualized resources (server, storage, etc.) in two or more locations by simply firing up Atmos virtual machines. Offer the service to your users or clients. See how it goes.
If you get lucky and need more performance, simply throw more resources at existing VMs, fire up additional ones, or move to faster storage. Want to get more information closer to your users? Fire up an instance close to them. Want more geographically removed data protection? Fire up an instance that's very far away.
Offering Atmos-style cloud storage services now becomes a completely different economic proposition on the cost-of-entry side.
If you're an enterprise that owns multiple data centers (or even IT closets!), you now can start offering an internal Atmos-style cloud storage service for not much effort. If you're a service provider, you now can offer the same without requiring dedicated infrastructure -- it runs nicely as yet another task on a Vblock, for example.
Either way, get in cheap and easy, see if it grows, continue to add virtualized resources if you want to, or start adding in dedicated infrastructure alongside virtualized infrastructure.
And that's the power of offering functionality (including certain forms of storage) as a virtual storage appliance -- it tends to lessen the friction associated with new technology adoption.
Stepping Back A Bit
I'm sure that many in our industry will eventually start debating whether or not storage functionality implemented as virtual machines on pooled assets is "better" or "worse" than traditional array-based approaches.
Give it a while, and it's inevitable. And you know that our industry loves a good, spirited debate.
That discussion -- when it happens -- will miss the point as far as I'm concerned. Storage software running on a VM using pooled resources will be just another option to consider when offering storage services. Decide what functionality you want, and then decide whether you want a virtualized or physical resource approach -- or some combination of the two.
I think everyone knows that EMC's entire storage-related portfolio is now Intel based. And anything that's Intel based can theoretically be virtualized, no?

So, anyone want to hazard a guess as to how many of these "storage as a virtual machine" options we're going to see before too long?
For me, it's just another step along the journey of unlearning everything I know about storage.