Are public clouds cheaper? Are private clouds cheaper? Under what circumstances is one form of cloud more cost-effective than another?
I don't think we'll ever have the definitive answer, but there's some interesting new data to contemplate, courtesy of the recent EMC analyst day.
For most people doing IT at any reasonable scale, the numbers show that private clouds can be significantly less expensive than public alternatives in many situations.
And that's *before* any concerns about availability, security, regulatory compliance, control, etc. are factored in.
What Brought This About
As part of David Goulden's strategy session, he shared EMC's views of where application workloads would likely reside in 2016.
As application workloads go, so go the storage requirements in support of them.
David's model presumes two broad categories of applications. The largest count -- by far -- are the traditional applications that are so familiar in enterprise settings: Oracle, SAP, Microsoft et al.
Growing nicely, too.
EMC forecasts that -- by 2016 -- a whopping 96% of traditional enterprise applications would either run in an enterprise-owned private cloud, or a "virtual private cloud" where private cloud resources had been transparently extended by an IT service provider.
For the newer "cloud apps" that are being created, it's an understandably different split, but our best forecast still shows an impressive 61% of cloud apps running in some form of a private cloud, vs. 39% running in a public cloud.
While there is a long list of potential reasons why someone might prefer running a workload in their own environment vs. a shared public one (security, control, compatibility, compliance, etc.), there's one factor that everyone pays attention to -- and that's cost.
A Disclaimer On These Numbers
This model was created by EMC for internal planning purposes -- a synthesis of many studies and data points. When you're doing internal planning, it pays to be brutally honest with yourself. To be clear, the numbers were not intended to be spun to score points with one audience or another.
They are modeled after the needs of moderate-to-heavy IT users, and not smaller or occasional consumers of IT services. They presume operational efficiency for both sides: public cloud operators as well as enterprises using private clouds.
No, I can't share all the detailed modeling and assumptions that went into each of these -- simply because it's proprietary. But I can share some reasonable logic as to why the numbers are the way they are.
And -- finally -- these numbers are not intended to make a statement regarding the actual costs a specific enterprise might experience. But hopefully the numbers presented here will incentivize people to go determine their own true costs for each approach.
The ERP Example
One very familiar workload in many organizations is ERP -- or its functional equivalent. So many businesses have a core application responsible for booking customer orders, shipping products and subsequent invoicing.
We're looking solely at infrastructure and related costs here.
It makes sense to start with the middle column shown -- infrastructure charges (in thousands of dollars per month) for a decently-sized, "always on" ERP environment driving significant average CPU usage -- say, 80% or greater.
Note the expected availability of 99.85% for the public cloud -- almost "three nines". The column on the right shows the costs when a second availability zone is configured -- more expensive in exchange for an availability level arguably somewhat greater than 99.85%.
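To put 99.85% in perspective, here is a quick back-of-the-envelope sketch. The function names are mine, and the two-zone figure assumes the zones fail completely independently -- an idealized assumption that real-world correlated failures will undercut, which is why "arguably somewhat greater" is the safer way to describe it.

```python
HOURS_PER_YEAR = 24 * 365

def downtime_hours_per_year(availability: float) -> float:
    """Expected hours of downtime per year for a given availability fraction."""
    return (1.0 - availability) * HOURS_PER_YEAR

def combined_availability(availability: float, zones: int) -> float:
    """Availability if the service is up whenever at least one zone is up.

    Assumes zone failures are statistically independent (an idealization).
    """
    return 1.0 - (1.0 - availability) ** zones

single_zone = 0.9985  # figure quoted in the text
two_zones = combined_availability(single_zone, zones=2)  # idealized upper bound

print(f"single zone: {downtime_hours_per_year(single_zone):.2f} hours down per year")
print(f"two zones (independent): {downtime_hours_per_year(two_zones):.4f} hours down per year")
```

At 99.85%, that works out to roughly 13 hours of expected downtime per year for a single zone.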
The third column on the right shows the monthly expenses if one were to use a private cloud approach. Representations are made for the monthly cost of compute, network, storage and administration. Power, cooling and facilities costs are factored into the costs of each infrastructure component.
Note how the compute cost is lower for the private cloud model.
While the big public cloud operators probably pay incrementally lower costs for their server iron, the more-than-offsetting difference is that an enterprise doesn't have to make a hefty gross profit "reselling" CPU cycles back to the business.
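The markup argument is easy to see with a little arithmetic. In this minimal sketch, the 10% hardware discount and the 40% gross margin are purely hypothetical numbers I picked for illustration -- they are not figures from the EMC model -- and the function name is likewise my own.

```python
def resale_price(unit_cost: float, gross_margin: float) -> float:
    """Price a seller must charge so that (price - cost) / price equals gross_margin."""
    if not 0.0 <= gross_margin < 1.0:
        raise ValueError("gross_margin must be in [0, 1)")
    return unit_cost / (1.0 - gross_margin)

# Hypothetical inputs (assumptions, not figures from the study):
enterprise_unit_cost = 100.0                      # enterprise's cost per unit of compute
operator_unit_cost = enterprise_unit_cost * 0.90  # operator buys hardware 10% cheaper
operator_price = resale_price(operator_unit_cost, gross_margin=0.40)

print(f"enterprise internal cost: {enterprise_unit_cost:.2f}")
print(f"public cloud price:       {operator_price:.2f}")
```

Under those assumed inputs, the operator's purchasing advantage is swamped by the margin it has to earn on top.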
Next up -- network costs: much lower for a private cloud approach -- for obvious reasons.
Beneath that -- note that the private cloud's monthly charges for storage are lower than for either public cloud alternative, single availability zone or otherwise. That might raise a few eyebrows, so let me speculate as to why that is.
First, as anyone associated with storage knows, the big cost driver is storage media. The big operators buy bare disks and plug them into servers designed to hold storage. And -- once again -- an enterprise operating a private cloud isn't required to gross up unit costs back to their internal users.
But I think there might be more here.
So many public cloud storage environments guard against storage media failure by simply making full, redundant copies of data. And, compared to the efficiency of a parity RAID approach, an awful lot of media gets wasted that way.
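Here is a rough sketch of the media-efficiency gap between full copies and parity protection. The specific choices -- three full copies on one side, an 8 data + 2 parity layout on the other -- are my own illustrative assumptions, not details from the study or from any particular provider.

```python
def usable_fraction_replication(copies: int) -> float:
    """Fraction of raw media holding unique data when every block is stored `copies` times."""
    return 1.0 / copies

def usable_fraction_parity(data_disks: int, parity_disks: int) -> float:
    """Fraction of raw media holding unique data in a parity group."""
    return data_disks / (data_disks + parity_disks)

def raw_tb_needed(usable_tb: float, usable_fraction: float) -> float:
    """Raw capacity that must be purchased to deliver `usable_tb` of unique data."""
    return usable_tb / usable_fraction

USABLE_TB = 100.0  # hypothetical amount of data to protect

replicated = raw_tb_needed(USABLE_TB, usable_fraction_replication(copies=3))
parity = raw_tb_needed(USABLE_TB, usable_fraction_parity(data_disks=8, parity_disks=2))

print(f"3 full copies:  {replicated:.0f} TB raw")
print(f"8+2 parity:     {parity:.0f} TB raw")
```

With those assumed layouts, protecting 100 TB takes about 300 TB of raw media with full copies versus about 125 TB with parity.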
Finally, administration costs are notably lower for public cloud than the private flavor -- although I would assume that the cloud operator has to price their internal administrative costs into CPU, storage and network -- further expanding the unit cost differences.
The infrastructure cost winner for ERP? A private cloud -- not only more cost-effective, but potentially more available, more secure, more compliant, etc.
"Tier 4" Storage
No, that's not an official designation -- just an informal term to refer to a large amount of storage with very modest levels of performance, as well as some measure of data protection to guard against disk failures.
Here, we've compared a public cloud storage offering against two EMC alternatives: a cost-effective configuration of a VNX5300 as well as a very small Isilon cluster.
The first cost differential that jumps out is network costs.
If you want to actually use your data, that's going to cost you money in addition to storage costs. Of course, if your intent was to write once, read never (WORN) the number would have to be adjusted accordingly.
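A simple way to see how data access changes the bill is sketched below. The per-GB prices and the capacity are hypothetical placeholders, not numbers from the study or any provider's price list; `read_fraction` is the share of stored data read back out each month, so the WORN case is simply `read_fraction=0`.

```python
def monthly_public_storage_cost(
    stored_gb: float,
    read_fraction: float,
    storage_price_per_gb: float,
    egress_price_per_gb: float,
) -> float:
    """Monthly cost = capacity charge + network charge for the data read back out."""
    storage_charge = stored_gb * storage_price_per_gb
    egress_charge = stored_gb * read_fraction * egress_price_per_gb
    return storage_charge + egress_charge

# Hypothetical inputs (assumptions for illustration only):
STORED_GB = 100_000.0
STORAGE_PRICE = 0.10  # dollars per GB per month
EGRESS_PRICE = 0.12   # dollars per GB transferred out

worn = monthly_public_storage_cost(STORED_GB, 0.0, STORAGE_PRICE, EGRESS_PRICE)
half_read = monthly_public_storage_cost(STORED_GB, 0.5, STORAGE_PRICE, EGRESS_PRICE)

print(f"write once, read never: ${worn:,.0f} per month")
print(f"half the data read:     ${half_read:,.0f} per month")
```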
Notice the difference in pure storage costs. Enterprise owners of private clouds don't have to mark up unit costs back to internal users. And both the VNX and Isilon support parity protection, making for more efficient use of storage media.
Hadoop Workloads Are Different
The difference hinges on whether the workload is "storage intensive" (lots and lots of data, proportionally modest amounts of compute) or "compute intensive" (heavy analytics and modeling with more modest amounts of information).
And the world does have both -- although storage-intensive Hadoop-style workloads seem to be in the majority as far as I can tell.
Private clouds show lower storage costs (likely for many of the same reasons shared above) -- compute is close to a wash -- but admin costs are notably higher for a private cloud.
But the tables are turned when the focus shifts to compute-intensive Hadoop-style workloads.
Their compute-intensive and inherently bursty nature puts a public cloud ahead: storage costs are less prominent; and being able to grab hundreds (or perhaps thousands) of cores on demand cements the case for a public cloud approach.
But there's a buried assumption here that's worth mentioning -- the assumption that, although bursty, you're keeping the CPU farm an average of 40% busy over time. As the number rises to, say, 50% -- the private / public cloud costs become a wash.
And, of course, if your average is less than 40%, public clouds will look even better.
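The utilization crossover can be sketched with a toy model: a private CPU farm sized for peak costs the same every month no matter how busy it is, while a public cloud bill scales with average utilization. Every number below is hypothetical, and I deliberately tuned them so the break-even lands at 50% to mirror the text -- they are not derived from the EMC model.

```python
HOURS_PER_MONTH = 730  # average hours in a month

def public_monthly_cost(avg_utilization: float, peak_cores: int,
                        price_per_core_hour: float) -> float:
    """Pay-per-use bill: only the core-hours actually consumed are charged."""
    return avg_utilization * peak_cores * HOURS_PER_MONTH * price_per_core_hour

def break_even_utilization(private_monthly_cost: float, peak_cores: int,
                           price_per_core_hour: float) -> float:
    """Average utilization at which the public bill equals the fixed private cost."""
    full_time_public_cost = peak_cores * HOURS_PER_MONTH * price_per_core_hour
    return private_monthly_cost / full_time_public_cost

# Hypothetical inputs (assumptions chosen to reproduce a 50% crossover):
PEAK_CORES = 1000
PRICE_PER_CORE_HOUR = 0.10       # dollars
PRIVATE_MONTHLY_COST = 36_500.0  # dollars, fixed regardless of utilization

for u in (0.30, 0.40, 0.50, 0.60):
    public = public_monthly_cost(u, PEAK_CORES, PRICE_PER_CORE_HOUR)
    cheaper = "public" if public < PRIVATE_MONTHLY_COST else "private (or a wash)"
    print(f"{u:.0%} busy: public ${public:,.0f} vs private ${PRIVATE_MONTHLY_COST:,.0f} -> {cheaper}")

print(f"break-even: {break_even_utilization(PRIVATE_MONTHLY_COST, PEAK_CORES, PRICE_PER_CORE_HOUR):.0%}")
```

The useful part is the shape of the calculation rather than the inputs: plug in your own fixed monthly cost and your own per-core-hour price to find where your crossover sits.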
What Does All This Mean?
It's helpful to think of things in parts.
Public cloud operators have to mark up their unit costs to earn a gross profit. Private clouds don't have that requirement.
Then there's storage -- it's not a variable resource the way CPU or perhaps network bandwidth can be. Once you write data to it, it's yours and no one else's -- no sharing! And public clouds don't typically support parity protection mechanisms; private clouds do.
But compute can be a variable resource -- whether it tips the equation in favor of a public cloud approach over a private cloud boils down to (a) the relative proportion of compute vs. storage, and (b) the burstiness of the workload.
For many enterprise workloads, compute is not exceptionally intense, and its variability is not the dominant factor in the overall cost equation -- ERP is the example given here, but there are many others. In those cases, compute in a private cloud can be cheaper than compute in a public one.
But for very specific bursty workloads (compute-intensive Hadoop-style applications are the example here), public clouds can offer a cost advantage.
To be fair, this comparison presumes that a private cloud can be operated with similar efficiency to a public one -- and that involves an organizational and operational model that's proving to be so difficult for so many IT organizations.
The One Thing To Take Away
I'm not suggesting anyone take this internal EMC study as the final word on any of this.
If anything, this sort of analysis should provoke anyone running IT at moderate scale to understand *their* unit costs to deliver an IT service.
And that's a discussion that's always worth having :)