As more and more people consider bigger and bigger workloads running in virtual containers, the discussion inevitably turns to I/O throughput and latency.
Now, VMware is no slouch at driving big I/O workloads. I shared one proof point from last year's testing in a previous post.
But today, VMware took a big leap forward in this regard with EMC's announcement of PowerPath/VE.
So, why should you care?
A Short Primer on MPIO
The acronym stands for "multi path I/O", and the concept has been around for many years.
From an over-simplified perspective, imagine a server making I/O requests to a storage device. If you only use one I/O path, every I/O request has to take its turn. Use 2, 3 or 4 I/O paths, and the situation gets better -- if, that is, you've got a piece of software that can schedule the right I/O for the right path at the right time.
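To make the "taking turns" idea concrete, here's a toy simulation (not PowerPath code -- just an illustration, with made-up numbers) of dispatching the same batch of I/O requests over one path versus four:

```python
import heapq

def total_time(service_times, num_paths):
    """Simulate dispatching I/O requests across num_paths parallel paths.

    Each request goes to whichever path frees up earliest; returns the
    time at which the last request completes.
    """
    paths = [0.0] * num_paths           # time at which each path becomes free
    heapq.heapify(paths)
    for t in service_times:
        free_at = heapq.heappop(paths)  # earliest-available path
        heapq.heappush(paths, free_at + t)
    return max(paths)

requests = [1.0] * 8                    # eight 1 ms I/Os
print(total_time(requests, 1))          # one path: 8.0 -- fully serialized
print(total_time(requests, 4))          # four paths: 2.0
</imports>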
As with most foundational ideas, it appeared first on mainframes (doesn't everything?) and then made its way over time to other environments.
Most operating systems (VMware, Windows, AIX, HP-UX, Solaris, etc.) have some base MPIO functionality that's oriented more towards availability than performance, e.g. if a path fails, reschedule the I/O on a surviving path. A few have a simplistic round-robin scheduling algorithm in an attempt to load balance, but these generally aren't too effective.
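A minimal sketch of that baseline behavior -- round-robin selection plus skip-on-failure -- might look like this (hypothetical class and path names, purely for illustration):

```python
from itertools import cycle

class RoundRobinMPIO:
    """Toy round-robin path selector with basic failover (illustrative only).

    Dead paths are skipped; new I/O simply lands on the next surviving path.
    Note there's no awareness of load -- which is why plain round-robin
    balances poorly under mixed workloads.
    """
    def __init__(self, paths):
        self.alive = set(paths)
        self._ring = cycle(paths)

    def next_path(self):
        if not self.alive:
            raise RuntimeError("all paths failed")
        while True:
            p = next(self._ring)
            if p in self.alive:
                return p

    def fail(self, path):
        self.alive.discard(path)

sel = RoundRobinMPIO(["hba0", "hba1", "hba2"])
print([sel.next_path() for _ in range(4)])  # ['hba0', 'hba1', 'hba2', 'hba0']
sel.fail("hba1")
print([sel.next_path() for _ in range(3)])  # hba1 skipped: ['hba2', 'hba0', 'hba2']
```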
EMC has been shipping PowerPath (advanced MPIO software) for many, many years. There are literally hundreds of thousands of licensed copies out there. PowerPath supports virtually every operating system, and arrays other than just EMC's.
If there ever was a de-facto standard for MPIO in the storage marketplace, it'd be PowerPath.
PowerPath has lots of nice features, and they work pretty much the same way everywhere you use it -- automatic restoration of failed paths, optional extras like in-line encryption, and data migrations -- very handy stuff to have in your utility belt.
But -- far and away -- the best part of PowerPath is that it makes I/O-bound applications run really fast.
PowerPath/VE == No Waiting
I/O profiles can be wild and unpredictable beasts. It's hard enough to nail down an I/O profile for a given application, but start consolidating multiple applications on a single server (think VMware), or -- better yet -- start moving workloads between servers using DRS, and -- well -- attempting to take a static, pre-determined approach to I/O optimization is almost futile.
That's where PowerPath shines -- it has a wide variety of optimization algorithms, and it adjusts dynamically based on what's happening Right Now, keeping the I/O paths continuously balanced. Keep in mind, this isn't primarily about bandwidth, it's mostly about response time -- which is what users see.
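To show why "adjusts dynamically" beats static round-robin, here's one simple stand-in for an adaptive policy -- send each new I/O to the path with the fewest outstanding requests. This is not PowerPath's actual algorithm (those are proprietary and more sophisticated); the class and path names are hypothetical:

```python
class AdaptiveSelector:
    """Toy 'least pending I/O' path selector (illustrative only).

    Unlike static round-robin, it reacts to current queue depth, so a slow
    or congested path naturally receives less new work.
    """
    def __init__(self, paths):
        self.pending = {p: 0 for p in paths}  # outstanding I/Os per path

    def dispatch(self):
        path = min(self.pending, key=self.pending.get)
        self.pending[path] += 1
        return path

    def complete(self, path):
        self.pending[path] -= 1

sel = AdaptiveSelector(["hba0", "hba1"])
sel.pending["hba0"] = 5          # hba0 is congested right now
print(sel.dispatch())            # hba1 -- new work avoids the busy path
```

The payoff of this style of policy is exactly what the paragraph above describes: when workloads shift (consolidation, DRS moves), the balance follows the load instead of a static, pre-determined plan.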
Better yet, if you determine that there's a bottleneck between server and storage, simply configure another path. PowerPath figures out the rest -- it's just about that easy.
So, What's The Impact?
YMMV (your mileage may vary), but a good rule of thumb is that we can see roughly 2x I/O performance in a serious, I/O-bound workload that's using 3-4 paths. Now, not everyone has serious, I/O-bound workloads that require this sort of aggregate plumbing, especially in VMware environments.
But, consider that vSphere is now capable of sporting a virtual machine that has 8 virtual CPUs, 256GB of RAM, and can drive 200k+ IOPS. I think that qualifies as a serious, I/O-bound workload.
VMware Grows Up
Sure, there's plenty of VMware usage that's comfortable with iSCSI, or NAS, or whatever -- no problem there.
But, if you've signed up for the "virtualize everything" mantra (we at EMC most definitely have!), you're going to want the ability to drive very serious workloads very efficiently.
And we're more than pleased to offer PowerPath/VE to help support people who really want to push the technology.
My Personal Wishlist?
For those of you familiar with how PowerPath works, you're aware that it uses a loadable module architecture that allows all sorts of nifty extensions to be dropped into the I/O path, usually non-disruptively.
An important one of those extensions is encryption -- using RSA engines and key management, you can encrypt storage on a per-server basis, which turns out to be incredibly useful in a wide variety of situations.
Now, how cool would it be if you could selectively encrypt virtual machines -- and their information stores -- as they move around and share resources with other workloads?
Here's hoping that we can do this sooner rather than later :-)