I always warn people -- be careful of what you ask for, you just might get it.
For years, various competitors have heckled EMC that we ought to be more active and participatory with regard to public storage benchmarks.
There's a lot of juicy insights here -- in addition to the predictable bragging rights -- so let's dig in and see what we've got.
Benchmarks Can Be A Mixed Bag
My real beef with public storage benchmarks over the years is that they're not worth the trouble unless they reflect real workloads and use cases that customers actually care about.
Although I've been critical of the SPC, I've always felt differently about the SPEC -- it's always seemed to be a cleaner, less-manipulatable benchmark that does a reasonable job of reflecting real-world use cases. And, yes, EMC has submitted SPEC results from time to time.
The best way to think about the SPEC test is in terms of simulating a "file workflow".
Imagine you're running a big filer to support, for example, geophysical exploration, or image processing, or perhaps digital content creation. At one level, you're sort of using the file system as a database. The workflow changes the state of the file system metadata frequently, in addition to the usual reads and writes.
Nick Kirsch, our director of product management for the Isilon team, tells me that the SPEC workloads do a decent job of capturing the "file-based workflow" use case.
The tests are heavily biased towards metadata operations against the file system: create, open, close, rename, move, stat, etc. The resulting traditional I/O operations also have a heavy sequential bias, e.g. dig into this file midway and read a few megabytes, or append this data sequentially at the end of an existing file.
The metadata traffic is significant, the I/O transfers are large, and they're generally sequential.
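As a rough illustration of that operation mix, here's a toy sketch in Python. To be clear: this is not the actual SPECsfs2008 load generator, and the file count and sizes are made up; it just echoes the kinds of operations described above (create, append, rename, stat, then a mid-file sequential read):

```python
import os
import tempfile

def run_file_workflow(root, n_files=50):
    """Toy metadata-heavy workload: create, append, rename, stat,
    then seek into the middle of a file and read -- loosely echoing
    the mix described above (NOT the real SPEC load generator)."""
    ops = 0
    for i in range(n_files):
        path = os.path.join(root, f"file_{i}.dat")
        # create + large sequential write (64 KB payload)
        with open(path, "wb") as f:
            f.write(b"x" * 64 * 1024)
        ops += 2                      # open/create + close
        # metadata churn: rename and stat
        new_path = os.path.join(root, f"file_{i}.ren")
        os.rename(path, new_path)
        os.stat(new_path)
        ops += 2
        # dig into the file midway and read sequentially
        with open(new_path, "rb") as f:
            f.seek(32 * 1024)
            f.read(16 * 1024)
        ops += 2                      # open + close (read pass)
    return ops

with tempfile.TemporaryDirectory() as d:
    total_ops = run_file_workflow(d)
    print(total_ops)
```

Notice that most of the counted operations touch metadata rather than moving data -- which is exactly why SPEC "IOPS" and plain block IOPS are such different animals.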
You get back two numbers to compare: first, the number of file system operations per second (IOPS), and second, the average response time per operation.
SPECsfs2008 (the official name of the benchmark) comes in two flavors: one for NFS file systems, and one for CIFS (i.e. Windows-friendly) file systems.
Fair warning: the results can't be compared against each other, nor should they be compared against earlier versions of the SPEC test. And, no, please don't assume that SPEC IOPS are anything like plain, ordinary block IOPS :)
The EMC/Isilon results clearly overwhelm every other result to date, in a couple of interesting ways.
- The absolute numbers are much higher (e.g. we're now talking over a MILLION IOPS, something that's brand new)
- The bread-and-butter Isilon model was used (the S200), not the ultra-high-performance version, so if we need to come back and offer up better numbers at some point, we can -- using existing products
- The results were achieved on completely standard versions of the product -- no lab queens!
- There was no tweaking or tuning on performance, they're largely "out of the box" numbers
- They show predictable and linear scalability from the very small to the very large
- It was all done on a SINGLE file system, vs. the usual aggregation of multiple ones
All the various charts are below for your perusal. So, let's dig in to each piece for a moment.
How Much Faster?
The results -- for both NFS and CIFS -- are *substantially* faster than any other submission to date. Not a little bit, a whole lot. That includes EMC's previous record-breaking submissions using the more traditional VNX-based technology.
The Isilon folks are showing the results two ways: first, in a more traditional "aggregate performance" view, where you don't get any credit for doing it all on a single file system. The second, even-more-favorable way shows the results on a per-file-system basis.
The ability to do everything in a single, consistent file system matters to many, especially at scale -- we'll talk more about that a bit farther on in this post.
And Using A Standard Model?
Yep. The results were achieved using off-the-shelf S200 modules. Each module had a single SSD for metadata management, plus 23 standard 300GB 10k SAS drives for data storage. There are no great gobs of flash, and no more exotic 15K disk drives either.
It's exactly the same sort of balanced capacity-vs-performance model that many of our customers are choosing for their production environments. Take a look at the configuration reports for the benchmarks -- they're about as clean and simple as any you're likely to see from any vendor.
No lab queen here, folks.
And No Tweaking?
I asked Nick exactly how much customization and optimization they had to do to achieve these numbers.
The answer was simple: none.
They did spend some time setting up the environment with instrumentation (using Isilon's InsightIQ) to understand what performance they were getting, and they did play around with a few different protection options, but there's no long list of parameters and procedures to get these results.
Basically, they plugged the stuff in, turned it on, and got the numbers.
One of the beautiful things about the Isilon scale-out shared-nothing approach is its linear scalability. The Isilon team submitted multiple results, each using a different number of nodes. I think the NFS results show what I'm talking about -- the "IOPS per module" number stays basically flat from the smallest to the largest configurations.
Use a little, use a lot -- performance scales linearly and predictably with capacity. Just as you'd want.
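To see what "flat IOPS per module" means in arithmetic terms, here's a trivial check using hypothetical numbers (illustrative only -- these are not the published SPECsfs figures). If throughput scales linearly, the per-node figure stays constant no matter how many nodes you add:

```python
# Hypothetical aggregate IOPS at different cluster sizes
# (illustrative numbers only -- NOT the published SPECsfs results).
results = {7: 210_000, 14: 420_000, 28: 840_000, 56: 1_680_000}

# Divide each aggregate by its node count to get IOPS per module.
per_node = {n: total / n for n, total in results.items()}
print(per_node)

# Linear scaling means the per-node number is (near-)constant:
values = list(per_node.values())
assert max(values) - min(values) < 0.05 * values[0]
```

A non-linear architecture would show that per-node number sagging as the cluster grows -- which is precisely what the flat line in the NFS results argues against.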
Not every environment starts out large, but many end up that way. I think it's great to know that you can grow capacity and performance in bite-sized chunks and with predictable results. If you've ever added capacity to an Isilon system, it's painfully boring: cable up the new module, and basically stand back and let the magic happen -- all the resources are transparently and nondisruptively rebalanced.
It's one of the more compelling demos in storage-land.
The Beauty Of A Single File System
Any time you've got multiple containers (such as file systems), you've got to make some hard decisions as to what goes where. If things change, you've got to move things around and rebalance. And, since we're talking petabyte-scale, that in itself can be a massive effort.
With OneFS, everything (performance, ports, capacity, etc.) is one, massive, uniform and autobalanced file pool. Nothing -- but nothing -- could be simpler if we're talking files.
And, as I'm starting to discover, utter simplicity really, really matters at scale :)
Why This Matters For Customers
If you're dealing with big data, or just the proud owner of way too many traditional filers (!), you'll probably care about these results. Not only the numbers themselves, but how utterly simple it was to achieve them. The hardest part was probably rounding up all the equipment.
The Isilon results show what can be accomplished with scale-out, shared-nothing NAS using commodity technology and purpose-built software -- in this case, OneFS. It's hard to imagine a more traditional filer approach getting anywhere close to these results. We'll see how long these results stand :)
If you're a smaller NAS vendor in the storage industry -- and you plan on selling to these demanding customers -- you've got some serious work ahead of you to catch up.
The race is on ...
(charts, links, etc. below)
A nice graph showing the flat linearity of performance per node
And, if you're still reading, you might be interested in this writeup from ESG.