For decades, we’ve all been intrigued by the potential of scale-out architectures.
Can low-cost computing and storage components be lashed together in some fashion to deliver the advantages of a big machine with reduced costs?
I was first exposed to this line of thinking in the server business maybe 20 years ago. It has come and gone in different forms, but in the last few years has re-emerged as a potentially better way to build storage farms.
Is it? Or is it not?
It depends on what you’re looking for, so let’s dive in …
The Premise
Lots of variations on what appears to be a single theme.
The ingredients are usually the same:
- Commodity components, e.g. servers and included storage
- Commodity interconnect, e.g. Ethernet or InfiniBand
- Vendor-supplied software layer that does the magic.
The server/storage discussion is pretty well understood. Servers are reasonably priced, as are the storage devices they contain. Not much new here (sorry, Sun!)
I’m not going to wade into the interconnect debate – I have my opinions, but they’re not relevant here.
The more interesting discussion is around the software layer that transforms disconnected components into a cohesive whole.
Maybe it’s a storage operating environment, a clustered file system, or perhaps something similar.
Or perhaps something with more of an application bent – Oracle 10g RAC, the new Exchange 2007, or perhaps a search or data warehousing layer?
Underneath this all, there are really three questions …
- Can the architecture do the job?
- Can the architecture do it more cost effectively?
- Does using the architecture create new concerns?
Can Scale Out Do The Job?
It depends, doesn’t it?
Traditionally, scale-out architectures do best when the application work (and data access patterns) are uniformly random and partitionable. The problem is that all of the examples given above can be used two ways.
Some use cases are predictably random. These do well. All the various components are merrily doing their thing, and life is good.
Some uses of the exact same database, filesystem, storage operating environment, etc. tend to beat up on one piece of the data or another. These do only as well as the individual processing node, while the rest sit idle. Not good.
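To put rough numbers on that difference, here is a minimal sketch in Python. It is a toy model, not a benchmark of any real product: the node count, the key space, the modulo placement rule and the 80% hot-key fraction are all assumptions I picked for illustration. It treats the busiest node as the bottleneck, so the "effective speedup" is total work divided by the work landing on that one node.

```python
import random
from collections import Counter


def effective_speedup(requests, num_nodes):
    """Total requests divided by the load on the busiest node.

    Keys are placed on nodes with a simple modulo rule (an assumption;
    real systems use their own placement schemes).
    """
    load = Counter(key % num_nodes for key in requests)
    return len(requests) / max(load.values())


def uniform_workload(count, num_keys, rng):
    """Every key is equally likely to be requested."""
    return [rng.randrange(num_keys) for _ in range(count)]


def hot_spot_workload(count, num_keys, rng, hot_fraction=0.8, hot_key=0):
    """A fixed share of requests hits one key; the rest are uniform."""
    return [
        hot_key if rng.random() < hot_fraction else rng.randrange(num_keys)
        for _ in range(count)
    ]


rng = random.Random(42)  # fixed seed so the run is repeatable
NODES = 8                # hypothetical cluster size
REQUESTS = 100_000
KEYS = 10_000

uniform = effective_speedup(uniform_workload(REQUESTS, KEYS, rng), NODES)
skewed = effective_speedup(hot_spot_workload(REQUESTS, KEYS, rng), NODES)

print(f"uniform workload:  about {uniform:.1f}x one node")
print(f"hot-spot workload: about {skewed:.1f}x one node")
```

Under these assumptions the uniform case lands close to the full eight nodes' worth of work, while the hot-spot case barely beats a single node, because the other seven mostly wait.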
The tough part is that most environments aren’t always well behaved, are they?
You’ll be happy when they act as you expected, and unhappy when they don’t. Just how unhappy – well, that depends, doesn’t it?
Build fatter, more powerful nodes? Build more intelligent, powerful interconnects? Or uber-sophisticated software that tries to ensure that things stay random and spread out?
Yes -- potentially -- but that sort of thing tends to get you away from the low-cost, commodity angle you’re trying to play.
One special case of this concern in a storage discussion has to do with write cache.
Generally speaking, server write cache is volatile (meaning it can go away without warning), so people are loath to expose themselves to this risk. Storage cache, by comparison, is generally non-volatile, which means it can safely be used to accelerate writes.
This translates into a substantial performance difference on writes in many environments. If write performance isn’t a concern, or you’ve convinced yourself that you can uniformly spread it out, fine. Otherwise, don’t be surprised when it runs slow.
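A back-of-the-envelope way to see the write cache effect, sketched in Python. The latency figures and the 90% absorbed fraction below are made-up illustrative values, not measurements from any array or server. The model simply assumes a write is acknowledged either when it lands in non-volatile cache or when it reaches disk.

```python
def mean_write_latency_ms(absorbed_fraction, cache_ms, disk_ms):
    """Average write latency when a share of writes is acknowledged
    from non-volatile cache and the rest must wait for disk."""
    if not 0.0 <= absorbed_fraction <= 1.0:
        raise ValueError("absorbed_fraction must be between 0 and 1")
    return absorbed_fraction * cache_ms + (1.0 - absorbed_fraction) * disk_ms


# Hypothetical figures, chosen only to show the shape of the tradeoff.
CACHE_MS = 0.5  # assumed time to acknowledge a write from cache
DISK_MS = 8.0   # assumed time to acknowledge a write from disk

no_cache = mean_write_latency_ms(0.0, CACHE_MS, DISK_MS)
with_cache = mean_write_latency_ms(0.9, CACHE_MS, DISK_MS)

print(f"no safe write cache:   {no_cache:.2f} ms per write")
print(f"90% absorbed in cache: {with_cache:.2f} ms per write")
```

The point is the shape, not the numbers: every write that cannot be safely acknowledged from cache pays the full disk price, and the average follows.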
Can Scale Out Do The Job More Cost Effectively?
Again, tough to nail down.
Most people point to the inherent lower cost of commodity components. Not to rain on anyone’s parade, but if you look inside most modern storage devices, you’ll see a lot of the same types of components in use. No real savings there.
Other folks point to the fact that you can buy these components from just about anywhere, so they’re very efficiently sold at good prices.
Fair enough, you can do that. And you’ll have to integrate it yourself, and – of course – support is now *your* problem. Vendors charge for that value-added service. No surprise here.
Personal footnote: I used to handpick my PC components and try to build best-of-breed systems for home use. Today, I drop by the local Circuit City and pick something out after doing a bit of research. Not as much geek cool, but my life is much easier now.
And then there’s the cost of the magic software layer. Ah, yes. No getting around that, unless you’re up for a real adventure.
My view is that – rarely – is there a free lunch out there. Although sometimes the nicer vendors will take you to lunch ;-)
The sidebar to this discussion is that sometimes someone puts a workload on a very expensive piece of computing or storage technology, and later finds out that it could be done much more cheaply on a different sort of architecture.
Makes good news in the industry press, but sometimes I think it’s more of a statement about the IT guy than the technology.
Does Using The Architecture Raise New Concerns?
Yes, it does – for most people. I’m going to take it from a storage angle, since that’s what I’m most familiar with. And, just to be fair, I’m going to use EMC’s Centera as an example (sorry, guys!)
Centera, as you might know, pretty much defines the CAS (content-addressed storage) segment of the storage market. It’s used as an active archive when you want long-term retention, online access, minimal management effort and compliance-related features.
Very successful product to this day.
If you look at the hardware, you’ll see some familiar elements: basically a rack of dense servers with internal storage, using Ethernet as the interconnect. Classic scale-out architecture, in this case marketed as RAIN – redundant array of inexpensive nodes.
And, of course, all the magic comes from the software layer.
But taken together, this is a tightly integrated stack that takes a few options off the table that are worth considering.
First, if you’d like to pool your storage between CAS uses and non-CAS uses, well, we can’t do that. You can pool non-CAS uses, and that’s about it.
If you want to do storage management with CAS, you’ll have to use the Centera-supplied tools. Sure, at some level it integrates with ControlCenter, but you can’t manage it with the same tools you’d use for a CLARiiON or Symmetrix.
If you want to do remote replication, you’ll have to use the Centera Replicator. If you’re thinking of standardizing on SRDF or RecoverPoint, sorry.
Those are just a few examples, but I think you get the idea – it’s hard to manage storage as a shared resource behind some of these environments. Maybe that's an issue for you, maybe not.
Now let’s shift gears and look at Oracle 10g RAC and Exchange 2007 – both of which are starting to get customers interested in scale-out approaches.
The same list of storage issues arises.
Hard to pool. Hard to tier. Hard to replicate centrally. Hard to manage centrally.
Surprise, it’s the same tradeoffs.
Same for clustered file systems, and data warehouse environments, and … well, you get the idea.
Your storage strategy just collided with your application strategy. Unless there’s a real compelling reason to go scale-out, you’ll probably stick with a rationalized storage environment, rather than landing something new, different and unconnected on the floor.
The Broader Picture
Another interesting vector to this discussion is information itself, and how the nature of information is changing in many environments. Much of the information growth seems to be in large, rich-content objects that may be infrequently accessed. Think images, documents, voice, etc.
I think scale-out architectures are good for this – right tool, right job (like Centera). And if this is a dominant part of your environment, maybe you’ll be fine with some of the restrictions above.
And, when we look farther out, there’s interesting potential to make storage devices themselves more scale-out in many of their attributes, hopefully bridging the gaps I’ve identified above.
Any way you look at it, scale-out concepts are here to stay. And we’ll see more variations on the theme in the future.
Are they right for you?
Well, it depends, doesn’t it?
