One of the most frequent questions I get from EMC customers in our briefing center is "how are we doing?".
They're curious to know how well they're measuring up in managing their growing storage and information management environment as compared to other IT shops.
Now, any sort of rigorous, scientific answer is beyond my scope. But what I can do is share with you some of the broad averages as I see them from my perspective.
And, yes, many of them think they've made real progress on taming their storage beasts.
The History
If we go back five years or so, the general state of affairs was not ideal, I'd offer. Capacity utilization was generally low in the industry. Storage management was difficult, to say the least. Costs were growing faster than value received -- a less-than-ideal situation for anyone who builds (or uses) technologies.
But a lot can change in five years, and I'd argue -- for many IT shops -- it has changed for the better.
The Standard Disclaimer
There's some built-in sampling bias here. For example, many of my impressions are gathered from people I've met in the EMC briefing center. That presumes that (a) you're an EMC customer or likely to be one soon, and (b) you're serious enough about this stuff to fly out to Hopkinton, MA, and visit with us. Now, to be fair, I get to talk to all sorts of companies, but there's a certain pre-selection that goes on before I usually get a chance to chat with them.
Just for the record, I don't get many of my impressions from industry publications or analysts. They try, but over the years I've noticed that their observations don't often square with what I see in my travels.
So let's get started ...
Are You Using Service Catalogs?
Most of storage and information management can be thought of as a service provided on behalf of an application, a user, or a management function.
For several years, the most advanced practitioners have been constructing service catalogs for different aspects of their operation.
At a conceptual level, a service catalog describes a list of services that vary along one or more axes, and have different explicit or implied costs associated with progressively better levels of service.
Whether or not IT actually charges back for the services delivered seems to matter less than understanding the true costs associated with different service levels, and making sure that consumers understand the cost differentials associated with their service consumption.
When it comes to storage service catalogs, the anecdotal number is a cost reduction of somewhere between 20% and 30%, simply from constructing a service catalog, creating cost-optimized tiers, and making sure that business users know what they're asking for in terms of relative costs.
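To make the idea concrete, here is a minimal sketch of a storage service catalog as a simple data structure. The three media types come from the discussion in this post, but the tier names, the relative cost figures and the capacity numbers are all made up for illustration; they are not measurements from any real shop.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ServiceLevel:
    name: str
    media: str
    cost_per_tb_month: float  # hypothetical relative cost units, not real prices


# Hypothetical catalog: names and costs are assumptions for illustration only.
CATALOG = {
    "flash": ServiceLevel("flash", "enterprise flash drives", 30.0),
    "mirrored_15k": ServiceLevel("mirrored_15k", "mirrored 15k 300GB drives", 10.0),
    "raid6_1tb": ServiceLevel("raid6_1tb", "terabyte drives with RAID 6", 2.0),
}


def monthly_cost(allocations):
    """Total relative monthly cost for a mapping of tier name -> terabytes."""
    return sum(CATALOG[tier].cost_per_tb_month * tb for tier, tb in allocations.items())


# Everything on one tier, versus the same capacity spread across two tiers.
before = {"mirrored_15k": 100}
after = {"mirrored_15k": 50, "raid6_1tb": 50}

saving = 1 - monthly_cost(after) / monthly_cost(before)
print(f"Relative saving from tiering: {saving:.0%}")
```

The point is not the specific numbers, but that once costs per tier are written down, both IT and the business can see what a given placement decision actually costs.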
The Base Catalog: Storage Performance and Availability
The first storage-related service catalog most IT organizations construct has to do with the storage media itself. Not surprisingly, different storage media offer widely different performance and availability characteristics. Terabyte drives with RAID 6 protection are very different than a mirrored pair of 15k 300GB drives -- or an enterprise flash drive -- for example.
Availability not only includes notions of media redundancy (e.g. RAID levels), but things like dual-pathing, redundancy of the array itself, and so on.
Typically, most organizations end up with 4-6 different storage service levels for performance and availability.
These sorts of basic service level catalogs became popular around 4 years ago. Today, EMC has done thousands and thousands of these for our customers, and I'm sure many more thousands have been done independently. But not everyone has taken this important step in their shops yet.
A side benefit of these catalogs is that, as new technologies become available (drives, flash, dedupe, etc.), these can be viewed as new alternatives for established service levels, e.g. a faster fast, a cheaper cheap, and so on.
More recently, we've seen another level of tiering done around data type, e.g. different tiers of block storage, different tiers of file storage, and in some cases different tiers of object storage. This can lead to intelligent discussions around FC vs. iSCSI, for example, or the need for Tier 1 NAS in some situations.
Additional Catalogs: Recoverability, Aging and Retention
On top of these base media catalogs, what people usually do next is construct additional services on top of the media.
Not surprisingly, recoverability is usually the next one up. The two key concepts here -- RPO (recovery point objective, or how much new information are you willing to lose in the event of a corruption?) and RTO (recovery time objective, or how long are you willing to wait to get your information back?) -- create huge tradeoffs in service levels and associated costs, perhaps more than the storage media discussion.
The difference between a once-a-week incremental that takes a day to restore and real-time replication to multiple sites can add a zero or more to the total cost of protecting a given amount of information. To make matters worse, most business users of information protection have no idea how expensive this stuff can be for IT to implement.
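One way to picture the tradeoff is as a lookup: given the RPO and RTO a business user says they need, find the cheapest protection level that satisfies both. The sketch below assumes a hypothetical three-level catalog; the level names, the hour values and the relative costs are illustrative assumptions, chosen only so that the top level costs roughly "a zero more" than the bottom one.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ProtectionLevel:
    name: str
    rpo_hours: float      # worst-case window of lost new information
    rto_hours: float      # worst-case wait to get information back
    relative_cost: float  # hypothetical cost multiplier


# Hypothetical protection catalog, for illustration only.
PROTECTION_CATALOG = [
    ProtectionLevel("weekly backup", rpo_hours=168, rto_hours=24, relative_cost=1),
    ProtectionLevel("nightly disk backup", rpo_hours=24, rto_hours=4, relative_cost=3),
    ProtectionLevel("real-time multi-site replication", rpo_hours=0, rto_hours=0.25, relative_cost=12),
]


def cheapest_level(max_rpo_hours: float, max_rto_hours: float) -> Optional[ProtectionLevel]:
    """Return the lowest-cost level meeting both objectives, or None if none does."""
    candidates = [
        level
        for level in PROTECTION_CATALOG
        if level.rpo_hours <= max_rpo_hours and level.rto_hours <= max_rto_hours
    ]
    return min(candidates, key=lambda level: level.relative_cost, default=None)


choice = cheapest_level(max_rpo_hours=24, max_rto_hours=8)
print(choice.name if choice else "no level in the catalog meets this request")
```

Asking for a tighter RPO or RTO pushes the request up the catalog, and the cost difference becomes visible to the person asking.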
A closely related concept is aging and retention. A three-week old email can support a much lower service level in terms of performance, availability, RTO/RPO etc. than one that's just an hour old. The same is generally true for user files, database transactions, and so on.
The result is usually two additional catalogs -- one that speaks to different protection levels in terms of RTO and RPO, and a second one that describes transitions between underlying service levels at different points of time, e.g. moving from primary to archival.
Finally, the hardest question to answer in this activity is "when can you delete the data?". It turns out this is perhaps one of the most emotional topics for business users to wrestle with, leading to a prevalence of approaches where information is kept indefinitely at a very low service level, with very long RTOs, to minimize costs.
But the result of this service catalog exercise is much the same as preceding activities: for protection, a nice grid that spells out the different service levels (RTO, RPO and sometimes performance impact during backup), as well as associated costs to the business.
And a second one that states the default policy for moving classes of information (usually downward) to progressively more cost-effective service levels.
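That second grid can be thought of as an age-based lookup. Here is a minimal sketch, assuming a hypothetical default policy with three service levels; the level names and the age thresholds (three weeks, one year) are assumptions for illustration, not a recommended policy.

```python
from datetime import timedelta

# Hypothetical default aging policy: (minimum age, service level),
# ordered from youngest to oldest. Thresholds are illustrative only.
AGING_POLICY = [
    (timedelta(days=0), "primary"),
    (timedelta(days=21), "secondary"),
    (timedelta(days=365), "archive"),
]


def level_for_age(age: timedelta) -> str:
    """Return the service level for information of the given age."""
    level = AGING_POLICY[0][1]
    for min_age, name in AGING_POLICY:
        # Keep the last level whose minimum age has been reached.
        if age >= min_age:
            level = name
    return level


print(level_for_age(timedelta(hours=1)))   # an hour-old email
print(level_for_age(timedelta(days=21)))   # a three-week-old email
```

A real policy would likely vary by class of information (email, user files, database transactions), which would mean one such table per class.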
Once again, as vendors come up with different ways of providing RTOs and RPOs (or moving information between service levels), they can be considered in the context of the catalog, e.g. data dedupe backup, CDP for replication, virtualization for moving files or blocks around, and so on.
The significant majority of customers I talk to have established either formal or informal service level catalogs in these three categories -- base performance/availability, recoverability and aging/retention.
But in the last year or so, we've seen the concept applied in new and potentially useful ways.
"Extra Credit" Service Catalogs
Since the concept of service catalogs has been around for a while (and has done well), there are some interesting ways to use the concept to wrestle with higher-order services.
Compliance -- especially regulatory compliance -- is one popular area.
Not all information in an IT environment is subject to compliance policies, so segregating that information out to create different "compliance service levels" is popular. More recently, we've seen new requirements for providing audit trails of who's seen what information, creating the incentive for more use of this type of service catalog.
The same goes for encryption for data at rest, whether on tape or disk -- not everything requires industrial strength encryption and key management, and -- as it's a cost driver -- it makes sense to expose those costs to the people who think they need this sort of thing.
I've also seen the concept of service catalogs being applied to storage management itself. The act of provisioning more storage, for example, is a delivered service, and can happen very quickly, or very slowly, as the case may be. Again, exposing true costs to the business can be useful if you find yourself with a backlog of provisioning requests.
Along the same line of thinking, some shops are treating requests for new server types, operating systems, databases, file systems, et al. as service requests, and constructing tiers of service responses for just how quickly they can incorporate these into their environments, and provide all the other associated storage and information management services.
Ditto for utilization reports, performance analysis, and all the administrivia. You want an answer right now? It's going to cost more than if you can wait a bit for it.
The Value Of Service Catalogs
Frankly speaking, I can't see how IT can run any other way than exposing true costs back to the business and forcing tradeoff decisions. There will never be enough IT budget (capex or opex) to make everyone happy.
Regardless of whether IT actually charges for the service or not, it forces two very important discussions. One is between IT and the business, about understanding true costs.
The other useful discussion is within IT itself -- they can now look for more cost-effective mechanisms (products and processes) to deliver popular service levels.
Most of the preceding conversation has been around the "how" that I've seen. But I also think it's worthwhile to spend a moment on the "what" as well.
Technology Deployment Patterns
Given EMC's broad product offering, it is rarely the case (at least, among the customers that come visit us) that a shop is using a single platform type, e.g. CX only, or DMX only. Ditto for storage networking protocols -- almost everyone is using FC and either NAS or iSCSI to some degree.
The same discussion applies to recovery -- they're generally using more than one backup product, more than one replication technology, and so on. Again, the profile here is larger shops that have broad requirements, and this might not apply to everyone.
Surprisingly, a large number of these shops consider themselves "single vendor shops". Sure, they'll have a bit of product from other vendors on the floor, but -- to the greatest extent possible -- they try and work with as few different vendors as possible.
Why? I think they've realized that, with every additional vendor, there's an additional cost associated. An additional set of product briefings to sit through. A different support structure to get to learn and know. An additional set of negotiations when it comes time to buy something.
Although there are those in the industry that argue vehemently for multi-vendor strategies (usually to get a better discount), I just don't see it in the circles I travel in. Rarely will someone describe themselves as "dual vendor" for arrays, SANs, etc. -- unless the purchasing department is also running the IT strategy :-)
Just about everyone is using some form of local or remote replication. Just about everyone is using some sort of MPIO (multi-path I/O) to improve performance and availability, usually PowerPath from EMC. Just about everyone has done some sort of archiving project around email and/or files. Just about everyone is doing some form of disk-based backup or archiving.
And just about everyone is using an SRM package (usually ControlCenter) to manage their storage environment. We don't see too many folks who try to do it entirely with scripts these days.
The Result
Given the emerging level of storage and information management proficiency in many shops, what do we end up talking about?
For me, it usually is "what's new" and "what's next".
The "what's new" discussion is around a specific technology or approach that delivers an optimized service level in a significantly new or better way.
Recent examples from the EMC portfolio would include things like Avamar, which is optimized backup for VMware environments. Or RecoverPoint for integrated local and remote CDP. Or enterprise flash drives for the ultimate in storage service levels. Or perhaps MPFS for high performance NAS.
The fact that there's an explicit (or implied) service level catalog makes these discussions very focused, narrow and hence productive. And we avoid talking about things that aren't interesting to our customers.
This leaves more time for the "what's next" discussion, which generally boils down to new services that IT might have to provide in the future -- sometimes the very near future.
Examples include DLP -- data leakage prevention -- the need to scan data either in flight or at rest -- and figure out if you've got a problem.
Or it's a discussion around how widespread VMware deployment will create a case for infrastructure and processes that are optimized for virtualization, rather than simply adapted.
Or, in some cases, it's a discussion around providing secure access to the growing cadre of contractors or vendors that need access to sensitive bits of your infrastructure to do their work -- and giving them this access easily and effectively.
Or maybe a discussion around how some information management services are becoming attractive candidates for an external service provider.
Or, perhaps, how the IT group is being inevitably drawn into their new role around information governance.
A Final Note
On a personal note, it's been fun to see the discussion evolve from technology to products to methodologies to new roles for IT.
And I think taming the storage beast is a key step along the way.
How are you doing on taming your storage beast?
