I work in the information technology business, and I work in marketing, so it's no surprise that I've become an armchair analyst of many different marketing styles.
It's also no surprise that I often find plenty to fault in how the good people at NetApp go about their daily activities of building, marketing, selling and supporting their products.
It's more than picking at a few nits: it's a fundamental style issue for me. There are good ways to do things, and not-so-good ways to do things.
And, once again, I'm sorry to say that I have yet another example of what not to do.
The Background
A few days ago, NetApp proudly announced that their V-Series (kind of a gateway for existing storage, similar to IBM's SVC) could attach to -- and deduplicate -- information stored on EMC storage products, specifically Symmetrix and CLARiiON.
Much hooting and hollering accompanied this festive event, including giveaways of t-shirts and mugs, as well as all sorts of crowing in the blogosphere.
Good for them.
Now let's move on, and take a more serious view. Imagine I'm an EMC storage customer, and I'm curious about this latest offer from NetApp.
Sounds interesting, doesn't it? But -- if I were one of these customers, wouldn't I have a few questions?
So, play along with me. Let's say that the good people from NetApp have come calling to share the great news of their latest offering.
What questions would be asked, and what might be the answers?
Question #1 -- What's Involved In Installing One Of These?
Probable answer: significant heavy lifting.
You're introducing a new pair of devices (remember, most EMC products are used in HA environments) into the data path. You're going to have to get very intimate with qualification matrices, and probably do some work upgrading/downgrading HBAs, drivers, etc. as well.
Many new connections are configured. Servers will go up and down, maybe multiple times.
You'll have to figure out a way to copy your data from where it is to where it'll be, and -- if your disk array is pretty full (a likely assumption) -- you'll probably need extra "swing" capacity to make the move happen.
I must be misunderstanding this.
Now, since I'm often accused of not having a clue on how NetApp's products *really* work, can someone please enlighten me on the step-by-step procedure for bringing over a single, HA-configured server to this environment, if extra storage is required, and how many hours this might take per server?
Let's assume something like, say, 500GB of production data. Are we talking something like 20 hours per server?
More?
Question #2 -- Does My EMC Software Work With This?
If I'm an EMC customer using a Symmetrix or CLARiiON, I'm probably using a bunch of EMC's software for multipathing, HA, storage reporting, business continuity, local cloning, etc. The software is bought, paid for, and now an integral part of my operation.
Does any of it still work?
The frank answer: I don't think anyone really knows, since no one has probably tested any of this. I know EMC hasn't; I doubt that NetApp has.
Maybe they're hoping customers will do this for them.
The product list would include software like PowerPath (host-based multipathing, load-balancing and more recently encryption), SRM software like ControlCenter, replication software like TimeFinder and SRDF for remote replication, and perhaps a bunch more.
There aren't too many of our storage platforms out there that don't use several of these products. I can't find any explanation suggesting that any of it stands a chance of working well in this new environment.
So, my apologies in advance if I'm wrong about this, good people at NetApp, but is any of the EMC software that customers have paid for, and are probably using today, gonna work?
Question #3 -- Does EMC Support This?
Many EMC customers are used to the one call, one-throat-to-choke, end-to-end enterprise support model. They tell me it's a nice arrangement.
Does this arrangement continue under these circumstances?
The short answer is "no", not the way you're used to.
EMC hasn't qualified or tested any of this, so if someone calls us with a nasty problem on this stuff (or anything like it), the only reasonable thing we can do is offer to try and help -- best efforts only.
EMC's eLab publishes the de-facto qualification matrix in the industry -- if it's on the support matrix, we support it, and own any problems that come up.
If it's not on our matrix, we try to do our best to help -- but no guarantees. Sure, we'll support our storage platforms. But we won't be able to help you on the end-to-end stuff: there's a device that we've never seen somewhere in the middle.
You can always call NetApp, of course. I'm guessing they know they need to make some expensive investments to support these sorts of environments at an enterprise level.
I'd expect to see them own at least one Symmetrix from each major vintage that's out there in the field: Symm 4, Symm 5, DMX, DMX-2, DMX-3 as well as the latest DMX-4. As well as a similar list for CLARiiON.
As well as some fairly hefty configurations of older and newer Brocade / McData / Cisco switch fabrics. A nice server farm to go with it. And a nice posse of technical types to run it.
So, would someone at NetApp care to share the precise support investments and processes they're offering in this multi-vendor environment?
Maybe just an inventory of the specific equipment they've got in place?
Question #4 -- Are There Any Performance Impacts?
Not surprisingly, most EMC storage products get used in environments where performance matters -- even older ones.
I would expect that if there were any potential degradation in performance or response times through the use of this approach, customers would want to know so they can evaluate the potential impact.
Maybe it ends up mattering, maybe it doesn't -- but there ought to be some statement from NetApp in terms of (1) performance impacts of introducing a V-Series server-based appliance in the data path of an enterprise array, (2) the impacts of a file system on throughput and response times, and -- of course (3) any overheads associated with data deduplication itself.
I'm sure NetApp did extensive performance testing about the before-and-after impacts of their proposed solution, and they just forgot to share.
Looking forward to seeing this soon, folks ...
Question #5 -- Do You Have A Process or Tool For Measuring The Potential Savings?
We all know that the benefits of data deduplication are highly variable depending on the nature of the data, change rates, etc. And we also know that people tend not to use higher-end Symmetrix platforms as dumping grounds for user files.
If I were a customer, I'd want to get some handle on the amount of data reduction I could expect in *my* environment with *my* data. I don't think slick PowerPoint "estimates" will do the trick here.
I'm sure that -- as part of this announcement -- NetApp will be glad to provide a simple tool or service to help customers understand this better, much the way EMC does when we propose new solutions.
Yes?
Question #6 -- And, Ultimately, What's The ROI?
We don't see a lot of IT experimentation in today's economic environment. Just about every IT investment has to be backed by hard numbers showing the benefits.
And I'm sure that the good people at NetApp realize this as well.
Now, proposing a pair of these V-Series thingies in front of a Symmetrix or other EMC platform is an expensive and somewhat complex proposition. You've got the cost of the NetApp kit, plus an extensive amount of labor and effort, not to mention server disruptions, migrating all that data, and so on. Not an afternoon project, if you get my drift.
And, of course, you've got to be comfortable with the various risks associated with this approach.
Let's ballpark it at -- say -- a couple of hundred thousand dollars -- all in. Sure, it might be more, but let's say you can get some great discounts and persuade the helpful people at NetApp to do most of the heavy lifting gratis.
Now, how much "excess storage" are you going to have to "save" to make this an attractive project?
50 TB? Probably too small. 400-600 TB? Maybe we're getting in the range. And if they could tell you they could save you a PB or more, well, I'd be listening as well ...
But I don't know, so I'll frame this in the form of a question: how much storage capacity do I have to save through deduplication before this becomes a proposition that's significantly ROI positive?
Anyone from NetApp done the math on that one?
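For anyone who wants to play with the numbers themselves, here's a rough back-of-the-envelope sketch in Python. The $200,000 project cost is just my ballpark from above; the cost-per-TB figures are purely hypothetical placeholders (not real pricing from EMC, NetApp or anyone else), and the function name is made up for illustration. It also ignores ongoing costs, risk, and any performance impact, so treat it as a starting point, not an answer.

```python
def breakeven_tb(project_cost_usd: float, cost_per_tb_usd: float) -> float:
    """Return the TB of capacity that must be freed so that the avoided
    storage spend equals the all-in project cost."""
    if cost_per_tb_usd <= 0:
        raise ValueError("cost_per_tb_usd must be positive")
    return project_cost_usd / cost_per_tb_usd


# Ballpark all-in project cost taken from the post.
PROJECT_COST_USD = 200_000

# Hypothetical fully-loaded costs per TB of array capacity (assumptions only).
HYPOTHETICAL_COST_PER_TB_USD = [500, 1_000, 5_000, 10_000]

for cost_per_tb in HYPOTHETICAL_COST_PER_TB_USD:
    tb_needed = breakeven_tb(PROJECT_COST_USD, cost_per_tb)
    print(f"At ${cost_per_tb:,}/TB, break-even is {tb_needed:,.0f} TB saved")
```

Plug in whatever your capacity really costs you per TB; the cheaper the capacity being "saved", the more of it you have to free up before the project pays for itself.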
Now, There Will Be A Few People ...
... who don't need answers to any of these questions.
There are many thousands and thousands of Symmetrix units out there, with ages ranging from brand-new to over a decade. Some vendors have even had a bit of fun by videoing really old ones being sent to the crusher.
And, I'm sure that at some point, someone somewhere might see a bit of value in taking an old bit of storage left over after the latest migration, and perhaps pressing it into a new role. Nothing wrong with that.
But I don't think NetApp intended their announcement as the ideal answer to what you might do with your 7-year old Symmetrix or CLARiiON array that you've just migrated off of.
The Bottom Line?
I strongly suspect that there aren't good answers to *any* of these questions. But, with all due respect, I'm waiting for my good friends at NetApp to once again dispel my incorrect perceptions.
Please -- set me straight. I'm all confused again.
And, if you don't want to answer me, that's fine. Consider this a preview of what you'll eventually be asked by customers and prospects. Maybe even a few smart industry analysts and reporters.
Here's where I get really irked: this smells and feels like another marketing stunt from the good folks at NetApp. High entertainment value; low business value.
They either don't know about these issues (not a pretty picture), or they just don't care (not a pretty picture either).
I'm also guessing that they really aren't planning to sell much of this product -- which raises the question of why they're doing it at all.
I'm sure this sort of marketing stuff makes the troops at NetApp feel good about themselves. Score one for employee morale. And, yes, I'll give them points for cleverness.
But score no points for usefulness, at least with this one.

First let me state that my employer is a customer of both EMC and Netapp, along with HDS and IBM. I have no affiliation with any vendor, in any way.
Having said that, we are probably not representative of the particular type of company that this post is aimed at - companies who might be considering moving off direct attached SAN storage to Netapp V-series NAS heads.
We have already done that, and it was an incredibly beneficial thing. The company's entire Windows file serving environment was moved from a SAN attached Windows clustered file server to a clustered Netapp V-series. The back-end is IBM DS8100, but that's neither here nor there really.
Other than CIFS file serving, the rest of our critical apps are still FC SAN, and we are not considering changing that.
I'm not here to hype Netapp's gear, but I just wanted to respond to some of your points, from an "already sold" customer's perspective :
"Question #1 -- What's Involved In Installing One Of These?"
Yes, it was a long, drawn out process moving from the Windows file server cluster to the Netapp units, but it was definitely worth the effort.
"Question #2 -- Does My EMC Software Work With This?"
I don't know. I can't see how it's relevant.
"Question #3 -- Does EMC Support This?"
Again, not really relevant in my case. I have Netapp NAS heads in front of IBM disk. But IBM certainly support their disk system, and Netapp support their NAS heads. It's all working out pretty well so far. The NAS head (V-series) is really just another server, as far as I'm concerned, and I would expect my disk vendor to see it the same way. If EMC won't support their disk just because there's a Netapp box accessing it, then... yeah, that's a problem. The question should probably be "Does Netapp support the V-series attaching to EMC disk", and I'm pretty sure they do.
"Question #4 -- Are There Any Performance Impacts?"
Of course. They are fairly frank about it in their FAQ ...
8. IS NETAPP DEDUPLICATION SUITABLE FOR PRIMARY STORAGE?
Yes for “light duty” primary applications. What we mean by light-duty primary storage are volumes that contain primary (1st copy) data, but are not performance-driven. Some examples of this would be VMware VM’s, home directories, document directories, and application volumes that experience heavy I/O loads during the day but are quiescent at night and on weekends. These volumes might very well benefit from deduplication if the system has the performance headroom to support the additional overhead imposed by deduplication.
10. IS THERE ANY WRITE PERFORMANCE OVERHEAD AFTER ENABLING DEDUPLICATION ON A VOLUME?
As data is stored on a deduplication-enabled volume, digital fingerprint files are also stored. Less than 10% write performance overhead is required for this process.
11. IS THERE ANY READ PERFORMANCE OVERHEAD AFTER A VOLUME IS DEDUPLICATED?
When data is read from a deduplication-enabled volume, the read performance penalty will vary depending upon the original vs. deduplicated block layout. Unless the data has been written sequentially, the read impact would be minimal. However, if an application depends upon fast read performance i.e. sequential block recording, deduplication’s impact on read performance should be carefully considered before implementation.
12. CAN THE SYSTEM PERFORM OTHER OPERATIONS WHILE DEDUPLICATION IS RUNNING?
NetApp deduplication runs as a background process and the system can perform any other operation during this process.
"I'm sure NetApp did extensive performance testing about the before-and-after impacts of their proposed solution, and they just forgot to share. Looking forward to seeing this soon, folks ..."
Well, the points above from their FAQ are much more than I usually see from most other storage vendors.
"Question #5 -- Do You Have A Process or Tool For Measuring The Potential Savings?"
Again, from the FAQ :
19. CAN I ESTIMATE MY SPACE SAVINGS BEFORE INSTALLING DEDUPLICATION?
Yes. A space savings estimation tool (SSET) is available to NetApp and Partner SE’s.
"Question #6 -- And, Ultimately, What's The ROI?"
Well, for a company like the one I work for that is already using V-series, it is almost a no-brainer. It is apparently a no cost license key I have to type in once, then I simply set up a schedule to do the deduplication overnight. If it reduces our NAS usage of raw, tier 1, FC attached enterprise disk from 11TB to 8 or 9TB, it is most certainly worth it.
Would I migrate from some other file serving environment to Netapp just for this feature? Probably not - it would depend on the circumstances. But, being someone who has already done that migration, I see this as a VERY good thing. I will just wait until it's been in the field for a little while before implementing.
One of the major pain points with Netapp is the overhead required to support their filesystem and the features that go along with it (like snapshots). We have over 11TB of raw disk attached to the NAS heads, but there is only about 8.5TB of usable storage. This deduplication feature can potentially offset some of that pain.
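A quick back-of-the-envelope check of those figures, as a small Python sketch. It uses only the numbers quoted in this comment (11TB raw, about 8.5TB usable, and a hoped-for reduction to 8 or 9TB); the function names are made up for illustration, and the reduction figures are hopes, not measured results.

```python
def overhead_fraction(raw_tb: float, usable_tb: float) -> float:
    """Fraction of raw capacity consumed by filesystem and feature overhead."""
    return (raw_tb - usable_tb) / raw_tb


def reduction_fraction(before_tb: float, after_tb: float) -> float:
    """Fraction of capacity freed when usage drops from before_tb to after_tb."""
    return (before_tb - after_tb) / before_tb


RAW_TB = 11.0     # raw disk attached to the NAS heads
USABLE_TB = 8.5   # approximate usable storage

print(f"Overhead: {overhead_fraction(RAW_TB, USABLE_TB):.1%} of raw capacity")

# Hoped-for usage after deduplication (not measured results).
for target_tb in (9.0, 8.0):
    saved = reduction_fraction(RAW_TB, target_tb)
    print(f"11TB -> {target_tb:.0f}TB would free {saved:.1%}")
```

In other words, roughly 23% of the raw capacity goes to overhead, and a dedupe result in the 8 to 9TB range would free somewhere between about 18% and 27%, which is the same ballpark.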
By the way, the FAQ is here. http://communities.netapp.com/servlet/JiveServlet/downloadBody/1060-102-3-1327/FAQ%20NetApp%20Deduplication%2006_11_08.pdf
I think you need an account with the Netapp support site to access it.
Regards
Dean
Posted by: Dean | August 02, 2008 at 03:40 AM
Like Chuck, I've been accused of Not Having a Clue about NetApp, so please - educate me as well...
How many of the V-Series gateways will be needed to front-end a 100TB usable capacity Symmetrix DMX-4 serving 10,000 LUNs to 2500 hosts over 48 4Gb Fibre Channel ports (dual-pathed as Chuck described) for 100% database applications (a mixture of Oracle, Exchange, SQL Server, etc.) with typical 90-95% capacity allocation?
And if these are all using another 30TB usable for TimeFinder/Snap for point-in-time snapshots (which are used for nightly backups to a separate de-duped disk library), how much more or less capacity will be required AFTER the introduction of these V-Series without loss of performance or usable capacity?
Enquiring minds wanna know!
Posted by: the storage anarchist | August 02, 2008 at 07:41 AM
Hi Dean, thanks for your comments -- they're very helpful.
The use case you describe is neither the use case I describe, nor the one promoted by NetApp. It appears you used a V-Series filer to replace a Windows filershare (e.g. NAS).
NetApp's announcement was around block storage applications (e.g. SAN). And they're very different. As you point out, your critical apps are on the FC SAN, and you're not contemplating changing that.
Thanks for attempting to answer the questions, although it was for a very different use case. A few thoughts:
Question #1 -- How Much Effort -- you shared that it was a very lengthy process. Glad it was worth the effort for the NAS move.
Question #2 -- Does My EMC Software Work -- obviously, there were no EMC products involved in your NAS environment, so of course it's an irrelevant question. But if you were using any of the software that is often used with the DS8100, then you'd have a similar problem. Obviously, this was not the case.
Question #3 -- Does EMC Support This -- Again, the support model you describe might be acceptable for a NAS model; after all, it's just a server, as you describe. But I also think it's fair to point out that many customers would find issues with this support model were they considering a SAN use case.
Question #4 -- Are There Any Performance Impacts -- thanks for sharing the NetApp guidance, and I agree, it's better than nothing. However, almost none of it would apply to a SAN use case, and some of the advice appears either incorrect, or insufficient: e.g. there are some VM images that do very heavy work.
Question #5 -- How Much Capacity Will I Save -- didn't know that NetApp had a tool -- I wonder how well it works on block-oriented data such as Oracle databases, Exchange environments et al. It'd be interesting to find out.
Question #6 -- Is It Worth It -- OK, I agree, if the V-Series is "free", already installed, and all I have to do is type a license key, and there weren't any performance problems -- you're absolutely right, it's a no brainer. But that's not what the NetApp guys were offering, right?
And going from 11TB to 8 or 9 is nice, but wouldn't have paid for the devices, the effort, etc. -- kind of illustrating my point. And I'd be very curious if there was another way of getting that reduction, e.g. asking people to clean up a bit.
The overhead of OnTap and WAFL is legendary. For some reason, the NetApp guys don't like to talk about it much. But you're right, it's part of the equation, isn't it?
Thank you so much for commenting and sharing!
Posted by: Chuck Hollis | August 02, 2008 at 10:24 AM
Hey Chuck,
I could be mistaken but I don't think Netapp dedupe works with Netapp SAN.
The LUNs presented by the Sym, DMX or CX to a V Series are written with Netapp's NAS file system. LUNs in the Netapp world are really just files on top of that.
(You can prove this out by putting a CIFS share on the root of a Netapp volume. You'll see the .LUN file and you can write a "DEAR_GRANDMA.DOC" file right next to it in the file system)
Anyway, I digress.
This LUN as a file means that dedupe occurs at the NAS level and not the SAN level for Netapp.
With all of the performance limitations associated with that.
Posted by: Robert D | August 04, 2008 at 12:08 PM
I think the use case Dean is talking about is more typical. I don't think anyone is seriously talking about fronting DMX-4 flash drives with a netapp v-series ;)
More here:
http://searchstorage.techtarget.com/news/article/0,289142,sid5_gci1323253,00.html?track=sy60
"While it's still not likely to be used with a new 3PAR InServ or EMC Symmetrix, "If you have a lot of other legacy storage systems or more conventional storage systems, you can use V-Series with dedupe to repurpose them," said Forrester Research analyst Stephanie Balaouras.
"This won't be a flagship piece of our portfolio -- just another option for customers," Cummings admitted. "
Posted by: Scott Harney | August 04, 2008 at 01:35 PM
Very much agreed, Scott.
And, as I pointed out in the post, it does make sense -- sure -- if someone has some older tin that needs a new life.
But that's not how it was presented by them, and there's my beef in a nutshell!
Posted by: Chuck Hollis | August 04, 2008 at 01:55 PM
Hi Robert
I reviewed their announcement materials many times, and the implication was clear -- they positioned it as dedupe for SAN with products such as Symmetrix and CLARiiON, which are -- obviously -- SAN products.
Now, whether they actually intended people to take them at their word is debatable, as is whether it'd actually work reasonably enough, and so on.
We shouldn't have to be second-guessing vendors' announcements to this degree. Struck me as immature, it did.
Posted by: Chuck Hollis | August 04, 2008 at 02:03 PM
I don't quite understand the fascination with LUNs on filesystems and vice-versa over here. I'm more of an empirical guy.
In my own personal experience, I've seen V-Series front-ending several back-end arrays with increased performance. It seems their optimized virtual block layout and integrated read/write cache make the most out of disk-limited configurations by involving every spindle in all the aggregated I/Os tossed at it.
Add in thin provisioning, fast snaps and optional iSCSI or NAS gateway functionality and I can see the attraction. Heck, dedup is just gravy at this point.
Posted by: Jonathan | August 04, 2008 at 05:45 PM
The performance effect you're seeing basically results from spreading the load over available spindles, which works well in some situations.
This is also done by many volume managers (many free!), certain file systems, within the array itself, etc. -- nothing unique to the V-Series.
Reinforces my suspicions on how this product is marketed and sold ... thanks!
Posted by: Chuck Hollis | August 06, 2008 at 07:38 AM
I've been scanning the blogs at NetApp, hoping for some sort of answers to some of the questions posed here.
Nothing so far, really. Not surprising.
However, I did see this scathing missive from Val at NetApp, proving -- once again -- blogging is best done when sober.
http://blogs.netapp.com/exposed/2008/08/before-flattery.html
Val -- I know you're trying to help NetApp's cause here, but I don't think this sort of thing is what the corporate PR types are looking for ...
Posted by: Chuck Hollis | August 06, 2008 at 09:33 AM
Thanks for the cross-link Chuck!
I’m quite pleased this “scathing missive” didn't disappoint. When you swing the rhetorical pendulum that far, don’t be surprised when it swings away as far in search of facts so that it can obtain a natural balance.
As for corp PR folks, what did yours say when Curtis Preston had to recently correct your excessive hyperbole twice regarding your creative interpretation of EMC’s primary dedupe offerings? By EMC’s standards, I’m sure no ethical lines were crossed there either!
Finally, I guess you could say I subscribe to the Dr. Johnny Fever school of blogging productivity – better when pleasantly inebriated :)
Posted by: Val Bercovici | August 06, 2008 at 05:31 PM
I apologize to all my readers: it appears that "Jonathan" didn't really want to offer up a real email address, nor a real website address. I usually catch that sort of stuff before it makes it here ... again, my apologies.
Posted by: Chuck Hollis | August 07, 2008 at 12:19 AM
My, you meet all sorts of people on the internet, don't you?
Posted by: Chuck Hollis | August 07, 2008 at 12:26 AM
In some of my reading I came across this. Someone forgot to point out that IBM sells NetApp appliances, so I'm sure there is support there.
As for "virtualization" appliances, there are many, and every one I've seen introduces latency and a performance overhead; physics is at work here. However, many people are willing to take a performance hit for increased functionality or some other trade-off.
I cannot believe that deduplication will be enough of a reason for making a V-Filer implementation on top of EMC or any other storage.
As for performance? Test it! Take the same FC LUN from a DMX and run a benchmark, take the same volume presented through a V-Filer and then presented out as a FC LUN and I'd love to know the difference. If anyone reads this and does it please report back!
Posted by: Steven Schwartz - The SAN Technologist | October 23, 2008 at 04:30 PM