The recruiter said that EMC was expanding from their mainframe storage market into UNIX (he actually pronounced it "ooo-nix") and they were looking for people who knew the landscape.
After I got done laughing, I helped him out a bit :)
We went back and forth for a while on the phone, but he said one thing that still sticks with me to this day. "The reason you want to work for a storage vendor", he said, "is one simple reason: nobody ever wants to delete their information."
I thought about that for a moment, and realized that -- compared to other IT infrastructure disciplines like compute, software and networking -- he had a very valid point.
That was true back in 1994, and it's certainly true today.
While the message hasn't changed much, the motivations certainly have …
Before We Begin
Occasionally she'll land on "Storage Wars", a reality-type show about people who make a living bidding on abandoned storage units (largely sight unseen), and then hoping to profit from the contents.
Sometimes it's just junk in that storage unit. Sometimes it's incredibly valuable. You just don't know until you go looking through it in detail -- and that's the plot driver for the show.
In the digital world, it's the stuff that business people want to store that lands in IT's "storage units". Application data. Archives. File shares. Projects that were abandoned.
You name it, and it ends up in a storage array somewhere -- taking up space and resources.
Is it valuable? Is it trash? Hard to tell until you go looking ...
Back To Our IT World
IT groups have been complaining about having to store everyone's digital junk for as long as I've been in this business. It's understandable: stuff is thrown over the wall, and there's rarely enough money in the budget to pay for it all.
Not only that, it rubs IT people the wrong way: they strongly suspect that much of what they've been asked to store is probably of little value, and it just irks them that money is being wasted on a low-value pursuit when there are more important pursuits at hand.
Over the years, I've rhetorically asked the question "why don't you just delete some of it?" just to see how they react. And the answers have noticeably shifted over time.
The Early Years -- It's Too Hard
I'm going back a long ways here, back when enterprise-class storage arrays were painfully expensive. Like a dollar-per-usable-megabyte expensive. No, I'm not making that up.
So you'd think there would be a clear motivation to clean the closet and get rid of the junk no one was using. But many people told me that was an impossible mission.
First, it was very hard for some person buried in IT to go track down the provisional "owner" of the information. Even if you did, the person who generated the information might not be the only consumer of it, so more sleuthing was required.
Finally, you'd have to get everyone to agree that -- yes -- this data could be deleted, or at least moved to write-once-read-never (WORN) media like tape.
Or give up trying.
Add up all the time and effort required, and many quickly realized that this was a fool's errand, and -- yes -- it might be more cost-effective to just suck it up and buy more capacity.
All of us storage vendors were not displeased by this state of affairs.
A few brave souls experimented with auto-delete (or auto-archive) functions, but every application and user group was different -- it was very hard to implement a standardized function like this unless you happened to be on a mainframe or something similar.
Get your policy wrong, and you'll end up with a rebellion on your hands.
EMC and others seriously chased this idea many years ago under the name information lifecycle management (ILM), but couldn't make meaningful progress except for isolated use cases such as content management, email or e-discovery.
The Mid Years -- We're Being Compliant
The advent of Sarbanes-Oxley and similar information compliance regulations created a minor boomlet in the storage business. All of a sudden, we had CFOs and legal departments insisting that large amounts of data be kept around for a very long time, often on a distinct storage platform with stringent access controls.
Some organizations did a good job defining and implementing policies. Others simply erred on the side of caution, and tended to keep anything and everything that might potentially be considered a 'business record' in a compliance context.
Including cafeteria menus, according to one customer.
Some organizations saw this as a rare opportunity in disguise, and jumped at it.
If data didn't need to be kept around for compliance purposes, it should therefore be deleted (not archived) at its earliest opportunity: old emails, files that hadn't been accessed for a while, etc. Keeping stuff around that wasn't required was seen as a potential business risk by some.
And there were more than a few IT groups who were delighted at the prospect of deleting terabytes of low-value data, and being able to say with a straight face "the CFO told us to do it" or similar.
I think most organizations have a good handle on that issue now, with a lot less craziness than we saw years ago.
The New Era -- It Just Might Be Valuable
So, let me give you an insight into what happens when your business gets the big data analytics bug. This is not theory -- I'm seeing a version of this right now inside of EMC.
Business process owners start experimenting with big data analytics. More diverse data == better predictive models, so the hunt is on for different internal data sources.
The more advanced practitioners end up crawling through enterprise cyberspace, searching for repositories of neglected data that just might be useful in their new context.
All of a sudden, stuff that you thought was junk can be quite valuable. Or not, depending on the situation. You just don't know until you go looking.
Old emails. Years of status reports. PowerPoints. Log files and service tickets. Video. Etc. etc.
Here's the rub: once that data is deleted (or moved to near-impossible-to-access media), it's essentially out of the picture. Any potential value is gone. And since you've got multiple groups searching at multiple points in time, the IT team is put in a difficult position.
If IT decides to delete this information (or deep-archive it), they are unilaterally making a business judgment call on the potential future value of that information. To make matters more interesting, there's no way to easily predict the future value of a given data set, nor is there usually any funding available for this very speculative view of future potential value.
So what to do? No clear or simple answer I'm aware of.
But I do see far less aggressiveness on the part of IT groups to ruthlessly delete information. I think there's a growing awareness that IT is ultimately the custodian (and guardian!) of an organization's information wealth.
And -- even if they're not funded -- IT pros usually feel obligated to use good judgment and their own personal expertise to make wise choices.
Even if no one wants to delete anything -- ever.