Today IDC and EMC released their annual study on the size, shape and structure of the "digital universe": the total amount of information we're collectively generating, storing and using.
Titled "A Digital Universe Decade: Are You Ready?", it goes beyond the usual really-big-number type of forecasts to provoke serious discussion on a number of topics.
And if you're a big-picture type of person, you'll want to seriously contemplate some of these findings ...
To Begin With ...
This is the 4th annual study by IDC and EMC. Each year, the numbers get bigger. Each year, the implications get more interesting. Each year, more people get intrigued by what's going on here.
There's a nice deck that was created as part of this. I'll put the thumbnails here along with my thoughts, and offer you a link to both the PPT and the associated IDC commentary at the end of this post.
Might be useful to use a few of these slides the next time you're going for budget approval :-)
The "ticker" is a useful device to put in front of people during a presentation -- after a few seconds of watching the numbers spin frantically, people usually get the message.
What I found interesting is that -- even in a year where the financial economy struggled -- the information economy never really slowed down.
In a year of declining GDPs, we generated an additional 62% of information to add to our existing treasure trove.
So, if that's what happens in a bad economic year, what happens in a good economic year?
And people wonder why I decided to work for a storage company way back when :-)
A zettabyte is a trillion gigabytes.
The press materials try to translate this number into something people can easily comprehend, but it's a hard task.
One comparison was "imagine 75 billion 16 GB iPads". I don't know about you, but that's hard to imagine as well.
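For the curious, the iPad comparison is easy to sanity-check. Here's a minimal sketch, assuming decimal units (1 GB = 10^9 bytes, 1 ZB = 10^21 bytes); the variable names are mine, not the study's:

```python
# Sanity check of the "75 billion 16 GB iPads" comparison.
# Assumption: decimal (SI) units, i.e. 1 GB = 10**9 bytes and 1 ZB = 10**21 bytes.
GB = 10**9
ZB = 10**21

gb_per_zb = ZB // GB                # gigabytes in one zettabyte
ipad_count = 75 * 10**9             # 75 billion devices
ipad_capacity_bytes = 16 * GB       # 16 GB each

total_zb = ipad_count * ipad_capacity_bytes / ZB

print(f"1 ZB = {gb_per_zb:,} GB")
print(f"75 billion 16 GB iPads hold about {total_zb:.1f} ZB")
```

So the pile of iPads works out to a bit more than one zettabyte, which doesn't make it any easier to picture.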
Note the "44x" growth factor between now and 2020. For those of you who are thinking "well, that's really far in the future", maybe it is, and maybe it isn't.
I can remember clearly what I was doing in 2000 and 2001. It's now 2010. Not hard to see how quickly 2020 will be upon us.
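If ten years still sounds far off, it may help to turn the 44x into a yearly rate. This is a rough sketch that assumes smooth compound growth over ten yearly periods (2010 to 2020), which the real curve almost certainly won't follow exactly:

```python
import math

growth_factor = 44   # from the study: 44x between now and 2020
years = 10           # assumption: ten compounding periods, 2010 to 2020

# Compound annual growth rate implied by the overall factor.
annual_rate = growth_factor ** (1 / years) - 1

# How long it takes to double at that rate.
doubling_years = math.log(2) / math.log(1 + annual_rate)

print(f"Implied annual growth: {annual_rate:.0%}")
print(f"Implied doubling time: {doubling_years:.1f} years")
```

Under those assumptions the digital universe grows by roughly 46% a year and doubles in a bit under two years.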
IDC predicts that more than a third of all information that's created, copied and used will either be stored in the cloud, or pass through the cloud in some form.
When I share this finding with people, most see it as conservative. I guess it depends on your definition of cloud, doesn't it?
But the conclusion is clear -- there's going to be a lot of information that doesn't live in traditional locations going forward.
We'll come back to the cloud angle in a bit -- because one could make a case that several other forces may cause this to happen sooner rather than later.
I found this slide perhaps the most interesting one.
In the middle, we've got the blue line -- showing 44x growth. Obviously, it won't be as linear as shown here, but you get the idea.
Now, look at the green line. That's the growth in the number of "information containers" -- files, objects, messages, etc.
That's predicted to grow 67x. All sorts of implications result from this.
First, we're starting to generate shorter "information packets" -- think tweets, smartgrid data, GPS coordinates, RFID messages and the like. We'll still undoubtedly have plenty of the big stuff around (video and other digital signals), but the 67x growth in information objects points to the need for newer ways of organizing, managing, protecting, finding and sharing all these information objects.
The number of objects will be measured in the quintillions (that is, beyond billions, trillions and quadrillions). Now, consider how we're doing this today: file systems, databases, etc. See the challenge?
Now, if you really want to have fun, go look at the bottom red line. That's the forecast that we'll only have 1.4x as many IT professionals available over this period of time. The implication of the study is that this will be supply constrained, rather than demand constrained.
If you're a career IT professional, you'll either see this as a terrible crisis, or a wonderful opportunity :-)
In my mind, this imbalance will likely create strong demand for specialized external service providers (think "cloud") that will perform information management services on behalf of other organizations.
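To put rough numbers on that imbalance, here's a quick back-of-the-envelope sketch using only the three growth factors from the slide. The "per professional" framing is my own simplification; it assumes the workload spreads evenly across the available people:

```python
# Growth factors over the decade, as read off the slide.
info_growth = 44         # total information (blue line)
container_growth = 67    # files, objects, messages, etc. (green line)
staff_growth = 1.4       # IT professionals (red line)

# Growth in workload per IT professional, assuming an even spread.
info_per_pro = info_growth / staff_growth
containers_per_pro = container_growth / staff_growth

# If containers grow faster than information, the average container shrinks.
avg_container_size = info_growth / container_growth

print(f"Information per IT professional: about {info_per_pro:.0f}x")
print(f"Containers per IT professional:  about {containers_per_pro:.0f}x")
print(f"Average container size: about {avg_container_size:.0%} of today's")
```

In other words, each IT professional would be looking after something like 30x the information and nearly 50x the objects, and the average object would be about two-thirds the size it is today.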
This chart tries to illustrate the problem. Years ago, we used to think in terms of "user generated" information and "enterprise generated" information as distinct entities with very little overlap.
My, how the picture has changed.
Enterprise-generated information is becoming much less important. User-generated information is starting to dominate.
Very often, that information is handed over to an enterprise (e.g. banks, hospitals, etc.) to store and manage on the individuals' behalf -- and with very clear guidelines around accountability, ownership, etc.
But look at the center overlap -- the aqua blue. That's the growing area where it's not really clear who owns what -- is it the user, or the enterprise? And if you're increasingly uncomfortable with Google or Facebook handling your personal information, I'd offer that the "overlap" is where we're going to see growing controversy.
Who owns this content -- me, or my company? Who's responsible for this content -- me, or my company?
If I should be so lucky to come up with some really cool intellectual property on this blog, is it mine, or my company's?
I don't think that *anyone* has clear answers to any of this -- but social media has unleashed an "overlap content" information beast that puts all sorts of interesting questions on the table.
And that's just the tip of the iceberg -- isn't it?
Not only will there be 67x as many information objects, but close to half of the information will come with some sort of responsibility, up from approximately 30% today.
One interpretation is that information is increasing in value, hence that value will need to be guarded in some form.
This also can be seen as a forcing function on how we think about this topic -- newer forms of information governance, risk assessment, measuring compliance, etc. -- not to mention the need for new technological approaches.
And if you're involved in some aspect of information security today, your prospects seem bright in terms of demand for your skills :-)
If information is valuable, you don't want to lose it -- just like you don't want to lose money.
If you take the forecasts at face value, roughly half the digital universe will be inadequately protected against loss.
Another reason to start thinking in terms of information governance, classifying information, etc. -- that is, unless you're happy with the prospect of having a growing number of really bad days -- or spending an inordinate amount of money.
And now for something stunningly obvious, once you think about it ...
The blue curve estimates a rough cost-per-gigabyte number over the next ten years.
Good news, right? -- the cost of storing information is coming down.
Until you consider the amount of information being created.
Make things cheaper -- people tend to use more of it. If you build more highways, you'll end up with more traffic.
In fact, one could argue that the rapidly declining costs of storage, network, compute, etc. are fundamentally enabling the information explosion.
A sobering thought for those of you obsessed with driving costs out of IT. When IT gets cheaper, people end up using more of it, and demand often grows faster than prices fall.
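The highway effect can be made concrete with one more rough calculation: how fast would cost-per-gigabyte have to fall just to keep total spend flat? This sketch assumes (my simplification, not the study's) that spend is simply volume times cost-per-gigabyte, and that volume grows by the 44x factor discussed earlier:

```python
volume_growth = 44   # from the study: 44x more information over the decade
years = 10           # assumption: ten yearly periods, 2010 to 2020

# Cost per GB (as a fraction of today's) at which total spend stays flat.
breakeven_cost_factor = 1 / volume_growth

# Steady yearly price decline needed to reach that level in ten years.
annual_decline = 1 - breakeven_cost_factor ** (1 / years)

print(f"Break-even cost per GB: {breakeven_cost_factor:.1%} of today's")
print(f"Required price decline: about {annual_decline:.0%} per year")
```

On those assumptions, unit costs would need to drop by a bit over 30% every year for a decade just to stand still. Any slower than that, and the total bill goes up even though each gigabyte is cheaper.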
There's a lot to consider and debate here. Whether you agree or disagree with the study -- its purpose, methodology, findings, conclusions, etc. -- there is no arguing that something is happening, and it's happening very quickly.
Have fun :-)