An interesting serendipity. EMC goes long on Big Data at EMC World.
And the McKinsey Global Institute releases a stupendous landmark tome entitled "Big Data: The Next Frontier For Innovation, Competition and Productivity". For free - no reg required.
This document is making the rounds at warp speed in my circles. At least ten people have sent me a copy, usually raving about it. I read it at length (warning: it's long), and I have now also turned into an ardent fan.
If you care anything at all about Big Data (or might be just a bit curious), or -- as a leader, you care about "innovation, competition and productivity" -- make the time to go read this paper in its entirety. Not the short summary, the full-boat longer version.
It uniquely captures the breadth and depth of the opportunity in front of us collectively. It does a yeoman's job of quantifying the economic benefits in five interesting use cases. It offers pragmatic advice for both organizational leaders and policy makers everywhere. And it has a useful and easy-to-understand glossary of big data terms and concepts.
If I could make it required reading to pass the course, I would.
Please, make a few hours in your busy life and go read this PDF (it's at the bottom of the page).
(disclosure -- a lot of people on Twitter wondered if EMC had somehow sponsored this work. Fair question. The timing, the talk-track -- I mean, it's almost eerie. But, alas, that's not the case to the best of my knowledge.)
It's that good.
My personal congratulations to the McKinsey team for a landmark intellectual contribution on the topic. For the ability to express compelling technology in business terms that anyone should be able to understand.
And, especially, for the wisdom to make it freely available to all.
More could learn from your stellar example :)

As an infrequent but interested reader of your blog, I think this is an important study. But I cannot help but feel this study sort of begs the question: where are the enterprise information models which can make use of the big data explosion in each industry? While the big data explosion is significant, the ability of an enterprise to turn this into information is critical. After all, information equals high quality data + meta data + data context. While this study does go into some detail on some of the challenges to building these models -- expertise, data quality, intellectual property, technology and investment -- it does not explain how imperative it is to build these enterprise information models. EMC, as a leader in this field, needs to focus on this challenge as well as the areas of data interoperability, analytics and contextual models from mobile devices. In this way, EMC could make good on the long-held "where information lives" tagline.
Posted by: Dave Hopkins | May 18, 2011 at 12:49 AM
“We are drowning in information, but starved for knowledge.” John Naisbitt
In 2010, businesses and individuals created 1.2 zettabytes of data. The volume of data is growing incrementally year after year. Yet, in many cases, this amount of data brings little or no value to the business: data are produced, stored and managed but a lot of these data remain unused in the decision process in enterprises.
Businesses too often fail to put these data to work in their decision process. Managers prefer to manage by gut feeling. Yet today’s data are the basis for decisions about tomorrow.
Companies are now able to make decisions based on real-time data, while at the same time keeping these decisions in line with the long-term strategy.
But it’s not only about technology. Data management is a management philosophy. Companies need to make data analysis part of the corporate culture and data analysis needs to reflect the corporate culture.
Big data is one of the big terms in a CIO’s life these days.
Posted by: Philippe Gosseye | May 18, 2011 at 02:43 AM
Hi Dave -- thanks for the comment.
Your comments around the necessity of enterprise information models, rich and consistent metadata, well-established processes, etc. are all spot on.
However, there seems to be a sort of gold rush going on.
The people I meet are highly motivated by their data: the more of it, the more diverse sources, the more external vs. internal, the "dirtier" it is -- the better.
It's hard to make the case that all of that should come to a screeching halt while the next-gen enterprise information models, etc. are fully considered.
My guess is that experimentation will lead to insights which then eventually become codified and then standardized. The innovative data scientists by then will have lost interest and will be dutifully exploring even more data sources.
Thanks again
-- Chuck
Posted by: Chuck Hollis | May 18, 2011 at 09:11 AM
If my comment suggested that the big data explosion in an enterprise needed to "come to a screeching halt" in order to build the information models necessary to use this information, then I apologize. I was simply trying to point out that information models are needed that can pull together diverse sets of data to create more valuable information for the end user. Many businesses derive quite a bit of valuable information from doing this on a business unit basis. This is where traditional business intelligence applications have been effective. With the big data explosion and the correct models and tools, the opportunity to do enterprise intelligence for the end user can provide companies with another leap in productivity.
Posted by: Dave Hopkins | May 18, 2011 at 11:20 AM
Dave
I think we're basically saying the same thing.
The front end of the activity ought to be mostly about experimentation and innovation.
The middle of the activity ought to be around standardization and formalization.
And the back-end of the activity ought to be the promotion of the results (and new ways of thinking) to more broadly empower more of the organization.
Did I capture that right?
-- Chuck
Posted by: Chuck Hollis | May 18, 2011 at 01:58 PM
Chuck, I believe this is correct, but let me add a little more color. The front-end activity is focused on model creation and testing of those models. The middle of the activity is focused on model validation, which includes data governance and data cleansing. The back-end activity is focused on optimizing the model flow and promoting the models to users in the field. The exciting thing about all of this is that the next generation of platforms (VCE, Greenplum Cloud Foundry, SourceOne, etc.) is making each of these steps possible.
Posted by: Dave Hopkins | May 18, 2011 at 08:13 PM