Yes, I admit it. I am caught hook, line and sinker on the transformative potential of big data analytics to not only change business, but change the world we live in. It grows more heady and more intoxicating with every passing week.
Red pill, blue pill ...
But there's another important aspect to the big data discussion that shouldn't be ignored -- and that's the plethora of big data models that aren't entirely based on advanced analytics, but are still strategically important big data applications in their own right.
Our cohorts at Isilon have done a masterful job at accelerating value for our customers in this space, which occasionally gets overshadowed by the flashier topics.
So, in the interests of inclusiveness, I thought I'd share with you the other side of the big data story.
Big Data, Big Picture
Step back far enough, and almost all big data models start to look awfully similar.
You always have a small cohort of highly-valued knowledge workers who are creating the core value that powers the proposition. You always have a rather specialized set of tools and associated workflows around these experts. And you always have a lot of data, with more arriving daily.
The "big" part of "big data" deconstructs into two stark realities.
First, the more data is available (and the faster it is served up), the better the model works -- up to the point where the knowledge workers who are consuming it can't really consume any more.
Hint: that hasn't happened yet :)
Second, because of the statement above, you're always going to be pushing beyond the limits of general-purpose IT technology, and inevitably develop a strong interest in architectures that align with the needs at hand.
If your proposition is creating predictive models using all available data, there's a branch in the discussion that leads to data scientists and the environments they need to work their magic.
But if your proposition involves creating other "information products" around highly-specialized knowledge workers, the discussion branches into more familiar forms of expertise, and the technologies they need to work their unique magic.
Examples Are Everywhere
This thought was triggered by a recent press release announcing that Jaguar Land Rover has elected to standardize on Isilon's scale-out NAS technology to support its critical simulation operations, as well as CAE (computer-aided engineering) and HPC (most likely used to support the simulations).
As everyone knows, the car business is incredibly competitive. Lots of good companies, lots of good products. The race to build a better car -- and to do so faster, better, etc. -- means that car companies have a strong incentive to invest in technology environments that maximize the productivity of their engineers. One aspect of the resulting approach is the need to support very large datasets, make them easy to consume for the engineers, and deliver them at high performance to support the simulation activities.
Simple business logic results in a strong demand for an Isilon-style storage solution.
Or consider this seemingly unrelated announcement that the Associated Press (AP) has standardized on Isilon for their high definition production requirements. Yes, the media business requires lots of video content, but the "magic" is in the editors and producers who can take zillions of hours of raw video, and end up with compelling content for us all to watch.
Again, a small cohort of highly-prized specialists who weave their magic, augmented by enormous amounts of fast storage and processing power.
Yet another example comes from Ambry Genetics, whose laser focus on gene sequencing creates the same sort of model: a small team of experts empowered by great gobs of technology.
And many, many more examples when you go looking for them.
Here's The Point
When considering big data analytics, it's a relatively easy mental exercise to go vertical-by-vertical and come up with a half-dozen extremely compelling use cases that could be enabled by data science, data scientists and supporting infrastructure.
It's relatively new, and it's certainly extremely cool.
But the generic model of using big data -- in its broader form -- to support advanced knowledge workers is nothing new. Energy exploration. Weather modeling. Engineering simulations. Media and entertainment. Drug research.
No, these aren't the sexy web-scale companies that are the darlings of the media; but they are pretty cool nonetheless when you get to understand what they do.
This much broader model is just about everywhere once you go looking -- and has been for a while.
In one sense, the notion of creating incredible value from smart people working with enormous amounts of data is not really that new.
The only thing that's new is the much broader applicability.
Convergence For Big Data Storage?
Taken from the point of view of the technology architect, these are interesting times -- especially when considering storage, or -- more accurately -- storage services that support the business.
You may end up having to support a variety of "big data" models in your enterprise.
Over here, maybe some engineers or other experts running simulations. Over there, maybe a new batch of data scientists creating predictive analytical models. Not to mention the day-to-day mountains of more ordinary knowledge worker content that ends up on filesystems -- a more prosaic form of "big data", not to be ignored.
One of the things that I'm starting to really appreciate about the Isilon proposition is that it does a superior job of meeting all three requirements using a simple, scale-out architecture and an extremely simple management and operational model.
Simplicity is good, especially at scale :)
Strong credentials in industry-specific use cases involving more traditional big data / big content models. Brand new credentials in the world of data analytics, especially with HDFS support for Greenplum UAP and Hadoop use cases. Growing credentials in the general-purpose enterprise content space as well.
All on one shared, optimized technology platform. Certainly worth considering.
I've made the connection -- at least in my head.
I wonder how long it will be before our customers see the same?

nice article Chuck. Stumbled into this searching for use cases for Data Science in enterprises. What are some of the use cases in an enterprise like EMC, or any other hi tech companies for using this?
Posted by: Sam Poozhikala | February 24, 2012 at 12:11 AM
The power of being able to process large amounts of complex data efficiently is critical. I recall so many times working on many CAE projects, in the past, having to run far fewer simulations, and other processes, than really warranted & desired simply due to the inefficiencies involved. In cases that related to medical devices this was always concerning.
Such advances today with Big Data are a true marvel. Thanks for this eye-opening article.
Posted by: Ken Carroll | February 28, 2012 at 07:54 AM
Hi Chuck. Great to see you writing about this. We are all in - hook, line, sinker and big fish for Big Social Data too. Hope you've registered EMC for our Social Business Index? It will blow your mind. www.socialbusinessindex.com
Coming to SXSW?
Posted by: twitter.com/ITSinsider | March 02, 2012 at 04:38 PM
Again, a small cohort of highly prized professionals who work their magic, enhanced by huge amounts of fast storage and processing power.
Posted by: חברות השמה | April 05, 2012 at 11:05 PM