And there’s a lot to talk about as a result …
What's This All About?
In a nutshell: big data driving a new generation of data computing applications using a private cloud model. Let's take these one at a time.
You've probably heard the term "big data" before: it refers to analytical applications that depend on billions of facts and hundreds of terabytes -- more often petabytes -- of data.
To get meaningful value from any mountain of data, you need not only massive storage, but massive compute and memory as well. Hence the moniker "data computing applications" -- notably different from traditional HPC (high performance computing) or technical computing environments.
These "big data" applications will likely want an environment that's built on dynamic and virtualized pools of compute, memory and storage. They also need tools that create a self-service environment for power users.
Simply put, data computing is a great use case for a private cloud.
The EMC / Greenplum Story
EMC and Greenplum have been working together with shared customers for quite some time. We know them well, and they know us and our technology.
Greenplum brings two key things to the table -- a new architectural model as well as a new consumption model.
Their architectural model may be familiar: an x86-based, scale-out MPP, shared-nothing design that not only delivers 10x-100x the price/performance of traditional approaches, but is also well-placed to ride the current wave of mainstream technology evolution.
More on the underlying trends in my post from a while back; see "The Coming Revolution In Business Analytics".
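For readers who haven't run into the "shared-nothing MPP" idea before, here is a rough Python sketch of the general pattern: rows are hash-distributed across independent segments, each segment aggregates only its own slice, and a coordinator merges the partial results. To be clear, this is a toy illustration and not Greenplum's actual implementation; the function names, the sample rows and the choice of CRC32 as the distribution hash are all assumptions made up for the example, and the segments run one after another here rather than in parallel.

```python
import zlib
from collections import defaultdict


def segment_for(key_value, num_segments):
    """Pick a segment for a row by hashing its distribution key (CRC32 is an assumption)."""
    return zlib.crc32(str(key_value).encode("utf-8")) % num_segments


def distribute(rows, key, num_segments):
    """Hash-distribute rows so each segment owns a disjoint slice of the data."""
    segments = [[] for _ in range(num_segments)]
    for row in rows:
        segments[segment_for(row[key], num_segments)].append(row)
    return segments


def local_sum(segment_rows, group_key, value_key):
    """Work done on one segment: aggregate only the rows that segment holds."""
    partial = defaultdict(float)
    for row in segment_rows:
        partial[row[group_key]] += row[value_key]
    return dict(partial)


def merge(partials):
    """Work done by the coordinator: combine the small per-segment partial results."""
    total = defaultdict(float)
    for partial in partials:
        for group, subtotal in partial.items():
            total[group] += subtotal
    return dict(total)


# Hypothetical sample data, invented for this sketch.
rows = [
    {"customer_id": 1, "region": "east", "amount": 10.0},
    {"customer_id": 2, "region": "west", "amount": 5.0},
    {"customer_id": 3, "region": "east", "amount": 20.0},
    {"customer_id": 4, "region": "west", "amount": 7.5},
]

segments = distribute(rows, "customer_id", num_segments=3)
partials = [local_sum(seg, "region", "amount") for seg in segments]
totals = merge(partials)
print(totals)
```

The point of the pattern is that the heavy scan happens where each slice of data lives, and only small partial results travel to the coordinator, which is why adding more segments adds capacity.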
Off-the-shelf, the Greenplum approach appears to be a prescient fit for newer, pre-integrated virtualized environments like Vblocks.
Down the road, there's the tantalizing potential to move data intensive operations closer to the storage (remember, they're all x86 these days) and compute intensive functions to the server farm.
What Happens Next?
All integration activities have to wait for approvals and closing, but we're making some pretty clear statements about how we plan to integrate Greenplum.
This new group will be led by Bill Cook (CEO of Greenplum) and report directly to Pat Gelsinger, who leads EMC's Information Infrastructure business.
This was largely the same approach we used when we acquired Documentum (now known as IIG), RSA (now the information security division at EMC) and more recently DataDomain (now the core of EMC's Backup and Recovery Solutions group).
The Greenplum field force will become the nucleus of a new, dedicated specialty sales force, in much the same way that we've done for information intelligence, information security and backup/recovery. Existing Greenplum customers will be largely working with the same folks; newer EMC/Greenplum customers get access to unmatched expertise in this area.
The Greenplum use case is an obvious place where EMC can add value through integration -- not only by creating EMC Proven Solution reference architectures to speed deployments and maximize value, but also through progressive integration with the rest of EMC's portfolio.
Many Targets For Value-Added Integration
If you think about it, we've got a very long list of candidate technologies that might be interesting to consider integrating in the future:
- All of EMC's storage products are x86 based -- this creates a potential pathway where data intensive functions could be run closer to the information, freeing the compute farm to do what it does best.
- Enormous data warehouses also need to be backed up, archived and otherwise protected -- although the concerns and priorities are usually somewhat different than traditional OLTP applications.
- The vast majority of these data warehouses contain sensitive information and produce analysis that is either confidential or otherwise privileged. Think information security and data loss prevention, for example.
- Much of the higher-order analysis produces rich content that frequently drives a collaborative workflow among knowledge workers. Think about EMC's assets in content management, collaborative workflows and case management.
- And, finally, let's not forget the seductive appeal of running on-demand business analytics as yet another fully virtualized workload using dynamic resources in a private cloud model. Like running on a good-sized Vblock, for example.
As we've tested this acquisition with customers and partners, some of the same questions tend to come up. I thought I'd get ahead of the curve, and share some of the more common Q+A here.
The first general set of questions usually comes up around this being some sort of competitive response to Oracle. People see that Oracle is building a closed stack around data management (Exadata), and want to know if this is intended to be a direct competitor.
Well, it isn't.
Greenplum is not a general purpose database like Oracle is -- it's specifically optimized for large-scale data warehouse and business analytics using a legacy-free approach.
Not everyone needs an advanced platform like Greenplum to gain a competitive advantage -- but more and more customers do. And, in these environments, the Greenplum environment complements the investments they've already made in fact-gathering transactional applications and databases.
The second general set of questions usually comes up around EMC's other partners and alliances in this space: names like Sybase and SAP and Microsoft and ParAccel and a host of others.
Those relationships don't change.
No one technology can meet the entire market's needs -- which is why EMC partners deeply with many ostensibly competing technology vendors. One similar example is hypervisors -- although we're deeply invested with VMware, we've also made considerable investments in Microsoft's Hyper-V as well as the Citrix/Xen/KVM offerings.
And there are many more examples like this to point to if needed -- business as usual.
The third general set of questions has to do with EMC's overall expertise in database and data warehousing environments. Do we have enough smart people to bring to the table?
I think this set of questions has more to do with the incorrect perception that EMC is somehow getting into the general-purpose database and data mart business. That's certainly not the case. Greenplum helps create very specialized business analytics platforms that operate at considerable scale. This isn't a mass-market IT opportunity.
Framed that way, the answer is "yes, we do".
Finally, there will be the inevitable discussions around "lock in": the idea that customers wanting to use one part of the EMC portfolio will somehow be "forced" into using another. This hasn't been true in the past, and it's unlikely to change in the future.
Take an arbitrary EMC product. Go look at all the different products and use cases it supports -- some from EMC, some from our competitors. You'll notice a consistent picture of "customer chooses".
Some people will look at this transaction, and wonder what the heck we might be up to.
Others will look at our long-standing corporate tag line, and more fully appreciate the rationale.
EMC: Where Information Lives