One of the more memorable sessions at this year's EMC World was when Paul Maritz took the stage right after Joe Tucci to walk through the rationale and motivations behind Pivotal.
I'm sure more than a few people walked into the keynote session wondering what this Pivotal thing was all about. Not only did Paul do a masterful job of explaining the concepts, he also made a compelling case that a platform like Pivotal must exist.
The entire video appears here; Paul starts at 32:00 or so. Very much worth watching in its entirety if you have the inclination.
For everyone else, I thought I'd attempt to summarize some of the core ideas in an easy-to-digest form -- although Paul is a hard act to follow.
Pivotal is a new business entity within the EMC federation. Formed from assets owned by EMC and VMware, it hits the ground running with ~1,300 employees, several successful predecessor products in the marketplace, and a long list of enthusiastic customers.
Pivotal's mission is simple: to create a new kind of platform for a new kind of world -- the world of big data.
Let's Begin With The Three Platforms ...
You're starting to hear frequent mention of "the third platform". It's turning out to be a useful shorthand for the sum total of recent IT developments -- and the best way to explain it is to recap the previous two platforms.
The first platform was essentially mainframe computing: highly centralized, tightly managed and controlled, etc. And there's still a lot of that out there.
The second platform is more familiar -- it's client-server computing in all its forms. Enter distributed systems, smart personal devices and all the rest.
The third platform is described here as "consumer grade" -- all those rich application experiences we've come to know and love on the internet – and everything beneath the tip of that iceberg.
Historically speaking, the first platform was mostly targeted at financial and transactional applications -- just one of the historical reasons why IT so often reports to finance.
The second platform was mostly about automating legacy paper-based processes: ERP, CRM, email, personal productivity, workflows, etc.
The third platform is about entirely new experiences for users, and entirely new business models built around data itself. These consumer-grade experiences were pioneered by the familiar internet giants; they are now well on their way to becoming de-facto capabilities for the enterprise as well.
Paul makes an explicit point here: both the data layer and the infrastructure layer behind this consumer-grade model are evolving as well.
Hardware has evolved from mainframe to servers and PCs to cloud -- the new infrastructure. Paul emphasizes that "cloud is more about what you do than where you do it". (+1)
The data fabrics are quickly evolving as well -- from familiar ISAM through relational databases to the newer data fabrics that rely exclusively on scale-out architectures.
Learning From The Pioneers
Paul backs up a bit, and shares that -- when Google wanted to index the entire internet -- a second-platform approach just wouldn't cut it: it wasn't feasible technologically or economically.
They -- and others like them -- basically had to invent new architectures for the task at hand.
He says there are three key capabilities that all of them needed -- and that most enterprises will need in the future.
The first is simple: the ability to store, analyze and reason over an extremely large amount of data (multi-petabyte scale), and to do so cost-effectively. The second is the ability to create new application experiences very quickly.
Paul shares the familiar Facebook story: if you're a new developer, you're encouraged to put your first application into production on your first day of work. Eye-rolling aside, the anecdote emphasizes the speed and agility demanded in newer consumer-grade IT platform models.
Finally, their consumer-grade user experiences demanded a highly automated cloud that operates at considerable scale. And every successful internet giant has learned to do those three things quite well.
Paul makes a key point: enterprise IT is going to have to learn to do some of the same things that the internet pioneers have learned how to do – and they’re going to need a platform to do it with.
Here Comes The Internet Of Things
It's not enough to simply recreate the capabilities of consumer-grade internet companies; there's an entirely new wave at hand -- the internet of things: hundreds of billions of connected, autonomous devices, each producing and consuming information prodigiously.
He shares the familiar example from GE: a single transatlantic flight produces ~30TB of engine data that needs to be stored, analyzed and mashed up against many other data sets.
Paul describes this as a "two orders of magnitude" problem: 100 times more devices, 100 times more data than previously envisioned. Big numbers indeed; and even more justification to move beyond the familiar second platform.
But there are more challenges at hand: the ability to ingest these data streams, react to them, and make decisions in near-real time. For example, if you're familiar with how real-time ad brokering works on the web, that's just a small taste of what's to come.
The Legacy Doesn't Ever Go Away
One challenge that is unique to enterprise IT is the vast landscape of existing first- and second-platform applications that have to interact with these newer "third platform" entities: mostly as important data sources, but occasionally as recipients of decisions or transactions made externally.
While it’s nice to think about a “greenfield” environment for data and applications, that’s turning out to be the exception vs. the rule, especially in enterprise settings. Those applications have to be brought forward – as part of the software-defined data center – to run alongside the newer cloud-scale analytical applications.
Cloud Is The New Infrastructure
Twenty years ago, the way we achieved portability was to write to POSIX interfaces. The promise was simple: to move to a new processor architecture or operating system variant, you simply recompiled your code -- lock-in avoided.
In today’s cloud world, there’s a strong motivation for application platforms that are cloud-agnostic: run on your own private cloud, or any number of external cloud services. Paul makes it clear: not everyone will have the justification to go build their own clouds – but everyone will want to be able to choose and intermix freely between the services that are available.
Tying It All Together
The mission is now becoming clear: how do you create a platform that ties together these five core capabilities, and enables enterprises to create the “consumer-grade” internet experiences going forward?
That’s the motivation behind Pivotal One.
Put this way, the logic behind Pivotal is almost inescapable.
The New Platform: Pivotal One
If enterprises are going to create a new generation of cloud-agnostic, analytically powered applications, they're going to need a platform to do it on.
Starting with key technology contributions from both EMC and VMware, Pivotal now has about 1,300 employees "beavering away" at putting the finishing touches on their first platform.
Paul described the approach as “strongly anchored in open source” – use enterprise-grade distributions of open source functionality when and where appropriate, or layer in more proprietary components on an as-needed basis.
The slide describes it as a “data-centric” platform, but that verbiage misses what’s unique and special about creating a new generation of applications using big (and fast) data.
Multi-cloud support is a given going forward -- no one wants to re-architect an application just to change cloud providers.
Pivotal One is targeting the needs of two key constituencies – developers and the enterprise. Developers (of course) need the ultimate in flexibility and functionality – but the enterprise needs to be able to run their business on the result.
Paul describes Pivotal One in three pieces. First, a cloud fabric that abstracts resources from the underlying cloud provider and operational model. Second, a data fabric that can handle both big data and fast data. And finally, an application fabric that enables great productivity and reusability.
Diving Into The Data Fabric
Paul states unequivocally that the move to the third platform is changing how we think about ingesting, transacting and querying data at massive scale.
The new design pattern? It’s scale-out objects, best represented in the form of Hadoop and HDFS.
The one thing we’ve learned about data, according to Paul, is that we get the best value from it when it’s all stored in one place, and not balkanized across different systems, geographies, applications, tool chains, etc.
For this reason, more customers are giving careful thought to where they're going to land as much of their data as possible: you'll hear people talking about "data lakes", "data pools" or "data landing zones" along these lines.
And those data landing zones are going to get very large indeed, not only with the familiar corporate data, but the coming wave of machine-generated information.
These new repositories will need three key attributes. First, of course, they'll need to be able to get very, very large -- hundreds of petabytes, potentially. Second, they will have to be extremely cost-effective. And third, they will have to be highly reliable.
Indeed, it’s this need for a data substrate that’s driving the focus on HDFS – not just for Hadoop, but for many other tasks in this new world.
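To make that "data substrate" idea concrete, here's a minimal sketch of what landing data in HDFS looks like from Java, using the standard Hadoop FileSystem API. The namenode address and file path are hypothetical placeholders, not anything specific to Pivotal HD.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class LandingZoneWriter {
    public static void main(String[] args) throws Exception {
        // Point the client at the cluster's namenode (address is hypothetical)
        Configuration conf = new Configuration();
        conf.set("fs.defaultFS", "hdfs://namenode.example.com:8020");
        FileSystem fs = FileSystem.get(conf);

        // Land a raw machine-generated feed in the shared landing zone;
        // HDFS handles replication and scale-out placement underneath
        Path target = new Path("/landing/sensors/engine-feed-001.csv");
        try (FSDataOutputStream out = fs.create(target)) {
            out.writeBytes("timestamp,sensor_id,reading\n");
        }
        fs.close();
    }
}
```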
But to provide the necessary semantic processing, it's not enough to have scale-out storage -- we need scale-out in-memory data fabric technology as well. Paul makes the point that a single computing instance with a large in-memory database won't suffice -- we'll need hundreds, possibly thousands, to do the work at hand.
Part of the motivation is the clouds themselves – they’re making it very easy to simply request many hundreds of memory spaces (and their compute instances) to “reason over” very large amounts of information.
That horsepower will be needed over the entire lifecycle of information: ingesting massive amounts of information in real time, scrubbing and filtering it, analyzing it for key insights and patterns, and then ultimately driving transactions based on the resulting decisions.
It’s a new style of application – one that requires a new approach.
Paul points to two asset examples, one from EMC and one from VMware. Greenplum, for example, brings over ten years of scale-out, parallel-query expertise, now carried into the Hadoop environment. And GemFire brings in-memory transactional expertise, having cut its teeth (among other places) in Wall Street automated trading platforms -- some of the most demanding applications on the planet.
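For a feel of what programming against such an in-memory fabric looks like, here's a minimal sketch using GemFire's standard Java client API. The locator address and region name are hypothetical, and a real deployment would involve considerably more configuration.

```java
import com.gemstone.gemfire.cache.Region;
import com.gemstone.gemfire.cache.client.ClientCache;
import com.gemstone.gemfire.cache.client.ClientCacheFactory;
import com.gemstone.gemfire.cache.client.ClientRegionShortcut;

public class TradeCacheClient {
    public static void main(String[] args) {
        // Connect to the distributed fabric through a locator
        // (host, port and region name are hypothetical)
        ClientCache cache = new ClientCacheFactory()
                .addPoolLocator("locator.example.com", 10334)
                .create();

        // A region behaves like a Map, but its entries are partitioned and
        // replicated across many member JVMs rather than held in just one
        Region<String, String> trades = cache
                .<String, String>createClientRegionFactory(ClientRegionShortcut.PROXY)
                .create("trades");

        trades.put("T-1001", "BUY 500 XYZ @ 42.17");
        System.out.println(trades.get("T-1001"));

        cache.close();
    }
}
```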
Paul then points to the first release of technology in support of the Pivotal One platform -- Pivotal HD. In a nutshell, a key component (HAWQ) enables massively parallel querying of HDFS data using familiar, standardized SQL -- essentially bridging two worlds.
Paul also put up an interesting competitive slide showing the expertise Greenplum has brought to the table, comparing SQL query performance against other alternatives attempting the same thing. Not only is HAWQ often an order of magnitude faster, it implements the full SQL standard and not a convenient subset.
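Because HAWQ is Greenplum-derived -- and Greenplum speaks the PostgreSQL wire protocol -- existing SQL tooling should largely just work against it. Here's a minimal sketch using the stock PostgreSQL JDBC driver; the endpoint, credentials and table are hypothetical.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class HawqQueryExample {
    public static void main(String[] args) throws Exception {
        // Endpoint and credentials are hypothetical placeholders
        try (Connection conn = DriverManager.getConnection(
                "jdbc:postgresql://hawq-master.example.com:5432/analytics",
                "analyst", "secret");
             Statement stmt = conn.createStatement();
             // An ordinary SQL aggregate -- but executed as a massively
             // parallel query over data files stored in HDFS
             ResultSet rs = stmt.executeQuery(
                 "SELECT sensor_id, AVG(reading) AS avg_reading " +
                 "FROM engine_telemetry GROUP BY sensor_id")) {
            while (rs.next()) {
                System.out.println(rs.getString(1) + " -> " + rs.getDouble(2));
            }
        }
    }
}
```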
The Application Fabric
The centerpiece here is the ubiquitous Spring framework for which VMware has acted as caretaker for several years.
Not only does this bring approximately 1 million enterprise Java developers into this world, it also brings a very rich library of community-developed connectors and adaptors back into legacy “second platform” environments, helping to source data into the third platform.
Pivotal is extending Spring in several areas, the most important of which is the addition of new, higher-level analytical services that can be directly called by application developers.
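Pivotal hasn't detailed those new analytical services here, but the underlying appeal is easy to illustrate: long-standing Spring idioms like JdbcTemplate let enterprise Java developers reach the new data fabric with code they already write every day. A sketch, reusing the hypothetical HAWQ endpoint from above:

```java
import org.postgresql.ds.PGSimpleDataSource;
import org.springframework.jdbc.core.JdbcTemplate;

public class SpringDataFabricExample {
    public static void main(String[] args) {
        // The same hypothetical HAWQ endpoint, wrapped in a DataSource
        // so that Spring manages connections on our behalf
        PGSimpleDataSource ds = new PGSimpleDataSource();
        ds.setServerName("hawq-master.example.com");
        ds.setDatabaseName("analytics");
        ds.setUser("analyst");
        ds.setPassword("secret");

        // JdbcTemplate is plain, familiar Spring -- no new programming
        // model is required to query the scale-out data fabric
        JdbcTemplate jdbc = new JdbcTemplate(ds);
        Long flights = jdbc.queryForObject(
                "SELECT COUNT(DISTINCT flight_id) FROM engine_telemetry",
                Long.class);
        System.out.println("Flights analyzed: " + flights);
    }
}
```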
The Cloud Fabric
Sitting underneath all of this is the cloud fabric, which Paul eloquently describes as “the operating system for the cloud age”. The goal is to abstract the underlying cloud implementation so that cloud choice becomes “a deployment decision instead of an architectural decision”: private cloud, vCHS, Amazon, OpenStack, or others.
Pivotal One's cloud fabric also adds a standardized service registry to the mix, as well as lifecycle management tools for complex application stacks.
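To see why "a deployment decision instead of an architectural decision" matters in practice, consider how Cloud Foundry -- which this layer evolves from, as noted below -- handles service binding today: bound services are surfaced to the running application through a VCAP_SERVICES environment variable, so endpoints are discovered at deploy time rather than baked into the code. A minimal sketch:

```java
public class ServiceBindingExample {
    public static void main(String[] args) {
        // Cloud Foundry injects a JSON description of every bound service
        // (databases, caches, message brokers) into VCAP_SERVICES at
        // deploy time; the application discovers endpoints here instead
        // of hard-coding them for any particular cloud
        String services = System.getenv("VCAP_SERVICES");
        if (services == null) {
            System.out.println("Not running on a Cloud Foundry-style fabric.");
        } else {
            // A real application would parse this JSON and wire up its
            // connections; here we just show that the registry travels
            // with the deployment, not with the source code
            System.out.println(services);
        }
    }
}
```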
Paul tells a fun story about how the team building this layer was also responsible for the Borg project at Google. The one thing that the team was adamant about is that – if you’re a cloud service – you can never, ever go down for any reason. Updates have to happen in place, etc. Their philosophy was “eliminate the human” and ruthlessly automate for any conceivable scenario.
Paul also clearly stated that this layer needed to be extremely open – not only its interfaces, but its accessibility in source code form to the broader industry and community.
This layer, of course, is an evolution of the familiar Cloud Foundry PaaS environment -- not only open, but open source as well (Apache 2 license).
Bringing The Pieces Together
Paul unequivocally stated that the first release of Pivotal One would be available in Q4 2013 – not that far away in the bigger scheme of things.
But he also pointed out that people didn't have to wait if they wanted to get started -- many of the components are already available today: Pivotal HD from Greenplum, GemFire, Spring, Cloud Foundry, Pivotal Labs' methodologies, etc. He was also clear that -- while the components behind Pivotal One are intended to integrate well -- each component should be able to create value on its own and interact freely with other layers.
Paul also spent some time describing what Pivotal Labs does -- "a group of 250 extreme developers", as he puts it. He shares the story that, for quite some time, they were the only outside development firm that Google chose to work with.
Their model is distinctive: "extreme programming". If you want them to build an application on your behalf, you have to commit to sending someone onsite for an extended period, and they build an extended development team around the client and their domain expertise.
Paul shares the view that -- in the mobile world -- enterprises are starting to think more like venture capitalists. Mobile apps are directly visible to your customers. You want the job done right, and done quickly -- as opposed to looking for the cheapest way to get something done. And that's where Pivotal Labs earned their stripes.
Clients don't engage with Pivotal Labs only to get a great app; they also want to see what modern mobile app development looks like in 2013 in Silicon Valley. Part of the mission is skills transfer, so Pivotal Labs' clients can do this for themselves going forward.
Part of The EMC Family
Pivotal can also be thought of as the third member of the EMC federation: classic EMC, VMware and now Pivotal -- each with the freedom and flexibility to address their customers' needs as required. EMC is the majority owner; VMware owns 30%.
Paul ended on the rationale behind GE's interesting equity investment in Pivotal. GE is a company better known for its proficiency in "heavy metal" (industrial technologies) -- so why would it take a big stake in a software platform company, and an entirely new one at that?
The reason is simple: they see the opportunity to capitalize on the transformation happening in the industry at large: sensors everywhere, powerful analytics delivering deep insight, and the potential to transform not only GE’s business model, but those of their customers as well. Conversely, they also see the potential for being negatively disrupted if the world moves forward and GE clings to its familiar business model.
They needed a platform to do this – and the investment in Pivotal makes logical sense when viewed this way.
I think Paul did an excellent job explaining the rationale behind Pivotal – and why the world needs a new platform for a new era.
There is a wealth of powerful ideas and supporting technology that’s been quickly brought together into an operating model that has all the advantages of a large technology vendor, but the nimbleness and independence of a smaller startup.
For me, it’s a very compelling story.
And, based on anecdotal evidence, I think our customers and partners are equally enthralled and intrigued by the potential of Pivotal.
It’s something you can believe in.