Most IT people by nature are reasonably and justifiably risk-averse. And, if they weren't that way when they started their careers, they certainly become that way over time.
When the brown stuff hits the fan, it's their phone that rings. Fast moving business people aren't always aware of what a really bad IT day can be like, and they expect the IT team to essentially de-risk whatever is being discussed.
Put cloud on the table, and the perceived IT risk profile goes up significantly. Put big data on the table, and you'll see a similar anxiety-raising profile. Put them next to each other, and you're asking for an IT Nervous Breakdown.
My assertion is two-fold. First, that's where things appear to be inevitably going for many organizations, and the industry as a whole.
And, second, new strategies -- and new enabling technologies -- will most definitely be required.
The Evidence Is Clear
For those of you who may still think that cloud is just a fad -- and doesn't apply to your IT situation -- I would beg to differ. If the cloud wave hasn't hit you yet, it likely will soon.
Another outcome of cloud-ifying enterprise IT is "workload rightsizing". Invest in running internal workloads more efficiently, and hand off to external service providers what's best done by others.
The "cloud dividend" in terms of cost-savings, better IT and ultimate business agility is now exceedingly difficult to deny. Exceptions do exist, but they're getting harder to find these days.
Now the discussion has started to shift to a compelling use case for cloud thinking: big data.
In a nutshell, the premise behind big data is scarily simple. Many of the newer IT-based value propositions are being built around gathering and leveraging vast amounts of information. More data, more value. It's not a problem, it's an opportunity.
Indeed, it's the rare industry where I can't point to at least a few big data examples that would get any business leader excited. It's only a matter of time before the discussion moves from niche to mainstream.
Big data risk is distinctly different from cloud risk. With big data, you're essentially creating vast piles of information that have usually been lifted from their original context. Many sources, many uses.
It's almost like piling up a bunch of uranium -- good things can happen, but so can bad things. Put "cloud" and "big data" together, and I can't get away from a mental picture of the challenges associated with shipping nuclear fuel around.
So, try these "blue sky" ideas on for size.
None of these capabilities is really available fully formed today, but I believe they serve as interesting thought exercises around what we'll eventually need. And, as you'll see, there's plenty to keep us technologists busy for the near future :)
The Infrastructure Must Be Able To Assert -- And Prove -- Its Capabilities
Whether it's performance, capacity, availability, recoverability, security, monitoring, etc. -- we'll need generic mechanisms by which infrastructure environment "A" can assert its ability to support workload "B".
Not in just one or two focused areas, but across a significant range, and with considerable depth.
Not only that, but "trust me" won't suffice -- there will need to be independent verification mechanisms (trust brokers) of stated capabilities and compliance.
It's hard enough to envision this where there are people involved in a hands-on manner. But that won't ultimately be enough -- the processes and capabilities will need to be largely automated if they are to succeed at scale.
From an EMC perspective, you can see some preliminary work between EMC, RSA, VMware and Cisco in these areas, but we're far from a complete story. Much to do here.
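To make the idea concrete, here's a minimal sketch of the pattern: an environment publishes capability assertions, and an independent "trust broker" function checks a workload's requirements against what was asserted rather than taking "trust me" at face value. Everything here -- class names, capability names, numbers -- is purely illustrative, not any real product's API.

```python
# Illustrative sketch only: infrastructure asserts capabilities,
# a broker independently verifies them against stated requirements.

class Infrastructure:
    def __init__(self, env_id):
        self.env_id = env_id
        self.assertions = {}          # capability name -> asserted level

    def assert_capability(self, name, level):
        self.assertions[name] = level

def broker_verify(infra, requirements):
    """Independent verification: does each asserted capability meet the need?"""
    failures = [name for name, needed in requirements.items()
                if infra.assertions.get(name, 0) < needed]
    return len(failures) == 0, failures

# Environment "A" asserts what it can do; workload "B" states what it needs.
env_a = Infrastructure("env-A")
env_a.assert_capability("availability", 99.95)   # percent uptime
env_a.assert_capability("iops", 50_000)

ok, failures = broker_verify(env_a, {"availability": 99.9, "iops": 75_000})
# The broker flags the iops shortfall instead of trusting the match blindly.
```

A real version would need cryptographic attestation and continuous compliance checking, not a one-time dictionary comparison -- but the shape of the conversation between environment, workload, and broker is the point.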
The Application Workload Must Be Able To Describe -- And Verify -- Its Requirements
The other half of the equation is a similar set of capabilities on the other end: the ability for arbitrary workloads (or combinations of workloads) to self-describe what they need from the infrastructure, and -- more interestingly -- what needs to happen when there's a state change, e.g. performance or availability levels no longer being met.
The widespread adoption of virtualization (particularly VMware) provides a fascinating and productive "surface" for external labelling of requirements. Ideally, the workload would be able to interact with an external trust authority to establish and validate published infrastructure capabilities.
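A workload's side of that handshake might look something like the sketch below: the workload publishes its own requirements and registers what should happen when an observed level drops below what it declared. Again, all names and the handler mechanism are hypothetical illustrations, not an existing API.

```python
# Illustrative sketch only: a self-describing workload that declares its
# requirements and reacts to state changes (e.g. availability falling short).

class Workload:
    def __init__(self, name, requirements):
        self.name = name
        self.requirements = requirements   # e.g. {"availability": 99.9}
        self.handlers = {}                 # metric -> state-change handler

    def on_state_change(self, metric, handler):
        self.handlers[metric] = handler

    def report(self, metric, observed):
        # Fire the registered handler only when a requirement is violated.
        needed = self.requirements.get(metric)
        if needed is not None and observed < needed and metric in self.handlers:
            return self.handlers[metric](observed, needed)
        return None

wl = Workload("billing", {"availability": 99.9})
wl.on_state_change("availability",
                   lambda got, want: f"migrate: {got} < {want}")

action = wl.report("availability", 99.5)    # shortfall -> handler fires
quiet = wl.report("availability", 99.95)    # requirement met -> no action
```

The virtualization "surface" mentioned above is what makes this plausible: the container around the application logic is a natural place to attach such a requirements label and its handlers.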
So far, so good. But we really haven't added the unique elements of big data into the mix.
The Information Must Be Able To Describe -- And Verify -- Its Requirements
Imagine, just for a moment, a global health care information repository assembled from dozens of sources around the world. In essence, it carries the summation of all source regulatory requirements as its compliance burden. Add a new information source, add a new compliance requirement.
And we're going to see a lot of this sort of thing going forward.
Contributing sources of the big-data supply chain will need to be able to state their downstream information management and compliance requirements -- and validate back to the local authorities that these requirements are being met.
Consuming data recipients will need corresponding mechanisms to establish information provenance (where did this come from?) but also associated management, compliance and reporting requirements (what do I owe the source?). And, yes, external trust brokers need to play a role here as well.
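The "summation of all source requirements" idea can be sketched very simply: each ingested source adds both to the repository's provenance record (where did this come from?) and to its accumulated compliance obligations (what do I owe the sources?). The source and regulation names below are illustrative stand-ins.

```python
# Illustrative sketch only: a big-data repository whose compliance burden
# is the union of every contributing source's obligations.

class DataSource:
    def __init__(self, name, obligations):
        self.name = name
        self.obligations = set(obligations)

class Repository:
    def __init__(self):
        self.provenance = []       # where did each contribution come from?
        self.obligations = set()   # what do we owe the sources?

    def ingest(self, source):
        self.provenance.append(source.name)
        self.obligations |= source.obligations   # the burden only ever grows

repo = Repository()
repo.ingest(DataSource("us-hospital", {"HIPAA"}))
repo.ingest(DataSource("eu-clinic", {"EU-data-protection"}))
# repo now carries both sets of obligations, and can prove where data came from.
```

Note the asymmetry: obligations accumulate as sources are added, but nothing in this simple model ever removes one -- which is exactly why the external trust brokers mentioned above would matter.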
Does your head hurt yet? Mine does ...
A while back, our own EMC intrapreneur Steve Todd became fascinated with the whole topic of information provenance -- general mechanisms for establishing where things came from. Add a whole bunch of metadata, policies and rules, and I'm beginning to realize why he thought it such an important subject.
A Quick Assessment
I can see us living in a world where infrastructure can efficiently describe and prove its capabilities. The enabling technologies are starting to emerge, but there's a lot of work to do in making it all consumable, and -- importantly -- creating enough of an initial critical mass where standardization efforts can come into play.
I can also see us living in a world where applications can efficiently describe and validate their requirements back to supporting infrastructure. Virtualization is a powerful abstraction that effectively containerizes application logic, which makes it an attractive starting point.
Where it all starts to get fuzzy is envisioning a world where information flows in and out of different use contexts -- many of them big data related -- and where we'll need the same sort of rigorous tracking and verification mechanisms as we perhaps have today with financial flows.
Maybe information and money aren't all that different?