Over the last few centuries, much of mankind's ingenuity has been focused on eliminating the inconvenience of distance.
From sailing ships to modern air transportation; from hand-carried letters to global telepresence -- we spend a lot of time and money to overcome distance in the physical world.
And, in the next few years, there's going to be an intense focus in IT on doing exactly the same thing.
Why Is This Important?
I don't think most casual observers realize just how important distance is when considering IT at a global scale. The speed of light -- and its associated latency -- is not our friend.
Wikipedia tells us that photons and electrons cruise along at roughly 300,000 kilometers per second. That sounds fast, but in the IT world we care about milliseconds, so that's about 300 kilometers per millisecond.
Most information transfer protocols require some sort of round-trip acknowledgment, and there's additional processing along the way, so a handy rule-of-thumb I've heard is that you add about a millisecond of latency for every 100 km of network length.
Doesn't sound like much, but it can add up.
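To see how quickly it adds up, here's a quick sanity check on that rule of thumb in a few lines of Python. The route distances are my own illustrative approximations, not figures from any measurement:

```python
# Rule of thumb from above: ~1 ms of round-trip latency per 100 km of
# network path (propagation delay plus protocol and processing overhead).
MS_PER_100KM = 1.0

def estimated_rtt_ms(distance_km: float) -> float:
    """Estimate round-trip latency in milliseconds for a given path length."""
    return distance_km / 100.0 * MS_PER_100KM

# Illustrative distances (rough great-circle figures; real fiber routes
# are longer, so these are optimistic):
for route, km in [("Same metro", 50),
                  ("New York - London", 5600),
                  ("Sydney - Los Angeles", 12000)]:
    print(f"{route}: ~{estimated_rtt_ms(km):.0f} ms round trip")
```

A "chatty" application that makes dozens of sequential round trips per user action multiplies those numbers accordingly, which is exactly why a trans-Pacific session can feel sluggish even on a fat pipe.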
You can see the effect if you've ever traveled to Singapore or Sydney, and tried to access a web application in North America or Europe. You may have plenty of bandwidth, but the latency can be downright annoying, especially for "chatty" applications.
Many of us believe that the only practical way to generically solve this problem is to figure out how to get the information closer to the user.
When considering IT infrastructure at global scale, the same problem arises.
If you own multiple data centers in multiple time zones, you'd like to be able to pool your resources if possible. It's one class of problem if everything is in the same room; it's another class of problem entirely when considering doing this on a global scale.
Distance can also be your friend when considering physical data centers. The farther apart they are, the more protected you are from various worst-case business continuity scenarios. For very large companies, there's comfort in knowing that you can potentially conduct global business operations from any continent if needed.
Indeed, in a distance-agnostic world, we think about high availability differently, business continuity differently, resource pooling differently, load balancing differently. We take what we've learned around DRS and vMotion clusters and start thinking really big.
Overcoming distance -- in a cost-effective yet performant manner -- dramatically affects how we think about how many data centers we'll need, how big they need to be, where they need to be located, etc.
On a global scale, this affects many billions of dollars of IT infrastructure investment.
Indeed, as we talk about private clouds -- a dynamic, pooled mix of virtualized resources controlled by IT -- we need to start thinking about serious distances, and how we'll overcome the challenges they present.
The Cisco Whitepaper
This post was largely brought on by an excellent piece of work led by Cisco -- and supported by VMware and EMC -- that might have gotten lost in the VMworld avalanche. There was also a popular joint session that Chad participated in at VMworld on this, link here.
Basically, the paper characterizes what happens as you "stretch" the various networks to moderate (i.e. less than 200 km) distances.
Even though you'll see EMC storage in the configuration, very little of our storage functionality was being used -- all the "heavy lifting" here was being done by Cisco's network -- a 622Mbps link stretched to 200km.
They talk about three approaches: (1) "shared storage", i.e. the application moves, the storage doesn't; (2) "moved storage", i.e. the storage moves first, then the application; and (3) "active-active" storage, i.e. a dynamic combination of both.
More on item #3 in a bit.
They do a good job of characterizing latencies and resulting application-level performance as the distance increases. They also give you a good sense of how long it takes to move rather large storage objects around using vSphere as the "mover".
The good news was that application degradation for "shared storage" was much less than I would have expected, and the time it took to move reasonably large databases for "moved storage" was quite tolerable.
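To get a feel for "moved storage" timescales, a back-of-the-envelope calculation helps. The 622 Mbps link figure comes from the configuration described above; the database size and the link-efficiency factor are my own assumptions for illustration:

```python
def transfer_time_seconds(size_gb: float, link_mbps: float,
                          efficiency: float = 0.8) -> float:
    """Time to move size_gb (decimal GB) over a link_mbps link,
    derated by an assumed effective-throughput factor."""
    bits = size_gb * 8 * 1000**3          # decimal gigabytes to bits
    return bits / (link_mbps * 1000**2 * efficiency)

# A hypothetical 100 GB database over the 622 Mbps link, assuming
# 80% effective throughput after protocol overhead:
secs = transfer_time_seconds(100, 622)
print(f"~{secs / 60:.0f} minutes")
```

Under those assumptions you're looking at tens of minutes for a mid-sized database -- fine for a planned migration, but a number that scales linearly with data size and inversely with whatever bandwidth you can actually afford between sites.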
I would credit Cisco's networking prowess for this one, especially their I/O Acceleration (IOA) feature, which I think of as MPIO (PowerPath) for long-distance I/O pipes.
The sobering news was that the application workload was rather modest, the pipe was substantial and the distances only a small fraction of what we'll need going forward.
This is not meant to take away from the fine and very useful work done by all involved, it just shows the magnitude of the next mountain we're going to have to climb.
So where does that leave us?
Towards Information Logistics
The trick here is going to be all about pre-positioning the right amount of data in the right location at the right time -- and at the right cost.
If you've ever been involved with studying global spare parts logistical networks, it's roughly a similar problem in terms of complexity. Put too much stuff at the endpoints, it's expensive and out-of-date. Put too much stuff centrally, and service delivery experiences suffer -- not to mention spending a fortune on air transport.
Put the right stuff in the right place at the right time, and -- you win!! -- for the moment. It's never a static solution.
Stepping back into the world of IT architecture, I believe this "global information logistics" discussion will be very popular in a year or two. It's inevitable from where I sit.
You can see part of this thinking already in the marketplace from EMC today, e.g. EMC Atmos storage platform.
It uses a very flexible policy mechanism that allows relatively static information to be dynamically repositioned globally based on:
- explicit service delivery intent (e.g. "I want to ensure a good experience at all times" using multiple remote copies),
- rapid shifts in demand (e.g. "this particular piece of information is getting very popular" so make more remote copies),
- redundancy (e.g. "never want to lose this" so make redundant copies),
- or cost objectives (e.g. "no one cares much about this piece of information", so compress it and spin the drives down).
What the Atmos storage platform enables is simple: a straightforward expression of "information logistics policy" that balances service delivery, cost and resiliency.
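To make the idea concrete, here's a sketch of what a declarative placement policy might look like. To be clear: this is NOT the actual Atmos policy language -- the names, fields, and selection logic are all hypothetical, meant only to illustrate how service-delivery, redundancy, and cost objectives could be expressed as rules:

```python
# Hypothetical illustration of an "information logistics policy" --
# not any vendor's real API. Each policy bundles the objectives from
# the list above: service delivery, demand, redundancy, and cost.
from dataclasses import dataclass

@dataclass
class PlacementPolicy:
    name: str
    min_copies: int             # redundancy objective
    replicate_on_demand: bool   # add remote copies when access spikes
    compress_when_cold: bool    # cost objective for unpopular data
    spin_down_when_cold: bool   # further cost savings for idle data

POLICIES = {
    "always-fast":   PlacementPolicy("always-fast", 3, True, False, False),
    "never-lose":    PlacementPolicy("never-lose", 4, False, True, False),
    "cheap-archive": PlacementPolicy("cheap-archive", 2, False, True, True),
}

def pick_policy(popularity: float, criticality: float) -> PlacementPolicy:
    """Crude illustrative selector: map object attributes (0.0-1.0
    scores, also hypothetical) to one of the policies above."""
    if criticality > 0.8:
        return POLICIES["never-lose"]
    if popularity > 0.5:
        return POLICIES["always-fast"]
    return POLICIES["cheap-archive"]
```

The interesting design point is that the administrator expresses intent once, per class of data, and the platform continuously re-evaluates placement as popularity and cost conditions shift -- rather than anyone hand-placing individual objects.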
Which leaves us with two open questions: how do we do this also for "hot" data (e.g. a database) and how do we do this without having to get too granular in setting explicit policies?
Serious room for innovation here, I'd offer.
Painting The Big Picture
As our businesses learned to operate globally, we learned how to harness the power of global workforces to create economic and competitive advantage.
We learned to put the right work in the right place at the right time -- and create a global pool of human talent to power our businesses.
Will IT be any different? For those of us who operate at a global scale, will we learn to harness the power of global IT resources to create competitive and economic advantage?
Will we learn to put the right work in the right place at the right time -- and create a "global data center" to power our businesses?
In both cases, we had to get very good at overcoming distance.