Today, I thought I’d go behind the scenes, and interview one of the primary architects behind the Vblock – Jim Dowson (EMC Distinguished Engineer & CTO, Global Services).
I had the pleasure of hearing Jim and Phil Harris of Cisco (CTO, Strategic Alliances) present the Vblock in detail to one of our partners, and wanted to share some of their insights with you.
Jim, what problem(s) were you trying to solve with the Vblock?
Customers have told us that they believe that they can reduce the cost of computing through virtualization, but they are running into obstacles that prevent them from accelerating their deployment. We wanted to remove those obstacles so that they can fully benefit from the efficiencies that are possible, and we didn’t want them to ‘Nuke & Pave’ their environments in order to get those efficiencies.
It turns out that ‘Less is the new More’, because by simplifying the architecture and constraining the solution, you can drive out a lot of the complexity and cost.
It’s interesting to note that we were taking this approach to support a lower-cost delivery model for ourselves, for ‘Alpine’ (now Acadia). We needed to be able to provide a price-competitive managed service in a highly competitive market – so we built the Vblock for ourselves.
There were a number of partners (SPs, SIs and ISVs) that also wanted a best-of-breed alternative to solutions from other vendors, so we got quite a bit of support and endorsement for the idea.
So, why do you think most customers can’t do this sort of architecture work on their own?
While many can do the work, it’s costly in capital and Non-Recurring Engineering (NRE) expense. There is also an ongoing cost to maintain a roadmap for the architecture. For many, the budget for this development work has been significantly cut, or eliminated altogether.
Customers have told us that they do not see a significant opportunity to competitively differentiate themselves by spending resources experimenting with what have become ‘standard’ components: virtualized x86 systems. They want to get out of IT ‘plumbing’, because that will allow them to focus on things that do bring value to their business, like new application deployment.
It’s also important to note that it’s not just about the technology. You cannot run a virtualized Next Generation Data Center (NGDC) using the same processes that were used in a physical data center – and few customers would have the time or resources to develop those practices.
You have to address the people, process and technology issues in order to get the full benefits.
I remember a great discussion you were having about variable scaling vs. shared resources – can you replay that for us?
We were reviewing two different ways to approach a data center design (both have merits).
The first is central shared infrastructure, which follows the philosophy of “put all your eggs in one basket, and then watch that basket closely” (Andrew Carnegie). This approach requires high resiliency of shared resources, because the failure of a shared resource would lead to a loss of service for dependent consumers of that resource.
The other design option is ‘block step and repeat’. In this approach, there are more (identical) ‘baskets’, and you achieve resiliency by moving a workload to a different ‘basket’ or ‘block’. This makes more productive assets available at a lower cost.
The ‘block’ approach has advantages that include:
• isolation (fault, performance and change impact)
• re-use of common design, with well known operating characteristics and failure modes (better supportability)
• better differentiated service levels, while still providing increased utilization through ‘pooling’
• better operational practices that lead to better workload balancing and recoverability
• lower costs by deferring capital purchases to when they are required (Moore’s Law works)
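Purely as an illustration of that failover-by-movement idea, here is a minimal sketch in Python. The block names, sizes, placement policy and capacity units below are invented for the example – this is not Vblock tooling, just the shape of the argument: keep each identical block partially loaded, and when one fails, evacuate its workloads to the survivors.

```python
# A minimal sketch (illustrative only) of 'block step and repeat':
# resiliency comes from moving workloads off a failed block onto the
# remaining identical blocks, not from hardening one shared pool.

from dataclasses import dataclass, field

@dataclass
class Block:
    name: str
    capacity: int                      # abstract capacity units per block
    workloads: dict = field(default_factory=dict)

    def used(self) -> int:
        return sum(self.workloads.values())

    def free(self) -> int:
        return self.capacity - self.used()

def evacuate(failed: Block, survivors: list[Block]) -> None:
    """Move every workload off the failed block to the emptiest survivor."""
    for wl, size in list(failed.workloads.items()):
        target = max(survivors, key=Block.free)
        if target.free() < size:
            raise RuntimeError(f"no headroom left for {wl}")
        target.workloads[wl] = size
        del failed.workloads[wl]
        print(f"moved {wl} ({size}u) from {failed.name} to {target.name}")

# Three identical blocks, each running at ~60% so any one can be evacuated.
blocks = [Block("vblock-1", 100, {"erp": 30, "mail": 30}),
          Block("vblock-2", 100, {"web": 40, "bi": 20}),
          Block("vblock-3", 100, {"dev": 25, "test": 25})]

evacuate(blocks[0], blocks[1:])        # simulate losing vblock-1
```

The design choice the sketch reflects is that each block is deliberately run with headroom, so the ‘watch that basket closely’ burden is replaced by a simple, repeatable evacuation procedure.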
Where do you typically see homegrown architecture-at-scale efforts go wrong?
The challenge is to define units of architecture that are large enough to handle a cost-optimized amount of work. This means that you must design the environment so that all of the resources can be maximally used.
The next challenge is to minimize the number of variations – you need fewer kinds of blocks, with larger pools of those fewer kinds in order to maximize utilization across the whole population.
If you have an environment that is ‘an inventory of one’, you can’t do anything to improve the utilization of that environment – and you’re stuck when it comes to elasticity (workload balancing and dynamic provisioning).
By trying to over-design a single environment, you’re missing the opportunity to optimize your whole data center.
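A back-of-envelope way to see the pooling effect, using made-up demand numbers (this is illustrative arithmetic, not a real sizing model): capacity sized to the peak of the combined demand across a large pool is much smaller than the sum of each ‘inventory of one’ environment sized to its own peak.

```python
# Illustrative comparison: many siloed environments each sized to their own
# peak vs. one pooled population sized to the peak of the combined demand.

import random
random.seed(1)

HOURS = 24 * 30
apps = 12            # independent workloads with fluctuating demand
demands = [[random.gauss(40, 12) for _ in range(HOURS)] for _ in range(apps)]

# "Inventory of one": each app gets its own environment sized to its own peak.
siloed_capacity = sum(max(series) for series in demands)

# Pooled blocks: one population sized to the peak of the *combined* demand.
combined = [sum(hour) for hour in zip(*demands)]
pooled_capacity = max(combined)

print(f"siloed capacity needed   : {siloed_capacity:8.0f} units")
print(f"pooled capacity needed   : {pooled_capacity:8.0f} units")
print(f"capacity saved by pooling: {1 - pooled_capacity / siloed_capacity:.0%}")
```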
What’s the rationale behind different Vblock models? Wouldn’t it be ideal if a small one could scale into a big one?
Vblocks are primarily differentiated by the class of services that they provide (their service catalog), and the ‘step size’ is defined more by matching the components to a given workload at a target cost. There’s no point in having an out-of-balance system with too much of one resource relative to the others – you’ll actually make things worse.
You grow the environment by adding more identical blocks that are wired once, and migrating workload across Vblocks. This approach is less disruptive to data center operations.
Were there any aspects of the new technology that made a big difference in the results? For example, converged Ethernet, or perhaps the UCS memory architecture?
FCoE eliminates a significant number (and cost) of the components – cables, NICs, ports, fans, power supplies – used to provide connectivity from a blade to the rest of the environment, and you get better utilization of the remaining components through oversubscription. It’s a virtuous cycle, because that also leads to less power, heat and space.
Cisco’s ability to support as much as 4x the amount of commodity DDR3 RAM on a blade will also allow a greater number of virtual machines to run on a single blade, since virtualization environments tend to be memory-bound rather than CPU-bound.
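To see the memory-bound point as back-of-envelope arithmetic (the RAM sizes, per-VM footprint and consolidation ratio below are illustrative assumptions, not UCS specifications): VM density is the smaller of what memory allows and what CPU allows, and for typical blades memory runs out first, so extra RAM translates almost directly into more VMs per blade.

```python
# Illustrative consolidation math: VM count per blade is capped by
# whichever runs out first, memory or CPU. All figures are assumptions.

def vms_per_blade(blade_ram_gb, blade_cores, vm_ram_gb=4, vms_per_core=8):
    by_memory = blade_ram_gb // vm_ram_gb      # VMs the RAM can hold
    by_cpu = blade_cores * vms_per_core        # VMs the cores can drive
    limit = "memory" if by_memory < by_cpu else "CPU"
    return min(by_memory, by_cpu), limit

for ram in (96, 192, 384):                     # baseline vs. larger-memory blades
    count, limit = vms_per_blade(ram, blade_cores=8)
    print(f"{ram:3d} GB blade -> {count:3d} VMs ({limit}-bound)")
```

With these example numbers, the 96 GB and 192 GB blades are memory-bound, and only at 384 GB does the CPU become the limiting resource – which is also the balance point the earlier ‘don’t over-provision one resource’ comment was getting at.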
I know that there are different schools of thought around laying racks out, airflow, cooling, etc. – what design philosophy did you go with for Vblock, and why?
The nature of the UCS design (front-to-back airflow) reduces power consumption and increases component reliability.
Each Vblock had to be able to stand on its own, with well defined interfaces to the aggregation layer that minimized cabling and dependencies.
You will also see several deployment options, including ‘containerized’ Vblock configurations.
Any other thoughts you’d like to share?
Vblock isn’t a single static thing, it’s a design approach that supports different service catalogs. For example, you will see Vblocks that are optimized for file services, including automated policy-based object level tiering (e.g. EFD, low power SATA or Atmos).
By building more of our intellectual property around Vblocks, we will enable customers to deploy applications more easily without having to do extended science projects to develop infrastructures to support their applications.
There are also significant opportunities to move away from physical appliances to vApps that can be more easily deployed to Vblocks, further increasing flexibility and decreasing costs.
It’s a pretty exciting time in the industry ...