Today's announcement is especially gratifying, as one of my team leaders (Bill Bonin) has been working on it diligently for quite a while.
It's finally here.
EMC today announced that its new Competency Center for data warehouse, business intelligence and analytics applications is open for business.
Working with almost all the leading software vendors, we've assembled an impressive concentration of technical talent in one place for the benefit of our customers and partners.
So, What's This All About?
There's an interesting statistic floating around that something like 20% of all enterprise storage is used in DW/BI environments. On that basis alone, it shouldn't be a surprise that EMC would take a serious interest in this topic.
More to the point, our customers tell us they're having more and more challenges in these environments.
Sometimes the challenge is simple proliferation -- lots of different DW/BI applications, each with its own ad-hoc infrastructure -- and they'd like to rationalize how these apps are supported, protected, etc. using a consistent approach.
Other times, there's one big app that's having growing pains. Sure, there's always more data and more users, but occasionally a data warehouse grows from simply being a nice-to-have decision support application into an operational part of the business -- and the game changes.
Why Is EMC Involved?
To some, DW/BI is all about writing beautiful optimized queries that make the disks dance in perfect harmony. And, yes, that can be an important topic.
But, from an IT architect's point of view, there's a whole lot more going on than just optimizing SQL.
We're talking serious infrastructure here in many cases: lots of servers, and gobs of storage. And as DW/BI apps have grown, the storage component is coming more and more into play in terms of determining overall performance, flexibility, availability and -- of course -- cost.
Storage design in these environments can be an interesting intellectual exercise, since there are so many variables in play.
For example, usually there's no "one smokin' query" to go consider -- the reality is usually hundreds of ad-hoc requests, some optimized, some not, all hammering the data store. Designing for optimal performance isn't always about maximizing sequential access. Unoptimized queries can hit the data store with random reads, radically changing the performance profile.
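To put rough numbers on that last point, here's a back-of-the-envelope sketch. The per-disk figures (80 MB/s sequential, 180 random IOPS, 8 KB reads) and the 2,000 MB/s target are illustrative assumptions on my part, not measurements from any particular array.

```python
import math

# Illustrative assumptions only -- not measured values.
SEQ_MB_PER_S = 80.0   # assumed sequential throughput of one disk
RANDOM_IOPS = 180     # assumed small random reads per second, one disk
IO_SIZE_KB = 8        # assumed size of each random read


def random_read_mb_per_s(iops: float, io_size_kb: float) -> float:
    """Effective MB/s of one disk when it serves small random reads."""
    return iops * io_size_kb / 1024.0


def spindles_needed(target_mb_per_s: float, per_disk_mb_per_s: float) -> int:
    """Number of disks required to reach a target aggregate throughput."""
    return math.ceil(target_mb_per_s / per_disk_mb_per_s)


if __name__ == "__main__":
    target = 2000.0  # hypothetical aggregate throughput goal, MB/s
    rand = random_read_mb_per_s(RANDOM_IOPS, IO_SIZE_KB)
    print(f"random-read throughput per disk: {rand:.2f} MB/s")
    print(f"disks needed, sequential: {spindles_needed(target, SEQ_MB_PER_S)}")
    print(f"disks needed, random:     {spindles_needed(target, rand)}")
```

Under those assumptions the same throughput target needs a couple of dozen spindles for sequential scans and well over a thousand for purely random reads, which is why the query mix matters so much to the storage design.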
Many operational DW/BI environments are now continually updated to provide near-real-time results to queries, meaning write performance now becomes a very interesting topic.
Environments with very large numbers of spindles may also have to figure in rebuild times for failed disks (they do fail occasionally, you know ...) and consider what the resulting performance impact might look like.
More and more DW/BI environments are being designed as HA environments -- with varying degrees of redundancy and failover throughout -- up to and including a remote failover site.
And, oh yes, we've got an entirely new magic ingredient to play with -- enterprise flash drives. While usually impractical for the primary data store, they can do amazing performance magic when applied against the temporary caches used by the DW/BI application.
Oh Yes, There's More ...
Then there's the operational aspect of all this -- how is it all backed up and recovered? Waiting days or weeks for a 100TB environment to be recreated from tape or source data isn't usually an option for most businesses.
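A quick sanity check on the "days" part, as a sketch: the 400 MB/s sustained aggregate restore rate below is purely an assumption for illustration (your tape or disk environment will differ), and the function name is mine.

```python
def restore_hours(capacity_tb: float, throughput_mb_per_s: float) -> float:
    """Hours needed to move capacity_tb at a sustained rate (decimal units)."""
    megabytes = capacity_tb * 1_000_000  # 1 TB = 1,000,000 MB
    seconds = megabytes / throughput_mb_per_s
    return seconds / 3600.0


if __name__ == "__main__":
    # Assumed: 100 TB environment, 400 MB/s sustained aggregate restore rate.
    hours = restore_hours(100, 400)
    print(f"{hours:.1f} hours, or {hours / 24:.1f} days")
```

With those assumed numbers, the raw data movement alone is just under three days, before counting tape handling, index rebuilds or validation.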
What about development, testing and staging of larger DW/BI applications? Things like snaps and replication come into play. And, yes, we've got more than a few customers doing remote DR for their DW/BI environments.
Stepping back from the application itself, all DW/BI environments produce mountains of downstream data -- lots of query results, analysis, data cubes, reports, and more. Sometimes these downstream environments can be much larger than the DW/BI that creates them all.
Think big NAS environments, enterprise content management and workflow, compliant archiving, security and the rest of EMC's information management disciplines. There's more we can do there as well.
Including teaching people to use existing query results, rather than hammering the DW environment each and every time they have a damn question :-)
Take any industrial-class DW/BI environment, factor in the entire landscape, and we're usually talking about a healthy amount of storage, server and software infrastructure.
That gets people's attention.
Over the last few years, I've been amazed at how many new software players are coming into this application space.
Sure, we're working with industry standard players like Oracle, IBM and Microsoft. We're also starting to do more work with the second wave of appliance vendors like Teradata and Netezza.
Most interesting to me is the newest wave of software-only players who can take a relatively standard scale-out server/storage environment and do some amazing things: vendors like Greenplum, Vertica, DATAllegro and ParAccel.
The other heated debate in the industry is "dedicated appliance vs. standardized infrastructure."
Some will argue that a customized and bespoke all-in-one environment is best for DW/BI applications. EMC's point of view is that intelligent and optimized use of standardized infrastructure can deliver similar -- or sometimes better -- results, and deliver an operational environment that works pretty much the same as the rest of the landscape.
One of the useful aspects of our new Competency Center is that all of these approaches can be evaluated -- side-by-side if needed -- using a relatively standard server/storage infrastructure environment as a starting point.
So, What's At The EMC DW/BI Competency Center?
We've built this environment at our Santa Clara executive briefing facility. It's a very nice facility. And as EMC deploys more of Cisco's Telepresence, we'll be able to tie in more and more locations remotely.
For starters, EMC's DW/BI engineering team is based in the same location. We've got a sizable group of people that do nothing but qualify and optimize different storage architectures for different DW/BI application scenarios. These folks are right there, and can help out in a variety of ways.
Of course, we've got a sizable tech lab and demo environment set up -- software, servers, and plenty of storage.
All the other EMC products (backup, replication, content management, file servers, security, archiving, etc.) that are usually needed to build out a complete operational infrastructure for larger DW/BI environments are on hand as well.
I think the real value of this investment is that we can help customers sort out their options quickly and efficiently, and end up with an overall architectural approach that confidently meets their needs today and for years to come.
What Lies Ahead?
There's just so much interesting stuff we want to do in this space. I'm sure we'll be very busy for quite a while.
For example, there are just a few more vendors we're trying to get into the Competency Center. We've got a surprising majority of them, but there are a few more key ones we'd love to have.
We've just started figuring out what enterprise flash might bring to this party -- where to use it best, and what the impact might be.
We've got a variety of storage interconnects to continue to explore and characterize -- traditional NAS, iSCSI and FC, as well as newer FCoE and MPFS (Multi-Path File System).
And then there's the extremely interesting prospect of running DW/BI environments under VMware. We've already established that there's no disk I/O tax using VMware, and in some cases we get better I/O results.
More intriguing is the ability for VMware to take modern four-socket, six-core large-memory server designs, and partition them into virtualized chunks that DW/BI software can comfortably exploit, with the potential of delivering substantially more aggregate server performance in a given server environment.
Add in the cool operational characteristics like DRS load balancing, transparent HA and the rest of the goodies in a virtualized environment (like running desktop analysis applications using a VDI approach!), and it's just too intriguing to resist exploring.
I'm sure we'll have lots more to say in the not-too-distant future.
If you're doing DW/BI in your environment, and there's growing pressure to take things to the next level, we have a proposition for you and your team.
You're free to access the substantial body of work we've already published. Some of this is on the public website, but much more of it can be found on Powerlink.
We've got a cadre of knowledgeable people we can get on the phone, if you'd like, and talk through your specific concerns and questions.
But, if you can swing a trip out west, I'd invite you to take direct advantage of EMC's newest investment in this domain, and visit our new Competency Center soon -- hope to see you there!