Many people are now aware that big data analytics is quickly becoming the next competitive ante: a way to improve existing business processes, create new ones -- even a foundation for entering entirely new businesses.
The race is now on to acquire -- and maximize the productivity of -- the key talent behind this wave: data scientists and their supporting data science teams.
At EMC, we've been working hard to understand who these people are, what makes them different, how they work -- and what they think is important.
Today, I'm pleased to share with all of you a few key highlights from a recent survey of data scientists and traditional BI analysts.
The work was conducted jointly with StrategyOne to get an "inside-out" view of how these people see the world, and -- especially -- how these data science folks are markedly different from the BI analysts that we're more familiar with.
Lots of useful nuggets here ...
Business and IT leaders realize that developing big data analytics proficiency could mean a lot to their business. There's not a lot of focus on "making the case"; it's more about "where can we get started?" and "what use case should we target first?".
These same business leaders are now inevitably curious about the new skill set: data scientists and the data science they practice.
The reference point for many is comparing and contrasting these new individuals with the more familiar business analysts that are scattered across most corporate landscapes. This isn't a judgment for or against business intelligence analysts per se, just a familiar starting point for exposing the key differences.
The survey is reasonable in methodology, as far as such studies go: decent reach and depth enough to draw some interesting conclusions.
This is not exactly the first data point confirming this; but it's interesting to note that practitioners acknowledge this as well.
The "technology enablement" side is a rather new perspective: 83% of respondents think that the availability of new technologies will increase demand for the knowledgeable individuals who can harness their potential.
And, soberingly, only a small proportion believe that current BI professionals can fill this gap.
There is clear agreement that "data science" (however you choose to define it) is a fundamentally different profession, with a different profile from the BI analysts that came before it. Data scientists are more likely to have advanced degrees, frequently have a background in the sciences (vs. business), and they interact with data in more ways -- and using different tools.
I feel a bit justified in my honest assessment: data science is not your father's BI.
Finally, it's clear that data scientists are essentially "data experimenters" rather than rote analysts -- and they're likely to be interacting with IT functions in far more positive ways than the norm.
What Else Is Holding These People Back?
Note the responses. Some are familiar -- things like more budget and resources.
But consider that a third of practitioners will point to needed skills and expertise outside their own function -- a "general proficiency" requirement we're trying to address with new EMC educational offerings such as this.
I see "wrong organizational structure" and "insufficient executive support" as two sides of the same coin: about a third of practitioners don't feel their company is organized for success. Of course, that probably has something to do with both the "lack of resources" and "lack of broad-based skills" observations.
Like anything else meaningful, you've got to organize for success.
I've made a bit of a practice around sharing with interested parties the patterns we've found in how proficient data science teams are organized (internally and externally aligned) as well as the interesting journeys of how these proficient functions came to be.
Data Scientists: Better Educated -- And Broader
The first juxtaposes the educational profile of the BI analyst and that of the self-identified data science professional. Note the bias towards advanced degrees with the latter.
This corroborates my own experience -- I recently met one fascinating gentleman who had three PhDs in seemingly unrelated fields. You'll always find a strong sense of intellectual curiosity coupled with "show me the data" skepticism in this crowd.
Perhaps more interesting is the educational profile behind the advanced degrees -- data science professionals are twice as likely to have come from an analytically-intensive scientific field vs. the normal business background of the BI analyst.
In my mind, the "science" part of data science is abundantly clear: many of these people are scientists in their own right, and are quite comfortable applying data-driven scientific methods to different fields of pursuit -- like understanding consumer behavior.
Data Scientists Touch Data Across The Entire Data Lifecycle
While many traditional BI analysts are simply functionaries in a larger information-gathering-and-analysis supply chain, the precise opposite seems to be true with data science professionals.
As you can see here, they're involved in everything from sourcing new data sets (usually from outside the company!) to telling data-driven stories to business stakeholders with the intent of positive change.
This realization inevitably leads to a discussion around tools and platforms that help them do all of this using a single set of tools integrated around their particular workflows (stay tuned for more on this very soon!).
If (a) data scientists are scarce, valuable talent, and (b) they have incredibly broad reach across the data lifecycle, then it logically makes sense to invest in tools and platforms that help them do what they do better -- and with less effort.
That's the way we're looking at it, anyway.
... And Work Across The Organization As Well
In many situations, data scientists find themselves working across the entire organization (in addition to other data scientists, of course).
Look at some of the roles they say they work with frequently: graphic designers, HR professionals, marketing, sales, etc. -- clearly not just technological professions.
One of the questions I often get asked is "is this an IT function?".
While IT clearly has a role to play in enabling data science, this slide makes it pretty clear that data science is a business function, and not a functional support role.
Additionally, when we consider traditional BI analysts, many of them are embedded in one or another functional group: sales, manufacturing, marketing, etc.
This is clearly a different beast.
What Are They Interested In Learning More About?
When these data science professionals were asked "what would you like to learn more about?" the top two answers were data storage and cloud computing.
While I'm obviously pleased that the survey points to two of EMC's core strengths, it took me a moment to realize that raw resources are very much on these people's minds: more storage, more compute.
Perhaps that has something to do with the "we need more resources" observation above :)
And Where Do They Like To Work?
The overwhelming majority of data science respondents prefer to work in ostensibly smaller settings. Think focused teams, a collegial work environment, an organization that's easy to navigate, and so on.
When you consider that most enterprises that could benefit enormously from applying predictive data science techniques are somewhat larger in size than the buckets given here, you're immediately struck by an interesting management challenge.
The structure, isolation and inflexibility of most corporate environments appear to be something they're not warmly embracing ...
How do you create a smaller, well-resourced setting in a much larger environment that will attract the key talent you'll need?
And that -- above all else -- will probably end up being the magic key to learning to compete through predictive analytics.