I've recently seen a lot of questions and discussions about the emerging role of the Data Scientist: who they are; what they do; how or why they are different from data analysts; and why they are critical to the success of Big Data in an organization.
From some of these discussions, it often seems that you cannot hope to get any value from Big Data unless you have not just one Data Scientist but a whole team of them at work. There is a lot that the Data Scientist brings to the table: scientific methodology, higher-level statistical analysis, mathematical modeling. But reading through these discussions, I've often had the feeling that some elements were missing.
So I was very pleased to see the article "Big Insights from Big Data Require the Right Data Science Team" from Information Management discussing the insights of Booz Allen Hamilton's consulting team and pointing to their Data Science infographic. The article notes: "The right data science teams blend the technical expertise of computer scientists and mathematicians and statisticians with a critically-important, but overlooked, element—domain knowledge." This lines up with my experience and what I am seeing emerging in the industry.
The Data Scientist brings in these new skill sets, which are particularly useful as you extend the range of the data in use beyond the organizational basics. They address the "what" in the equation: what steps need to be taken, what models are applicable, what the results might indicate. They bring scientific rigor to Big Data. However, as the article notes, theirs is not the only role involved.
The Information Architect, a role that could be filled by anyone from a data analyst in some organizations to a data integration specialist to the computer scientist noted in the Booz Allen Hamilton infographic, represents the skill set needed to make Big Data happen. They address the "how" in the equation. Where you have diverse sets of data with different structures (relational, unstructured, semi-structured), different data types and formats, or even different timing intervals, you need the skills to figure out how to put the data together in a meaningful and useful way. This person can understand data models, pull data from traditional sources, work with Hadoop, utilize ETL tools, and put reports together.
And the Domain Expert, whether a business analyst, a subject matter expert, or the data steward, brings the business insight needed to identify areas of impact, raise considerations about the business and the data, and provide a check on the conclusions drawn from the results. These experts have seen the data used in the business processes and they know their industry. They understand when data looks 'right' and when it does not.
The ultimate value, though, is in the blend of the skill sets. The infographic comments that "the ability to fuse disparate, seemingly unrelated data— like financial transaction information, payment records and exchange rates— can produce an entirely new level of insight and direction."
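To make that kind of fusion a little more concrete, here is a minimal Python sketch of combining the three kinds of data the infographic mentions. Everything in it is an assumption made for illustration: the sample records, the field names, the exchange rates, and the `outstanding_in_usd` function are all hypothetical and do not come from the article or the infographic.

```python
from collections import defaultdict
from datetime import date

# Hypothetical sample data, invented purely for illustration.
transactions = [
    {"txn_id": "T1", "day": date(2013, 3, 1), "currency": "EUR", "amount": 100.0},
    {"txn_id": "T2", "day": date(2013, 3, 1), "currency": "GBP", "amount": 50.0},
    {"txn_id": "T3", "day": date(2013, 3, 2), "currency": "EUR", "amount": 200.0},
]
payments = [
    {"txn_id": "T1", "paid": 100.0},
    {"txn_id": "T2", "paid": 20.0},
    {"txn_id": "T2", "paid": 10.0},
]
# (day, currency) -> US dollars per unit of that currency (made-up rates).
usd_rates = {
    (date(2013, 3, 1), "EUR"): 1.25,
    (date(2013, 3, 1), "GBP"): 1.5,
    (date(2013, 3, 2), "EUR"): 1.5,
}


def outstanding_in_usd(transactions, payments, usd_rates):
    """Return the unpaid balance of each transaction, converted to USD."""
    # Total up the payments recorded against each transaction.
    paid = defaultdict(float)
    for payment in payments:
        paid[payment["txn_id"]] += payment["paid"]

    balances = {}
    for txn in transactions:
        # Look up the rate for the transaction's day and currency.
        # A missing rate raises KeyError rather than guessing a value.
        rate = usd_rates[(txn["day"], txn["currency"])]
        balances[txn["txn_id"]] = (txn["amount"] - paid[txn["txn_id"]]) * rate
    return balances


print(outstanding_in_usd(transactions, payments, usd_rates))
```

The mechanics of joining the records are the "how"; deciding which balances matter and whether they look right still takes the "what" and the domain knowledge described above.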
This is the value proposition that Big Data can enable, but as with most initiatives, it comes back to the old equation of people, process, and technology, with the team of people providing the right Data Science stuff.
As always, the postings on this site are my own and don't necessarily represent IBM's positions, strategies or opinions.