Data Science

Building a rock star Data Science team: My first year with the IBM Data Science Elite Team


It’s been a year since we kicked off the IBM Data Science Elite Team. This group of expert consultants in the field of data science and machine learning (ML) was launched with the goal of helping kick-start our clients’ journey to AI. In fact, we are the first in the industry to provide no-cost data science consultation services for clients as part of our overall efforts to help fill a critical need for these skills.

 

I was the first to join the IBM Data Science Elite Team — and now we are 75 strong and growing to 100. Before this role, I led part of IBM’s Data Science go-to-market efforts focused on data science tools. Before joining IBM, I spent most of my career building expertise around leading teams of highly skilled engineers who approached business problems in new ways — leveraging the latest technology and trends. That’s why, when IBM decided to build a team of AI experts, I immediately knew I could really go beyond the technology to help clients win with AI.

Today, our team helps clients around the world on their AI journey. Our focus? Skills, Process and Tools — all the elements that go into a high-impact data science practice.

Building the right set of data science skills

First, we start with skills by building a team of AI engineers ready to roll up their sleeves. One of the challenges when it comes to finding the right talent is that there is no “one size fits all” type of data science professional. In fact, many struggle to identify with the title because it’s so broad. Scientists in the field of data research have existed for a long time, and they have used many approaches to data mining and extracting information to help companies make better business decisions. However, decisions are usually made after the fact, because technology requires data gathering, mining and human evaluation, and this all takes valuable time. In the Information Age, businesses need to make decisions as they happen, in real time — which is where AI and ML come in.

With this in mind, we’ve been stacking our team with ML engineers who have a specific set of skills. Applicants have to be coders with deep expertise in computer science, ML and deep learning approaches. And of course, they have to be open source enthusiasts, meaning they are not tied to a specific platform or solution. We look at candidates’ GitHub profiles to see how broad their experience is, for example whether they have worked with multiple libraries that take similar approaches (e.g., TensorFlow, Keras, Theano). Lastly, we look for experience with distributed processing frameworks, a.k.a. big data. With the explosion of data and digital applications, the data science problems of today are monumental. We’re not looking for a Hadoop admin, but rather a practitioner of open distributed frameworks such as Apache Spark or TensorFlow.

Putting AI to work: Creating the process

Next, we create a process for quick AI prototyping on business problems. We call this our AI kick-start approach. The key to the process is speed to value, because on a large project the business requirements have often changed by the time the work is complete. The key to business innovation is approaching problems in new ways. That is why we created the kick-start program to make it simple and fast.

We start the program with a workshop to discover the AI use case that will deliver the biggest value. This is a full day with all stakeholders, and we focus first on the business problem and then on the data available to address it. From there, we focus our scope, define our sprints, document everything and proceed with the project. It’s important to start small but focus on the biggest-bang area where you can show value in a short period of time. The goal is not a production-ready system, but a working prototype that can be carried forward into production.

An open tooling approach

Lastly, we provide the team with all the tools that allow them to build a prototype quickly. Of course, we leverage the IBM Watson Studio platform, which has all the latest open frameworks available as well as some key features to move quickly from discovery to deployment. This coding-optional platform with its broad range of open source tools lends itself particularly well to our clients and their varying data science skill sets.

Over the year, we started tackling client problems, moving AI from concepts on PowerPoint decks to reality. A common use case we are seeing is leveraging AI for customer personalization. Why? Every customer wants one-on-one attention, and AI will always outperform humans at this task.

For example, we recently embarked on a project with a large hotel chain. This client was looking to understand their customer journey online to help personalize the experience and drive more online hotel bookings. The project involved identifying customers who visit the hotel website but leave before they finish a booking. There were many challenges to personalizing the customers’ experience accurately. First, we had more than 100 data points (features) that needed cleaning, formatting and evaluating for feature importance. We ended up focusing on 39 relevant features that had the biggest impact on whether a customer would book (e.g., length of visit, loyalty level, and number of searches).
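The post does not say how feature importance was evaluated, so the following is only a minimal sketch of one common way to narrow roughly 100 candidate features down to 39. It assumes scikit-learn, uses synthetic data in place of the hotel clickstream, and ranks features by a random forest's impurity-based importances. The feature names and the choice of ranking method are illustrative assumptions, not the team's actual procedure.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Synthetic stand-in for the cleaned visitor table: 100 candidate features,
# of which only a handful actually carry signal about the booking outcome.
X, y = make_classification(
    n_samples=1000,
    n_features=100,
    n_informative=10,
    n_redundant=0,
    random_state=0,
)
# Hypothetical column names; the real ones (length of visit, loyalty level,
# number of searches, ...) are not available here.
feature_names = [f"feature_{i}" for i in range(X.shape[1])]

# Fit a forest and read its impurity-based importance scores.
forest = RandomForestClassifier(n_estimators=100, random_state=0)
forest.fit(X, y)
importances = forest.feature_importances_

# Rank features from most to least important and keep the top 39.
ranked = np.argsort(importances)[::-1]
top_k = ranked[:39]
X_selected = X[:, top_k]
selected_names = [feature_names[i] for i in top_k]

print("Most important features:", selected_names[:5])
```

Impurity-based importances are quick to compute but can favor high-cardinality features, so in practice a ranking like this would usually be cross-checked, for example with permutation importance on held-out data.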

We then trained a predictive model using months of online historical data. This took several attempts with a few algorithms: logistic regression, random forest and a support vector machine (SVM) classifier. To validate each attempt, we held out a random sample of the data to simulate real-time activity, and then determined whether we could predict the likelihood that someone would book a room. We found that the random forest performed best as a predictive model, and from there we focused on personalizing the customers’ search response and experience to drive more online bookings.
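As a rough illustration of this kind of comparison, the sketch below trains the three model families on synthetic data, holds out a random 25% of the rows, and scores each model on its predicted booking probability. It assumes scikit-learn; the data, the split size, the hyperparameters and the choice of ROC AUC as the metric are all assumptions made for the example, since the post does not state them.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for the historical visitor data (39 selected features).
# The label plays the role of "visitor completed a booking".
X, y = make_classification(
    n_samples=1000, n_features=39, n_informative=8, random_state=0
)

# Hold out a random sample that the models never see during training.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# The three model families named in the post, with illustrative settings.
# Logistic regression and the SVM are scale-sensitive, so they get a scaler.
candidates = {
    "logistic_regression": make_pipeline(
        StandardScaler(), LogisticRegression(max_iter=1000)
    ),
    "random_forest": RandomForestClassifier(n_estimators=200, random_state=0),
    "svm": make_pipeline(
        StandardScaler(), SVC(probability=True, random_state=0)
    ),
}

results = {}
for name, model in candidates.items():
    model.fit(X_train, y_train)
    # Probability of the positive class, i.e. the likelihood of a booking.
    booking_probability = model.predict_proba(X_test)[:, 1]
    results[name] = roc_auc_score(y_test, booking_probability)

best_name = max(results, key=results.get)
for name, auc in sorted(results.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: held-out ROC AUC = {auc:.3f}")
print("Best on this synthetic data:", best_name)
```

Which model comes out ahead depends on the data. The ranking on this synthetic set says nothing about the hotel project, where the random forest was reported to perform best.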

In summary, after a year of working on about 100 use cases with clients worldwide, such as Experian, Niagara Bottling and Red Eléctrica de España, I’m proud to say we have a rock star Data Science team that is delivering results. The three keys to our success are the right skills; the right organizational culture, one that enables quick solutioning on business problems; and the right set of open, extensible tools in Watson Studio. This combination has allowed us to build a team of AI experts who can quickly prototype on business problems in new ways, with everything they need to win.

Are you an IBM client looking to build your data science team? Get in touch with me through LinkedIn at https://www.linkedin.com/in/carloappugliese/, or visit the IBM Data Science Elite Team page. You can also access learning resources and interact with IBM team members and your industry peers on the IBM Data Science Community.

Carlo Appugliese is Program Director of Machine Learning and AI, IBM Analytics

1 Comment

April Cross Nov 10, 2018

Very informative and exciting work you are doing! Congratulations on your first year and best of luck in the years to come.
