IBM Watson Studio named a 2020 Gartner Peer Insights Customers’ Choice: Q&A with a lead architect
IBM Watson Studio enables organizations to develop models and to simplify and scale AI across any cloud while automating the AI lifecycle. What do end users say about it? Here are a few quotes from among 94 reviews of Watson Studio on Gartner Peer Insights, a free peer review and ratings platform.*
“More staff can contribute with the robust set of applications available with Watson Studio.”
“A one-stop-shop for our data science. It precludes the requirement to cobble together all of the components (R, Python, Jupyter Notebooks) separately.”
“Best in market to develop AI products.”
We are pleased to note that because Watson Studio is among the most highly rated solutions in its category over the past year, it has been named a 2020 Gartner Peer Insights Customers’ Choice for Data Science and Machine Learning Platforms. Read the report.
There’s an interesting story behind the development of the product, so I sat down with IBM Distinguished Engineer Thomas Schaeck to learn about the architectural considerations under the hood and the benefits organizations gain with Watson Studio.
1. Thomas, what’s your background? How did you get into data science?
I joined IBM right after finishing university at the Karlsruhe Institute of Technology in Germany. I first worked as a C/C++ developer, then started with Java soon after it came out. Then I led architecture and technical strategy for products such as WebSphere Portal, the standard Java Portlet API and web services, IBM Connections, and IBM OpenPages.
In 2015, I joined the IBM Cloud team, where we started a new project to enable data scientists to analyze data on the IBM Cloud, initially using Jupyter Notebooks. I was immediately fascinated by the possibilities of data science over enterprise data, and that we could enable teams of data scientists in enterprises across the world to quickly get new insights from data and to create and train models — making predictions to help businesses serve their customers better and faster.
2. What challenges led IBM to build Watson Studio?
Customers needed a scalable yet easy-to-use way for their data scientists to analyze data and gain insights. Watson Studio was originally called Data Science Experience, and it joined other services on the IBM Cloud such as databases, object storage, and Spark.
Initially we focused on providing Jupyter Notebooks on top of the Spark service to allow data analysis on the IBM Cloud, and via remote connections to data on other clouds or on-premises data sources.
Then we saw a need to support better collaboration between data engineers, data scientists, subject matter experts and DevOps teams. This drove us to create projects integrating Notebooks with concepts such as Connections, Data Assets, Data Refinery and SPSS Modeler Flows.
3. How have your teams expanded Watson Studio?
- Created a version for customers’ private clouds
- Integrated with Watson Knowledge Catalog to allow customers to access data with intelligent cataloging and active metadata and policy management
- Enabled models created and trained in Watson Studio Projects to be deployed for online or batch scoring using Watson Machine Learning
- Included these features in Cloud Pak for Data
- Established Watson Studio Desktop to enable visual or programmatic model development from anywhere
4. What’s AutoAI and why did it win an award?
AutoAI in Watson Studio uses AI to build AI, reducing the steps required to build a model by as much as 80 percent.
AutoAI can analyze raw data, select algorithms, detect features, train and evaluate candidate models and rank them in a leader board with model metrics and visualizations.
Users can select the training pipeline or model that best fits their needs, and they can access the Python code behind a pipeline to refine it further. AutoAI helps data scientists work on higher value tasks and it enables users without substantial data science skills to create and train models. All these features earned AutoAI a Best Innovation in Intelligent Automation award.
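The train, evaluate, and rank loop described above can be pictured with a small sketch. This is an illustration only, not AutoAI's implementation or API: the toy data, the three candidate "pipelines", and names such as `build_leaderboard` are assumptions made up for this example, and the data analysis and feature detection steps are left out.

```python
from statistics import mean

# Toy data (hypothetical): (feature, label) pairs.
train = [(1, 0), (2, 0), (3, 0), (4, 0), (7, 1), (8, 1), (9, 1)]
holdout = [(0, 0), (5, 0), (6, 1), (10, 1)]


def fit_majority(rows):
    """Baseline candidate: always predict the most common training label."""
    labels = [y for _, y in rows]
    majority = max(sorted(set(labels)), key=labels.count)
    return lambda x: majority


def fit_mean_threshold(rows):
    """Candidate: predict 1 when the feature exceeds the overall feature mean."""
    threshold = mean(x for x, _ in rows)
    return lambda x: int(x > threshold)


def fit_class_midpoint(rows):
    """Candidate: threshold halfway between the two per-class feature means."""
    low = mean(x for x, y in rows if y == 0)
    high = mean(x for x, y in rows if y == 1)
    threshold = (low + high) / 2
    return lambda x: int(x > threshold)


# Hypothetical candidate pipelines, keyed by display name.
CANDIDATES = {
    "majority_baseline": fit_majority,
    "mean_threshold": fit_mean_threshold,
    "class_midpoint_threshold": fit_class_midpoint,
}


def accuracy(model, rows):
    """Fraction of rows the fitted model labels correctly."""
    return sum(model(x) == y for x, y in rows) / len(rows)


def build_leaderboard(candidates, train_rows, holdout_rows):
    """Train every candidate, score it on held-out rows, and rank best first."""
    scored = [
        (name, accuracy(fit(train_rows), holdout_rows))
        for name, fit in candidates.items()
    ]
    return sorted(scored, key=lambda item: item[1], reverse=True)


leaderboard = build_leaderboard(CANDIDATES, train, holdout)
for rank, (name, score) in enumerate(leaderboard, start=1):
    print(f"{rank}. {name}: holdout accuracy {score:.2f}")
```

Scoring on held-out rows rather than the training rows is what makes the ranking meaningful: a candidate that merely memorizes the training data would otherwise top the board.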
5. What’s decision intelligence and how does Watson Studio enable it?
Many artificial intelligence models can predict, but decision intelligence takes those predictions and optimizes the actions that result from them by combining machine learning and decision optimization algorithms.
For example, Watson Studio can help teams build a model to predict which customers will be interested in a product. IBM Decision Optimization can then be used to take optimization targets and constraints into account and identify which offers and channels will deliver the optimal revenue/expense ratio in winning business from those customers.
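To make the predict-then-optimize idea concrete, here is a minimal sketch under stated assumptions. It does not use IBM Decision Optimization; it swaps in a brute-force search that is only practical for a handful of customers. The probabilities, channel costs, budget, and the objective (expected profit rather than a revenue/expense ratio) are all hypothetical values chosen for illustration.

```python
from itertools import product

# Hypothetical output of a predictive model: chance each customer buys
# if contacted through a fully effective channel.
purchase_probability = {"alice": 0.8, "bob": 0.3, "carol": 0.6}

# Hypothetical channels: (cost per contact, effectiveness multiplier).
CHANNELS = {"none": (0.0, 0.0), "email": (1.0, 0.5), "call": (5.0, 1.0)}

REVENUE_PER_SALE = 20.0  # assumed revenue for one successful sale
BUDGET = 7.0             # assumed constraint: total contact spend


def best_plan(probabilities, channels, budget, revenue):
    """Try every channel assignment and keep the most profitable one in budget."""
    customers = sorted(probabilities)
    best_profit, best_assignment = float("-inf"), None
    for choice in product(channels, repeat=len(customers)):
        cost = sum(channels[ch][0] for ch in choice)
        if cost > budget:
            continue  # violates the budget constraint
        profit = sum(
            probabilities[cust] * channels[ch][1] * revenue - channels[ch][0]
            for cust, ch in zip(customers, choice)
        )
        if profit > best_profit:
            best_profit, best_assignment = profit, dict(zip(customers, choice))
    return best_profit, best_assignment


profit, plan = best_plan(purchase_probability, CHANNELS, BUDGET, REVENUE_PER_SALE)
print(plan, round(profit, 2))
```

With these numbers the budget rules out calling both high-probability customers, so the search calls one and emails the others. A real solver reaches the same kind of trade-off without enumerating every combination.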
6. What are customers achieving with Watson Studio?
Customers use Watson Studio for a variety of use cases:
- Lufthansa wanted to speed and scale AI development, so it modernized and centralized its data science tool landscape using IBM solutions such as Watson Studio. “This is now an open platform for all our data scientists,” says Lufthansa’s Head of Data and Analytics.
- Anyline, an expert in mobile text recognition, used IBM Deep Learning Service, an element of Watson Studio, to teach smartphones to read, enabling data to be processed up to 20 times faster than previous solutions.
- A lead data scientist at Fifth Third Bank says Watson Studio “allows us to explore many more models and run many more iterations in the same amount of time, helping us get results the business needs, fast.”
- In 10 minutes, Honda R&D can use a solution that includes Watson Studio to analyze over a million documents and highlight examples of driver behavior.
- Wunderman Thompson, a global creative agency, uses Watson Studio and IBM Cloud Pak for Data to release data from silos and help predict post-COVID-19 strategies, drawing from anonymized data on 270 million people and $1.1 trillion in transactional data. Its machine learning pipeline improved performance over its previous models by 200 percent or more.
7. What challenges did your team need to solve to deliver Watson Studio?
We needed to provide an integrated experience for users while also using a modular, extensible architecture. We had to make sure that more services and tools could be added while keeping an integrated end-to-end experience.
The most exciting breakthrough was our development of Watson Studio capabilities, including data access, data refinement, data analysis, and model training, all integrated end to end. We achieved this with a microservice architecture. We have project UI and API services and an overall header and navigation service as horizontal elements, and a range of microservices with a UI and API tier for the vertical tools and functions, such as Notebooks, Flows, AutoAI and Refinery.
Another challenge was to find a way to run Watson Studio across IBM Cloud, private clouds and third-party public clouds. We mapped our microservices to Kubernetes pods that we deploy on the IBM Cloud Kubernetes Service to run on the IBM Cloud. We also deploy on OpenShift Kubernetes to run on private clouds and third-party public clouds.
8. What do you think of the many different architectural approaches to building a platform in the areas of cloud, apps and data? For instance, some solutions take a cloud-led approach, others come from an analytics-specialist angle, and there are many new tool vendors.
With Watson Studio, from a product architecture point of view, we are on a cloud-first DevOps model. Parallel squads own DevOps for their respective microservices, shipping updates weekly and applying fixes whenever needed. First, we make a new function available to IBMers, then to selected customers and then to all users on IBM Cloud. Then we make the new function available in releases of Watson Studio on Cloud Pak for Data and Watson Studio Desktop.
The users we focused on were data scientists who have Python or R skills. Then we expanded to enable data engineers to connect and catalog data, and AI Ops teams to take models to production in managed deployments. More recently, with the addition of AutoAI, we enabled a broader user base to train models, and with IBM Decision Optimization integration we enabled optimization experts to contribute.
9. Model Operations (ModelOps) is a megatrend. What is IBM doing to synchronize apps and AI development cadences?
We wanted to enable applications using AI models to be always-on, so we introduced the concept of stable model endpoints (REST APIs), behind which new models can be deployed or previous models updated. To avoid disrupting service, we allow customers to add a new model or model version in the background behind a Watson Machine Learning model deployment, and then when it is available, customers can switch the endpoint to the new model and receive incoming requests without disruption.
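The stable-endpoint pattern described above can be sketched in a few lines. This is not the Watson Machine Learning API: the `StableEndpoint` class, its method names, and the toy models are hypothetical, and a real deployment would sit behind a REST endpoint rather than an in-process object.

```python
import threading


class StableEndpoint:
    """Hypothetical stand-in for a fixed endpoint that fronts swappable models."""

    def __init__(self, version, model):
        self._lock = threading.Lock()
        self._models = {version: model}
        self._active = version

    def stage(self, version, model):
        """Add a new model version in the background; live traffic is unaffected."""
        with self._lock:
            self._models[version] = model

    def switch(self, version):
        """Point the endpoint at an already staged version."""
        with self._lock:
            if version not in self._models:
                raise KeyError(f"version {version!r} has not been staged")
            self._active = version

    def predict(self, payload):
        """Callers always use this one entry point, whatever version is active."""
        with self._lock:
            model = self._models[self._active]
        return model(payload)


# Toy models (assumptions): each maps a number to a score.
endpoint = StableEndpoint("v1", lambda x: x * 2)
before = endpoint.predict(2)              # served by v1

endpoint.stage("v2", lambda x: x * 3)     # staged, not yet serving
during = endpoint.predict(2)              # still served by v1

endpoint.switch("v2")
after = endpoint.predict(2)               # now served by v2
print(before, during, after)
```

The lock is held only while the active model is looked up, so a request that has already picked up the old model finishes with it while new requests get the new one.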
On IBM Cloud, we run Watson Machine Learning (WML) and deploy models across three availability zones to guarantee high availability. With Watson Machine Learning on Cloud Pak for Data, customers can deploy WML, and the models it hosts, on two or more independent Cloud Pak for Data clusters, for example with an API gateway in front for load balancing and failover to maximize availability.
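As a rough sketch of the gateway idea, assuming two clusters that expose the same scoring call, the code below rotates requests across clusters and falls back to the next one when a cluster is unreachable. `Gateway`, `make_cluster`, and the cluster names are hypothetical. A production API gateway would add health checks, timeouts, and retries.

```python
def make_cluster(name, healthy=True):
    """Hypothetical cluster stub: scores a payload or raises if it is down."""
    def score(payload):
        if not healthy:
            raise ConnectionError(f"{name} unavailable")
        return {"served_by": name, "prediction": payload["x"] > 0}
    return score


class Gateway:
    """Round-robin load balancing with failover across independent clusters."""

    def __init__(self, clusters):
        if not clusters:
            raise ValueError("at least one cluster is required")
        self._clusters = list(clusters)
        self._next = 0

    def score(self, payload):
        count = len(self._clusters)
        start = self._next
        self._next = (self._next + 1) % count  # rotate the starting cluster
        errors = []
        for offset in range(count):
            cluster = self._clusters[(start + offset) % count]
            try:
                return cluster(payload)
            except ConnectionError as exc:
                errors.append(str(exc))  # fail over to the next cluster
        raise RuntimeError("all clusters unavailable: " + "; ".join(errors))


gateway = Gateway([make_cluster("cluster-a", healthy=False), make_cluster("cluster-b")])
first = gateway.score({"x": 1})
second = gateway.score({"x": -1})
print(first["served_by"], second["served_by"])
```

Here both requests end up on the healthy cluster, the first through failover and the second through normal rotation.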
10. In working with IBM Research, how did you crack the code to bring advanced research into an enterprise data science and AI platform?
A good example of our collaboration with IBM Research is the development of AutoAI. It began as a Research project, and our product teams involved sponsor customers and applied design thinking to make it an integrated product capability. We worked with IBM Research to refine AutoAI based on user feedback and released it via Watson Studio on IBM Cloud and on Cloud Pak for Data. Our collaboration with IBM Research continues on many fronts including AI governance and federated learning.
11. IBM has many products and innovative ideas in progress. How are they being synchronized to develop optimum solutions?
We initiated a project dedicated to driving standardization while bringing new innovations through Cloud Pak for Data and IBM Cloud. This is a framework of architectural principles and specifications with a set of commonly used APIs and base services, enabling each team to focus on functions that matter for users, while avoiding the duplication of underlying base components. Some examples are shared services for connectors, projects, spaces, logging and lineage.
12. Do you have advice for people who want to grow their skills in data science or are just starting?
Join the IBM Data Science Community for free and get a complimentary month to explore the popular IBM data science courses on Coursera.
Start with Watson Studio Cloud at no cost, without having to install anything.
Take this 45-minute product tour and get hands-on experience predicting customer churn and optimizing offers to keep customers.
And join me in a Data and AI Innovation Exchange webinar about ModelOps—and what you need to consider in automating AI lifecycle management. Register here.
*Gartner, Gartner Peer Insights ‘Voice of the Customer’: Data Science and Machine Learning Platforms, 10 July 2020, Peer Contributors
The Gartner Peer Insights Customers’ Choice badge is a trademark and service mark of Gartner, Inc., and/or its affiliates, and is used herein with permission. All rights reserved. Gartner Peer Insights Customers’ Choice constitute the subjective opinions of individual end-user reviews, ratings, and data applied against a documented methodology; they neither represent the views of, nor constitute an endorsement by, Gartner or its affiliates.