ModelOps use case
By setting up a ModelOps process for your data, your company can benefit from a full end-to-end AI lifecycle that optimizes your data and AI investments.
Overview
Your company needs to ensure that data is collected and explored efficiently and that the AI models that use the data are properly built and governed. You need integrated systems and processes to manage data and model assets across the AI lifecycle.
With Cloud Pak for Data, your company can manage the full AI lifecycle from a single platform with integrated services that support the entire flow from collecting data all the way to monitoring your models in production.
You can install the Cloud Pak for Data services that support ModelOps to improve, simplify, and automate AI lifecycle operations and management. You can streamline and accelerate data collection and management, model development, model validation, and model deployment. With Cloud Pak for Data, you can operate trusted AI through ongoing model monitoring and retraining on an end-to-end unified data and AI platform, and then use the resulting predictions to decide on the actions to address your company’s needs.
Process
You can use different tools and services for each step in the process, depending on how you want to implement your ModelOps use case in Cloud Pak for Data.
1. Collect the data
Collecting and organizing data is an important step in building your automated AI pipeline. Data scientists create projects, and data engineers collect data and add it to the projects so it can be organized and refined. You can collect data from multiple sources and ensure that it is secure and accessible for use by the Cloud Pak for Data tools and services that support your ModelOps AI lifecycle. You can address policy, security, and compliance issues to help you govern the data that is collected before you analyze the data and use it in your AI models.
Services and tools you can use | What you can do | Best to use when |
---|---|---|
Watson™ Knowledge Catalog | | Use Watson Knowledge Catalog for data collection when you need an inventory of data connections and data sets at the organizational level so data scientists and analysts can work with the data for various projects. |
Data Virtualization | | With Data Virtualization, you can query many data sources as one. Use Data Virtualization for data collection when you need to combine live data from multiple sources to generate views for input for projects. For example, you can use the combined live data to feed dashboards, notebooks, and flows so that the data can be explored. |
Data Refinery | | With Data Refinery, you can simplify the process of preparing large amounts of raw data for analysis. Use Data Refinery for data collection when you need to access and join or filter data and materialize the results as data assets that represent a point in time. You can use the data as input for analysis or model training. |
DataStage® | | Use DataStage when you need to quickly design and run accurate data flows by using an intuitive interface that lets you connect to a wide range of data sources. You can integrate and transform data, and deliver it to your target system in batch or real time. |
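To make the idea of querying many data sources as one more concrete, the following sketch uses Python's built-in sqlite3 module as a stand-in. It is not the Data Virtualization service or its API. The database aliases (crm, billing), the table names, and the sample rows are hypothetical. Two separate in-memory databases play the role of independent sources, and a temporary view joins them so that they can be queried as a single data set.

```python
import sqlite3


def query_combined_sources():
    """Join two separate databases through one view and return the rows."""
    conn = sqlite3.connect(":memory:")

    # Each ATTACH creates an independent in-memory database.
    # They stand in for two separate data sources (hypothetical names).
    conn.execute("ATTACH DATABASE ':memory:' AS crm")
    conn.execute("ATTACH DATABASE ':memory:' AS billing")

    conn.execute("CREATE TABLE crm.customers (id INTEGER PRIMARY KEY, name TEXT)")
    conn.execute("CREATE TABLE billing.invoices (customer_id INTEGER, amount REAL)")
    conn.executemany(
        "INSERT INTO crm.customers VALUES (?, ?)",
        [(1, "Ada"), (2, "Grace")],
    )
    conn.executemany(
        "INSERT INTO billing.invoices VALUES (?, ?)",
        [(1, 120.0), (1, 80.0), (2, 50.0)],
    )

    # A TEMP view is used because SQLite lets only temporary views
    # reference tables in more than one attached database.
    conn.execute(
        """
        CREATE TEMP VIEW customer_totals AS
        SELECT c.id AS id, c.name AS name, SUM(i.amount) AS total
        FROM crm.customers AS c
        JOIN billing.invoices AS i ON i.customer_id = c.id
        GROUP BY c.id, c.name
        """
    )

    rows = conn.execute(
        "SELECT name, total FROM customer_totals ORDER BY id"
    ).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    for name, total in query_combined_sources():
        print(name, total)
```

The view reads the underlying tables each time that it is queried, so it reflects the current data in both sources. A point-in-time data asset, by contrast, would store the query result as a copy.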
2. Explore the data
To gain new insights and make business decisions, you can analyze and explore the data that you will use to build AI models. Data engineers can further refine the data in the projects, and data scientists can use various Cloud Pak for Data services, tools, and features to import, explore, and analyze the data before it is used in your AI models.
Services and tools you can use | What you can do | Best to use when |
---|---|---|
Watson Studio | | Use Watson Studio to add relevant data to your projects from connections, connected data assets, and uploaded files so you can visualize, explore, analyze, and train models. |
Data Refinery | | Use Data Refinery visualizations to view and explore the data interactively to better understand it. |
Watson Studio | | Use Watson Studio dashboards and notebooks to view and explore the data interactively to better understand it. |
3. Build the models
To get predictive insights based on the data that you collected, refined, and analyzed, the next step is to build and train models. Data scientists use Cloud Pak for Data services to build the AI models, ensuring that the right algorithms and optimizations are used to make predictions that help to solve business problems.
Services and tools you can use | What you can do | Best to use when |
---|---|---|
Watson Studio and Watson Machine Learning | | Use AutoAI when you want an advanced and automated way to build a good set of training pipelines and models quickly, and you want to be able to export the generated pipelines to refine them. |
Watson Studio and Watson Machine Learning | | Use notebooks and scripts to build models that use ML algorithms and frameworks when you want to use Python or R coding skills to have full control over the code that is used to create, train, and evaluate the models. |
Watson Studio and Watson Machine Learning | | With SPSS Modeler flows, you can build flows to prepare and blend data, build and manage models, and visualize the results. Use SPSS Modeler to build models when you want a simple way to explore data and define model training, evaluation, and scoring flows. |
RStudio® Server with R 3.6 | | Use RStudio Server with R 3.6 when you want to use a development environment to work in R. |
Watson Machine Learning Accelerator | | Use Watson Machine Learning Accelerator when you want to train thousands of models, train deeper neural networks, and explore more complicated hyperparameter spaces. |
Decision Optimization | | Use Decision Optimization when you need to evaluate millions of possibilities to find the best solution to a prescriptive analytics problem. |
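As a rough illustration of what creating, training, and evaluating a model in code involves, the following sketch fits a one-feature threshold classifier in plain Python and scores it on held-out data. It is a deliberately small stand-in for the ML frameworks that you would normally use in a notebook, and it does not call any Cloud Pak for Data API. The function names and the sample data are hypothetical.

```python
def predict(x, threshold):
    """Predict class 1 when the feature value reaches the threshold."""
    return 1 if x >= threshold else 0


def accuracy(rows, threshold):
    """Fraction of (feature, label) rows that the threshold classifies correctly."""
    correct = sum(1 for x, y in rows if predict(x, threshold) == y)
    return correct / len(rows)


def train_threshold(rows):
    """Try each observed feature value as a threshold and keep the most accurate one."""
    best_threshold, best_accuracy = None, -1.0
    for candidate in sorted({x for x, _ in rows}):
        candidate_accuracy = accuracy(rows, candidate)
        if candidate_accuracy > best_accuracy:
            best_threshold, best_accuracy = candidate, candidate_accuracy
    return best_threshold


# Hypothetical (feature, label) pairs, split into training and held-out data.
training_rows = [(1.0, 0), (2.0, 0), (3.0, 0), (4.0, 1), (5.0, 1), (6.0, 1)]
holdout_rows = [(1.5, 0), (3.5, 0), (4.5, 1), (5.5, 1)]

threshold = train_threshold(training_rows)
holdout_accuracy = accuracy(holdout_rows, threshold)

if __name__ == "__main__":
    print("learned threshold:", threshold)
    print("holdout accuracy:", holdout_accuracy)
```

Evaluating on rows that were not used for training is what keeps the accuracy figure honest. The same split applies regardless of which algorithm or framework you choose.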
4. Deploy the models
When operations team members deploy your AI models, the models become available for applications to use for scoring and predictions to help drive actions.
Services and tools you can use | What you can do | Best to use when |
---|---|---|
Watson Machine Learning | | Deploy models and other assets to test or production environments by using a simple user interface. |
Watson Machine Learning | | Deploy and manage models in test or production environments from a command line. |
Watson OpenScale | | Use the Python SDK for development and automation when you want to configure data, add your machine learning engine, and select and monitor deployments. |
5. Monitor the models
After models are deployed, it is important to govern and monitor them to make sure that they are explainable and transparent. Data scientists need to be able to explain how the models arrive at certain predictions so that they can determine whether the predictions have any implicit or explicit bias. In addition, it's a best practice to watch for model performance and data consistency issues during the lifecycle of the model.
Services and tools you can use | What you can do | Best to use when |
---|---|---|
Watson OpenScale | | Use Watson OpenScale to monitor models when you have features that are protected or that might contribute to prediction fairness, you need to trace model performance and data consistencies over time, or you need to know why the model gives certain predictions. |
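One common way to quantify bias on a protected feature is the disparate impact ratio: the rate of favorable outcomes for a monitored group divided by the rate for a reference group. The following sketch computes that ratio in plain Python. It is an illustration of the concept only and is not presented as how Watson OpenScale calculates fairness. The record layout, group labels, and the 0.8 alert threshold (the widely used four-fifths rule of thumb) are assumptions.

```python
def favorable_rate(records, group):
    """Share of records in the group that received the favorable outcome."""
    members = [r for r in records if r["group"] == group]
    if not members:
        raise ValueError(f"no records for group {group!r}")
    return sum(r["approved"] for r in members) / len(members)


def disparate_impact(records, monitored, reference):
    """Favorable rate of the monitored group relative to the reference group."""
    return favorable_rate(records, monitored) / favorable_rate(records, reference)


# Hypothetical scored records: 1 means the model predicted the favorable outcome.
predictions = (
    [{"group": "A", "approved": v} for v in (1, 1, 0, 0, 0)]
    + [{"group": "B", "approved": v} for v in (1, 1, 1, 1, 0)]
)

# Assumed alert threshold, based on the four-fifths rule of thumb.
FAIRNESS_THRESHOLD = 0.8

ratio = disparate_impact(predictions, monitored="A", reference="B")
needs_review = ratio < FAIRNESS_THRESHOLD

if __name__ == "__main__":
    print("disparate impact ratio:", ratio)
    print("flag for review:", needs_review)
```

A ratio near 1 means that both groups receive favorable predictions at similar rates. Recomputing the ratio on each new batch of scored records is one simple way to track fairness over the lifecycle of a model.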
Examples
- Case study: To see an end-to-end example of a ModelOps scenario that uses Cloud Pak for Data and some of the key services, read "ModelOps approach to modernizing your bank loan department."
- Industry accelerators: You can use industry accelerators to help you implement ModelOps processes with Cloud Pak for Data. An industry accelerator is a set of artifacts that help you address common business needs. For example, you might use the Financial Markets Customer Attrition Prediction accelerator, which uses Cloud Pak for Data with Watson Knowledge Catalog, Watson Studio, and Watson Machine Learning to help you predict which customers might leave. You can browse the Accelerators catalog for the Cloud Pak for Data industry accelerators and download the ones that you want to use.