Quick start: Build and deploy a machine learning model with AutoAI

You can automate the process of building a machine learning model with the AutoAI tool. Read about the AutoAI tool, then watch a video and take a tutorial that’s suitable for beginners and does not require coding.

Required service: Watson Machine Learning

Your basic workflow includes these tasks:

Create a project. Projects are where you can collaborate with others to work with data.
Add your data to the project. You can add CSV files or data from a remote data source through a connection.
Create an AutoAI experiment in the project.
Review the model pipelines and save the desired pipeline as a model to deploy or as a notebook to customize.
Deploy and test your model.

Read about AutoAI

The AutoAI graphical tool in Watson Studio automatically analyzes your data and generates candidate model pipelines customized for your predictive modeling problem. These model pipelines are created iteratively as AutoAI analyzes your dataset and discovers data transformations, algorithms, and parameter settings that work best for your problem setting. Results are displayed on a leaderboard, showing the automatically generated model pipelines ranked according to your problem optimization objective.
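As a rough illustration of how a leaderboard orders candidate pipelines, the following Python sketch sorts pipelines by one chosen metric. It is not AutoAI code: `rank_pipelines` is a hypothetical helper, and the pipeline names and scores are invented placeholders rather than results from a real experiment.

```python
from typing import Dict, List, Tuple


def rank_pipelines(
    scores: Dict[str, Dict[str, float]], metric: str
) -> List[Tuple[str, float]]:
    """Return (pipeline, score) pairs sorted best-first for one metric.

    Assumes that a higher score is better, which holds for metrics such
    as accuracy or ROC AUC but not for error metrics.
    """
    pairs = [(name, metrics[metric]) for name, metrics in scores.items()]
    return sorted(pairs, key=lambda pair: pair[1], reverse=True)


# Placeholder values for illustration only; not real experiment results.
example_scores = {
    "Pipeline 1": {"accuracy": 0.88, "roc_auc": 0.90},
    "Pipeline 2": {"accuracy": 0.91, "roc_auc": 0.89},
    "Pipeline 3": {"accuracy": 0.90, "roc_auc": 0.93},
}

leaderboard = rank_pipelines(example_scores, "roc_auc")
for rank, (name, score) in enumerate(leaderboard, start=1):
    print(f"{rank}. {name}: {score:.2f}")
```

Changing the `metric` argument reorders the list, which mirrors how the leaderboard can rank the same pipelines by a different metric.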

Watch a video about creating a model using AutoAI

Watch this video to see how to create and run an AutoAI experiment based on the bank marketing sample.

Video disclaimer: Some minor steps and graphical elements in this video differ from your Cloud Pak for Data deployment. This video shows the Cloud Pak for Data as a Service user interface.

This video provides a visual method as an alternative to following the written steps in this documentation.

Transcript

Time	Transcript
00:00	This video shows you how to run a sample AutoAI experiment to create a Watson Machine Learning model.
00:08	Start in a Watson Studio project and add to that project a new AutoAI experiment.
00:16	To run an AutoAI experiment, you'll need the Watson Machine Learning service.
00:22	Here you have the option to associate a Watson Machine Learning service with this project.
00:29	You can either create a new service instance or select an existing service instance.
00:39	When you return to the page where you're creating the experiment, just reload the page and you'll see the new service instance listed.
00:48	For this first experiment, you will select a sample.
00:52	The "Bank marketing" sample contains text data collected from phone calls to a bank in response to a marketing campaign.
01:01	When you select a sample, the experiment name and description are filled in for you, so you're ready to create the experiment.
01:11	Next, the AutoAI experiment builder displays.
01:15	Since this experiment is from a sample, the bank marketing source file is already selected.
01:22	And the column to predict is also already selected.
01:26	In this case, it's the "y" column, which represents whether a user will sign up for a term deposit as part of the marketing campaign.
01:35	Based on the data set and the selected column to predict, AutoAI analyzes a subset of the data and chooses a prediction type and metric to optimize.
01:47	In this case, since the column to predict contains values of "Y" or "N" (for yes or no) the binary classification was chosen.
01:57	The positive class is "Yes" and the optimized metric is ROC AUC.
02:03	The ROC AUC metric balances precision, accuracy, and recall.
02:10	Now, run the experiment and wait as the "Pipeline leaderboard" fills in.
02:17	During AutoAI training, your data set is split into two parts: training data and holdout data.
02:24	The training data is used by the AutoAI training stages to generate the model pipelines and cross validation scores are used to rank them.
02:34	After training, the holdout data is used for the resulting pipeline model evaluation and computation of performance information, such as the ROC curves and confusion matrices.
02:48	Next, AutoAI generates pipelines using different estimators, such as the XGBoost classifier, or enhancements, such as hyperparameter optimization and feature engineering, with the pipelines ranked based on the accuracy metric.
03:06	Hyperparameter optimization is a mechanism for automatically exploring a search space of potential hyperparameters, building a series of models, and comparing the models using metrics of interest.
03:20	Feature engineering attempts to transform the raw data into the combination of features that best represents the problem to achieve the most accurate prediction.
03:31	Okay, the run has completed.
03:34	The legend explains where to find the data, top algorithm, pipelines, and feature transformers on the relationship map.
03:44	You can view the full log to see complete details.
03:48	By default, you'll see the "Relationship map", but you can swap views to see the "Progress map".
03:57	Scroll down to take a look at the leaderboard.
04:01	You may want to start with comparing the pipelines.
04:05	This chart provides metrics for the eight pipelines, viewed by cross-validation score, or by holdout score.
04:13	You can see the pipelines ranked based on other metrics, such as average precision.
04:21	Back on the "Experiment summary" tab, expand a pipeline to view the model evaluation measures and ROC curve.
04:30	You can view an individual pipeline to see more details in addition to the confusion matrix, precision recall curve, model information, feature transformations, and feature importance.
04:49	This pipeline had the highest ranking, so you can save this as a machine learning model.
04:55	Just accept the defaults and save the model.
05:01	Now that you've trained the model, you're ready to view the model and deploy it.
05:06	The "Overview" tab shows a model summary and the input schema.
05:12	To deploy the model, you'll need to promote it to a deployment space.
05:17	Since this project doesn't have a deployment space associated with it yet, you'll need to set up the association first.
05:25	You can either select an existing deployment space or create a new deployment space.
05:31	When you create a new space, just provide a name and description and select the Cloud Object Storage and Watson Machine Learning service.
05:41	Then create the space.
05:45	Now, select this new space, add a description for the model, and click "Promote".
05:53	Use the link to go to the deployment space.
06:00	Here's the model you just created, which you can now deploy.
06:04	In this case, it will be an online deployment.
06:08	Just provide a name for the deployment and click "Create".
06:12	Then wait while the model is deployed.
06:16	When the model deployment is complete, view the deployment.
06:20	On the "API reference" tab, you'll find the scoring endpoint for future reference.
06:26	You'll also find code snippets for various programming languages to utilize this deployment from your application.
06:35	On the "Test" tab, you can test the model prediction.
06:40	You can either enter test input data or paste JSON input data, then click "Predict".
06:52	This shows that there's a very high probability that the first person will not subscribe to a term deposit and a high probability that the second person will subscribe to a term deposit.
07:06	And back in the project, on the "Assets" tab, you'll find the AutoAI experiment and the model.
07:17	Find more videos in the Cloud Pak for Data as a Service documentation.

Try a tutorial to create a model using AutoAI

In this tutorial, you will complete these tasks:

Create a project.
Build and train the model.
Promote the model to a deployment space and deploy the trained model.
Test the deployed model.

This tutorial will take approximately 30 minutes to complete.

Sample data

The sample data used in the guided experience is bank marketing data, which is used to predict whether a customer will enroll in a marketing promotion.

Task 1: Create a project

You need a project to store the AutoAI experiment.

If you have an existing project, open it. If you don't have an existing project, click Create a project on the home page or click New project on your Projects page.
Select Analytics project as the project type.
Select Create an empty project.
On the Create a project screen, add a name and optional description for the project.
Click Create.

For more information or to watch a video, see Creating a project.

Task 2: Build and train the model

Create the AutoAI experiment, review the model pipelines, and select a pipeline to save as a model.

Download the bank.csv file (0.46 MB).
From the Assets tab of your project, click Add to project > AutoAI Experiment.
On the Create an AutoAI Experiment screen, add a name and optional description for the experiment.
Click Create.
On the Add data source page that opens, click Browse and open the bank.csv.
If you are asked to create a time series experiment, select No.
Select the column labeled "Y" as the column to predict. This column indicates whether a customer is likely to enroll in a marketing promotion.
Based on the data set and the selected column to predict, AutoAI analyzes a subset of the data and chooses a prediction type and a metric to optimize. Because the Y column contains yes/no values, the data is suitable for a binary classification model. In this case, the prediction type is Binary Classification, the positive class is Yes, and the optimized metric is Accuracy and run time.
Click Run experiment. As the model trains, you will see an infographic that shows the process of building the pipelines.
Once the pipeline creation is complete, you can view and compare the ranked pipelines in the leaderboard.
Select the highest ranked pipeline, and choose Save model from the action menu. This saves the pipeline as a Machine Learning asset in your project.
When the model is saved, click the View in project link in the notification to view the model in your project. Alternatively, you can navigate to the Assets tab in the project, and click the model name in the Machine Learning Model section.

Task 3: Promote the model to a deployment space and deploy the trained model

Now you can promote the model to a deployment space to deploy the model.

Click Promote to deployment space.
Choose an existing deployment space. If you don't have a deployment space, you can create a new one:
1. Provide a space name and optional description.
2. Select a storage service.
3. Select a machine learning service.
4. Click Create.
5. Click Close.
Click Promote.
When the model is promoted, click the deployment space link in the notification to view the model in the deployment space. Alternatively, you can use the navigation menu to navigate to Deployments, and click the deployment space name.
Next to the model name, click the Deploy icon.
1. Select Online as the Deployment type.
2. Specify a name for the deployment.
3. Click Create.
Click the Deployments tab, and wait for the model to be deployed.
When the deployment is complete, click the deployment name to view the deployment details page.

Task 4: Test the deployed model

Use the deployment to test the model with new data.

Click the Test tab. You can test the deployed model from the deployment details page in two ways: test with a form or test with JSON code.

Click the icon to Provide input data as JSON, then copy the following test data and paste it in the area for the JSON text:

{"input_data":[{
        "fields": ["age","job","marital","education","default","balance","housing","loan","contact","day", "month","duration","campaign","pdays","previous","poutcome"],
        "values": [[27,"unemployed", "married", "primary", "no",1787,"no", "no","cellular",19,"oct", 79, 1, -1, 0, "unknown" ]]
}]}

Click Predict to predict whether a customer with the specified attributes is likely to sign up for a particular kind of account. The resulting prediction indicates that this customer has a high probability of not enrolling in the marketing promotion.
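The same test can be sketched in Python for readers who want to call the deployment from an application. The `build_payload` function below reproduces the JSON structure shown above. The `score` function is an assumption about how an online deployment is typically called (an HTTP POST with a bearer token); the scoring URL, the token, and any required query parameters are not given in this tutorial, so copy the real values from the API reference tab of your deployment.

```python
import json
import urllib.request
from typing import Any, Dict, List

# Column names copied from the test data in this tutorial.
FIELDS = [
    "age", "job", "marital", "education", "default", "balance",
    "housing", "loan", "contact", "day", "month", "duration",
    "campaign", "pdays", "previous", "poutcome",
]


def build_payload(fields: List[str], rows: List[List[Any]]) -> Dict[str, Any]:
    """Build the input_data structure used on the Test tab."""
    for row in rows:
        if len(row) != len(fields):
            raise ValueError(
                f"expected {len(fields)} values per row, got {len(row)}"
            )
    return {"input_data": [{"fields": fields, "values": rows}]}


def score(scoring_url: str, token: str, payload: Dict[str, Any]) -> Dict[str, Any]:
    """POST the payload to a scoring endpoint and return the parsed JSON.

    Assumption: the endpoint accepts a bearer token. This function is
    defined but not called here because it needs a real URL and token.
    """
    request = urllib.request.Request(
        scoring_url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


# The single customer record from the tutorial's test data.
payload = build_payload(
    FIELDS,
    [[27, "unemployed", "married", "primary", "no", 1787, "no", "no",
      "cellular", 19, "oct", 79, 1, -1, 0, "unknown"]],
)
print(json.dumps(payload, indent=2))
```

To score more than one customer in a single request, pass additional rows to `build_payload`; each row must list its values in the same order as `FIELDS`.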

Next steps

Now you can use this data set for further analysis.

Additional resources

View more videos for AutoAI
Try these additional tutorials to get more hands-on experience with building models in notebooks and using AutoAI:
- Build models using Jupyter notebooks
- Automate model building in Watson Studio