Quick start: Build and deploy a machine learning model in a Jupyter notebook

You can create, train, and deploy machine learning models with Watson Machine Learning in a Jupyter notebook. Read about Jupyter notebooks, then watch a video and take a tutorial that is suitable for intermediate users and requires coding.

Required services
Watson Studio
Watson Machine Learning

Your basic workflow includes these tasks:

  1. Create a project. Projects are where you can collaborate with others to work with data.
  2. Add a notebook to the project. You can create a blank notebook or import a notebook from a file or GitHub repository.
  3. Add code and run the notebook.
  4. Review the model pipelines and save the desired pipeline as a model.
  5. Deploy and test your model.

Read about Jupyter notebooks

A Jupyter notebook is a web-based environment for interactive computing. To build a machine learning model in a notebook, you should be comfortable writing code. You can run small pieces of code that process your data and immediately view the results. With this tool, you can assemble, test, and run all of the building blocks that you need to work with data, save the model to Watson Machine Learning, and deploy the model.

Read more about training models in notebooks

Learn about other ways to build models

Watch a video about creating a model in a Jupyter notebook

Watch this video to see how to train, deploy, and test a machine learning model in a Jupyter notebook.

This video provides a visual method to learn the concepts and tasks in this documentation.


Try a tutorial to create a model in a Jupyter notebook

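In this tutorial, you will complete these tasks:

  1. Open a project.
  2. Add a notebook to your project.
  3. Set up the environment.
  4. Run the notebook.
  5. View and test the deployed model in the deployment space.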

This tutorial will take approximately 30 minutes to complete.

Sample data

The sample data in this tutorial is the hand-written digits data set that is included with scikit-learn. You use it to train a model that recognizes images of hand-written digits from 0 to 9.
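To get a feel for this data before you open the notebook, the following is a minimal sketch rather than the notebook's own code. It assumes that scikit-learn is installed. The 70/20/10 split, the scaling step, and the classifier settings are illustrative assumptions, not values confirmed by this tutorial.

```python
from sklearn.datasets import load_digits
from sklearn.metrics import accuracy_score
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Load the bundled digits data: 8x8 grayscale images flattened to 64 features.
digits = load_digits()
n_samples = len(digits.data)

# Split by position into train, test, and score sets.
# The 70/20/10 proportions are an assumption made for this sketch.
train_end = int(0.7 * n_samples)
test_end = int(0.9 * n_samples)
train_data, train_labels = digits.data[:train_end], digits.target[:train_end]
test_data, test_labels = digits.data[train_end:test_end], digits.target[train_end:test_end]
score_data = digits.data[test_end:]

# Chain feature scaling and a support vector classifier into one pipeline.
# Both steps and the gamma value are illustrative choices.
pipeline = Pipeline([("scaler", StandardScaler()), ("svc", SVC(gamma=0.001))])
model = pipeline.fit(train_data, train_labels)

# Evaluate on the held-out test set.
accuracy = accuracy_score(test_labels, model.predict(test_data))
print(f"Images: {digits.images.shape}, classes: {digits.target_names.tolist()}")
print(f"Test accuracy: {accuracy:.3f}")
```

The score set is kept aside here only to mirror the three-way split that Task 4 describes; it is the kind of data you would later send to a deployed model.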




Tips for completing this tutorial
Here are some tips for successfully completing this tutorial.

Get help in the community

If you need help with this tutorial, you can ask a question or find an answer in the Cloud Pak for Data Community discussion forum.

Set up your browser windows

For the optimal experience completing this tutorial, open Cloud Pak for Data in one browser window, and keep this tutorial page open in another browser window to switch easily between the two applications. Consider arranging the two browser windows side-by-side to make it easier to follow along.


Tip: If you encounter a guided tour while completing this tutorial in the user interface, click Maybe later.



Task 1: Open a project

You need a project to store the notebook. You can use an existing project or create one.

  1. From the Navigation menu, choose Projects > All projects.

  2. Open an existing project. If you want to use a new project:

    1. Click New project.

    2. Select Create an empty project.

    3. Enter a name and optional description for the project.

    4. Click Create.

For more information or to watch a video, see Creating a project.

Check your progress

The following image shows the new project.




Task 2: Add a notebook to your project

You will use a sample notebook in this tutorial. Follow these steps to add the sample notebook to your project:

  1. From the Assets tab in the project, click New asset > Work with data and models in Python or R notebooks.

  2. Select the URL page.

  3. Paste the following link in the Notebook URL field:

    https://github.com/IBM/watsonx-ai-samples/blob/master/cpd4.8/notebooks/python_sdk/deployments/scikit-learn/Use%20scikit-learn%20to%20recognize%20hand-written%20digits.ipynb
    
  4. Type a name and an optional description for the notebook.

  5. Select a runtime environment for this notebook.

  6. Click Create. Wait for the notebook editor to load.

  7. From the menu, click Kernel > Restart Kernel and Clear Outputs of All Cells, then confirm by clicking Restart to clear the output from the last saved run.

Check your progress

The following image shows the new notebook.




Task 3: Set up the environment

The first section in the notebook sets up the environment by specifying your Cloud Pak for Data credentials and Watson Machine Learning service instance location. Follow these steps to set up the environment in your notebook:

  1. Scroll to the Set up the environment section.

  2. Obtain your API key. If you do not have an API key, follow these steps:

    1. Open another tab with your Cloud Pak for Data deployment.

    2. Click your Profile icon.

    3. Click Profile and settings.

    4. Click API Key > Generate new key.

    5. Copy your API key and close the tab.

  3. The platform URL is the hostname or URL that you use to access your Cloud Pak for Data deployment. Ask your administrator if you don't have this information. The administrator can obtain the platform URL from the Red Hat OpenShift console (Red Hat OpenShift Container Platform -> Projects -> cpd-instance -> Routes).

  4. Paste your username, api_key, and platform url into cell 1.

  5. Click the Run icon to run the code in cells 1 and 2.

  6. Run cell 3 to install the ibm-watson-machine-learning package.

  7. Run cell 4 to import the API client and create the API client instance using your credentials.

  8. Run the cell with the code client.spaces.list(limit=10) to see a list of all existing deployment spaces. If you do not have a deployment space, then follow these steps:

    1. Open another tab with your Cloud Pak for Data deployment.

    2. From the Navigation menu, click Deployments.

    3. Click New deployment space.

    4. Add a name and optional description for the deployment space.

    5. Click Create, then View new space.

    6. Click the Manage tab.

    7. Copy the Space GUID and close the tab. This value is your space_id.

  9. Copy and paste the appropriate deployment space ID into the cell with the code space_id = 'PASTE YOUR SPACE ID HERE', then run that cell and the cell with the code client.set.default_space(space_id) to set the default space.
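The setup steps above can be sketched roughly as follows. This is not the notebook's exact code: the dictionary keys, the "openshift" instance ID, and the "4.8" version default are assumptions based on how the ibm-watson-machine-learning client is commonly configured for Cloud Pak for Data, and build_credentials and connect are hypothetical helper names.

```python
def build_credentials(username, api_key, platform_url, version="4.8"):
    """Assemble a credentials dictionary for the API client.

    The key names and the default version are assumptions for this sketch;
    check them against cell 1 of the notebook.
    """
    if not (username and api_key and platform_url):
        raise ValueError("username, api_key, and platform_url are all required")
    return {
        "username": username,
        "apikey": api_key,
        "url": platform_url.rstrip("/"),  # drop a trailing slash if present
        "instance_id": "openshift",
        "version": version,
    }


def connect(credentials, space_id):
    """Create the API client and set the default deployment space."""
    # Imported here so the helper above works without the package installed.
    from ibm_watson_machine_learning import APIClient

    client = APIClient(credentials)
    client.spaces.list(limit=10)        # review the existing deployment spaces
    client.set.default_space(space_id)  # later calls target this space
    return client
```

You would call connect(build_credentials(username, api_key, url), space_id) only after you have real values for all four inputs.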

Check your progress

The following image shows the notebook with all of the environment variables set up.




Task 4: Run the notebook

Now that all of the environment variables are set up, you can run the rest of the cells in the notebook. Follow these steps to read through the comments, run the cells, and review the output:

  1. Run the cells in the Explore data section.

  2. Run the cells in the Create a scikit-learn model section to:

    1. Prepare the data by splitting it into three data sets (train, test, and score).

    2. Create the pipeline.

    3. Train the model.

    4. Evaluate the model using the test data.

  3. Run the cells in the Persist locally created scikit-learn model section to publish the model, get model details, and get all models.

    Note:

    If you are using Runtime 24.1 on Python 3.11, then you will need to change the software_spec_uid to runtime-24.1-py3.11 and the scikit-learn version to scikit-learn-1.3.

  4. Run the cells in the Deploy and score section to create the online deployment, get deployment details, and send a scoring request to the deployed model to see the prediction.

  5. Click File > Save File. Running section 5 of the notebook, Create a batch deployment and score using connection asset, is optional.
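The publish, deploy, and score steps above can be sketched as two small functions that take an already configured client. This is an illustration under assumptions, not the notebook's code: store_and_deploy and score_rows are hypothetical names, and the software specification name and model type are passed in by the caller (for example, the values from the note above) rather than fixed here.

```python
def store_and_deploy(client, model, software_spec_name, model_type,
                     train_data=None, train_labels=None):
    """Publish a trained model and create an online deployment for it."""
    # Look up the software specification that matches the runtime.
    spec_id = client.software_specifications.get_id_by_name(software_spec_name)

    model_meta = {
        client.repository.ModelMetaNames.NAME: "Scikit model",
        client.repository.ModelMetaNames.TYPE: model_type,
        client.repository.ModelMetaNames.SOFTWARE_SPEC_UID: spec_id,
    }
    model_details = client.repository.store_model(
        model=model,
        meta_props=model_meta,
        training_data=train_data,
        training_target=train_labels,
    )
    model_id = client.repository.get_model_id(model_details)

    deployment_meta = {
        client.deployments.ConfigurationMetaNames.NAME: "Deployment of scikit model",
        client.deployments.ConfigurationMetaNames.ONLINE: {},
    }
    deployment_details = client.deployments.create(model_id, meta_props=deployment_meta)
    return client.deployments.get_id(deployment_details)


def score_rows(client, deployment_id, rows):
    """Send rows of 64 pixel values to the online deployment."""
    payload = {client.deployments.ScoringMetaNames.INPUT_DATA: [{"values": rows}]}
    return client.deployments.score(deployment_id, payload)
```

Passing the client in as an argument keeps the functions easy to try out with a stand-in object before you run them against a real deployment space.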

Check your progress

The following image shows the notebook with the prediction.




Task 5: View and test the deployed model in the deployment space

You can also view the model deployment directly from the deployment space. Follow these steps to test the deployed model in the space.

  1. From the Navigation menu, click Deployments.

  2. Click the Spaces tab.

  3. Select the appropriate deployment space from the list.

  4. Click Scikit model.

  5. Click Deployment of scikit model.

  6. Review the Endpoint and Code snippets.

  7. Click the Test tab. You can test the deployed model by pasting the following JSON code:

       {"input_data": [{"values": [[0.0, 0.0, 5.0, 16.0, 16.0, 3.0, 0.0, 0.0, 0.0, 0.0, 9.0, 16.0, 7.0, 0.0, 0.0, 0.0, 0.0, 0.0, 12.0, 15.0, 2.0, 0.0, 0.0, 0.0, 0.0, 1.0, 15.0, 16.0, 15.0, 4.0, 0.0, 0.0, 0.0, 0.0, 9.0, 13.0, 16.0, 9.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 14.0, 12.0, 0.0, 0.0, 0.0, 0.0, 5.0, 12.0, 16.0, 8.0, 0.0, 0.0, 0.0, 0.0, 3.0, 15.0, 15.0, 1.0, 0.0, 0.0], [0.0, 0.0, 6.0, 16.0, 12.0, 1.0, 0.0, 0.0, 0.0, 0.0, 5.0, 16.0, 13.0, 10.0, 0.0, 0.0, 0.0, 0.0, 0.0, 5.0, 5.0, 15.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 8.0, 15.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 13.0, 13.0, 0.0, 0.0, 0.0, 0.0, 0.0, 6.0, 16.0, 9.0, 4.0, 1.0, 0.0, 0.0, 3.0, 16.0, 16.0, 16.0, 16.0, 10.0, 0.0, 0.0, 5.0, 16.0, 11.0, 9.0, 6.0, 2.0]]}]}
    
  8. Click Predict. The resulting prediction indicates that the hand-written digits are 5 and 4.
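Each inner list in the JSON above holds 64 values, one per pixel of an 8x8 digit image. The following sketch shows one way to build and sanity-check such a payload before pasting it into the Test tab. The helper names build_scoring_payload and render_digit, and the display threshold of 8.0, are hypothetical and chosen for illustration.

```python
import json

PIXELS_PER_IMAGE = 64  # an 8 x 8 image flattened row by row


def build_scoring_payload(images):
    """Wrap flattened digit images in the input_data structure shown above."""
    rows = []
    for index, image in enumerate(images):
        row = [float(value) for value in image]
        if len(row) != PIXELS_PER_IMAGE:
            raise ValueError(
                f"image {index} has {len(row)} values, expected {PIXELS_PER_IMAGE}"
            )
        rows.append(row)
    return {"input_data": [{"values": rows}]}


def render_digit(image, threshold=8.0):
    """Return a rough text picture of one flattened 8x8 image."""
    lines = []
    for start in range(0, PIXELS_PER_IMAGE, 8):
        row = image[start:start + 8]
        lines.append("".join("#" if value >= threshold else "." for value in row))
    return "\n".join(lines)


if __name__ == "__main__":
    blank = [0.0] * PIXELS_PER_IMAGE
    print(json.dumps(build_scoring_payload([blank]))[:40])
    print(render_digit(blank))
```

Calling render_digit on either row of the sample JSON gives a quick visual check of the digit before you compare it with the prediction.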

Check your progress

The following image shows the Test tab with the prediction.




Next steps

Now you can use this data set for further analysis. For example, you or other users can do any of these tasks:

Additional resources
