Important:

IBM Cloud Pak® for Data Version 4.6 will reach end of support (EOS) on 31 July, 2025. For more information, see the Discontinuance of service announcement for IBM Cloud Pak for Data Version 4.X.

Upgrade to IBM Software Hub Version 5.1 before IBM Cloud Pak for Data Version 4.6 reaches end of support. For more information, see Upgrading IBM Software Hub in the IBM Software Hub Version 5.1 documentation.

Quick start: Analyze data in a Jupyter notebook

You can create a notebook in which you run code to prepare, visualize, and analyze data, or build and train a model. Read about Jupyter notebooks, then watch a video and take a tutorial that’s suitable for users with some knowledge of Python code.

Required service
Watson Studio

Your basic workflow includes these tasks:

  1. Create a project. Projects are where you can collaborate with others to work with data.
  2. Add your data to the project. You can add CSV files or data from a remote data source through a connection.
  3. Create a notebook in the project.
  4. Add code to the notebook to load and analyze your data.
  5. Run your notebook and share the results with your colleagues.

Read about notebooks

A Jupyter notebook is a web-based environment for interactive computing. You can run small pieces of code that process your data, and you can immediately view the results of your computation. Notebooks include all of the building blocks you need to work with data:

  • The data
  • The code computations that process the data
  • Visualizations of the results
  • Text and rich media to enhance understanding
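As a rough illustration of how these building blocks fit together, the following sketch shows what a single notebook cell might contain: a small data set, a computation, and a simple text-based view of the result. The rainfall values and the `text_bar_chart` helper are hypothetical and are not part of the tutorial data.

```python
from statistics import mean

# Hypothetical monthly rainfall totals in millimetres (illustrative values only).
rainfall_mm = {"Jan": 81.0, "Feb": 64.5, "Mar": 72.0, "Apr": 40.5}


def text_bar_chart(values, scale=10.0):
    """Return one line per item, with one '#' for each full `scale` units."""
    return [
        f"{label:<4}{'#' * int(value // scale)} {value}"
        for label, value in values.items()
    ]


# The computation: average rainfall across the months.
average = mean(rainfall_mm.values())
print(f"Average rainfall: {average:.1f} mm")

# The visualization: a plain-text bar chart printed below the cell.
for line in text_bar_chart(rainfall_mm):
    print(line)
```

In a notebook, the printed output appears directly under the cell, so you can adjust the code and rerun it to see the effect immediately.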

Read more about notebooks

Watch a video about notebooks

Watch this video to learn the basics of Jupyter notebooks.

Video disclaimer: Some minor steps and graphical elements in this video may differ from your Cloud Pak for Data deployment. This video shows the Cloud Pak for Data as a Service user interface.

This video provides a visual method as an alternative to following the written steps in this documentation.

Try a tutorial to create a notebook

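In this tutorial, you will complete these tasks:

  • Task 1: Create a project and add an asset
  • Task 2: Add a notebook to your project
  • Task 3: Load a file and save the notebook
  • Task 4: Find and edit the notebook
  • Task 5: Share read-only version of the notebook
  • Task 6: Schedule a notebook to run at a different time

This tutorial will take approximately 15 minutes to complete.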

Task 1: Create a project and add an asset

You need a project to store the notebook and data asset. Follow these steps to create a project and add a data asset to the project.

  1. From the Cloud Pak for Data navigation menu, choose Projects > All projects.

  2. If you have an existing project, open it.

  3. If you don't have an existing project, then click New project.

  4. Select Create an empty project.

  5. Enter a name and optional description for the project.

  6. Click Create.

  7. Download the precipitation.csv file.

  8. Add the precipitation.csv file to your project:

    1. From your project, click the Upload asset to project icon.

    2. In the side panel that opens, browse to select the precipitation.csv file, and click Open. Stay on the page until the load completes.
      The precipitation.csv file is added to your project as a data asset.

For more information or to watch a video, see Creating a project.

Check your progress

The following image shows the Assets tab in the project.

Task 2: Add a notebook to your project

Follow these steps to create a new notebook in your project.

  1. In your project, click the Assets tab.

  2. Click New asset > Jupyter notebook editor.

  3. Type a name and description (optional).

  4. Select a runtime environment for this notebook.

  5. Click Create. Wait for the notebook editor to load.

Check your progress

The following image shows the blank notebook.

Task 3: Load a file and save the notebook

Now you can access, from your notebook, the data asset that you uploaded to your project earlier. Follow these steps to load the data into a pandas DataFrame:

  1. Click the Find and add data icon.

  2. From the Files tab, click the Insert to code dropdown next to the data set you added, and insert the pandas DataFrame.

  3. Click Run to run your code. The first few rows of your data set are displayed.

  4. To save a version of your notebook, click File > Save Version. You can also just save your notebook with File > Save.
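The code that the Insert to code option generates depends on your deployment and is not reproduced here. As a minimal sketch of the underlying idea, the following example reads CSV data into a pandas DataFrame with `pandas.read_csv` and shows the first rows. The in-memory sample and its column names (`station`, `month`, `precipitation_mm`) are hypothetical stand-ins; the real precipitation.csv file might be structured differently.

```python
import io

import pandas as pd

# Hypothetical stand-in for precipitation.csv; the real file's columns may differ.
SAMPLE_CSV = """station,month,precipitation_mm
A,Jan,81.0
A,Feb,64.5
B,Jan,102.3
B,Feb,
"""


def load_precipitation(source):
    """Read CSV data from a file path or file-like object into a DataFrame."""
    return pd.read_csv(source)


# Load the sample; with a real file, you would pass its path instead.
df = load_precipitation(io.StringIO(SAMPLE_CSV))

# Show the first rows, then count the missing values in each column.
print(df.head())
print(df.isna().sum())
```

Checking for missing values right after loading is a common first step, because empty fields in a CSV file are read as NaN and can affect later calculations.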

Check your progress

The following image shows the notebook with the pandas DataFrame.

Task 4: Find and edit the notebook 

Follow these steps to locate the saved notebook on the Assets tab, and edit the notebook:

  1. In the project navigation trail, click your project name to return to your project.

  2. Click the Assets tab to find the notebook.

  3. Click the notebook. It opens in READ ONLY mode.

  4. To edit the notebook, click the pencil icon.

  5. Click the Information icon to open the Information panel.

  6. On the General tab, edit the name and description of the notebook.

  7. Click the Environment tab to see how you can change the environment that is used to run the notebook, or change the runtime status by stopping or restarting the runtime.
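After you change the environment, it can be useful to confirm from inside the notebook which Python and package versions the runtime provides. The following sketch uses only the Python standard library. The `runtime_summary` helper and its default package list are hypothetical names chosen for illustration.

```python
import platform
from importlib import metadata


def runtime_summary(packages=("pandas", "numpy")):
    """Return the Python version and the installed versions of selected packages."""
    summary = {"python": platform.python_version()}
    for name in packages:
        try:
            summary[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # The package is not available in this runtime.
            summary[name] = "not installed"
    return summary


for name, version in runtime_summary().items():
    print(f"{name}: {version}")
```

Running this cell before and after an environment change gives a quick way to compare the two runtimes.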

Check your progress

The following image shows the notebook with the Information panel displayed.

Task 5: Share read-only version of the notebook

Follow these steps to create a link to the notebook to share with colleagues:

  1. To share a read-only view of the notebook, click the Share icon.

  2. Turn on the Share with anyone who has the link toggle button.

  3. Select the content that you want to share through a link or social media.

  4. Click Close.

Check your progress

The following image shows the Share dialog box.

Task 6: Schedule a notebook to run at a different time

Follow these steps to create a job to schedule the notebook to run at a specific time or repeat based on a schedule:

  1. Click the Jobs icon, and select Create a job.

  2. Provide the name and description of the job, and click Next.

  3. Select the notebook version and environment runtime, and click Next.

  4. (Optional) Click the toggle button to schedule a run. Specify the date and time, and whether you want the job to repeat, and click Next.

  5. (Optional) Click the toggle button to receive notifications for this job, and click Next.

  6. Review the details, and click either Create (to create the job, but not run the job immediately) or Create and run (to run the job immediately).

  7. The job is displayed on the Jobs tab in the project.

Check your progress

The following image shows the Jobs tab.

Next steps

Now you can use this data set for further analysis. For example, you or other users can do any of these tasks:

Additional resources