Notebook environments (Watson Studio)

When you run a notebook in the notebook editor in a project, you choose an environment template, which defines the compute resources for the runtime environment. The environment template specifies the hardware configuration, including its type and size, plus the software configuration. For notebooks, environment definitions include a supported language of Python or R.

For more information, see:

  • Included environment templates
  • Notebooks and CPU environments
  • Other environment options for notebooks
  • File system in Jupyter notebook environments
  • Runtime scope

Included environment templates

The following Python environments are included with Watson Studio. The included environments are listed under Templates on the Environments page on the Manage tab of your project.

Note:

Runtime 24.1 on Python 3.11 is supported on the Z (s390x) platform.

Table 1. Default environment templates for Python

  Name                          Hardware configuration
  Runtime 24.1 on Python 3.11   1 vCPU and 2 GB RAM

If you have the Runtime 24.1 with R 4.3 service installed, the following default R environments are listed:

Note:

R-based runtimes for the notebook editor do not work on these platforms:

  • IBM Power® (ppc64le) (unless Analytics Engine is installed)
  • IBM Z (s390x)

Table 2. Default environment templates for R

  Name                      Hardware configuration
  Runtime 24.1 on R 4.3     1 vCPU and 2 GB RAM

Important:

None of the R-based notebook environments are FIPS-compliant.

For more information on FIPS, refer to Services that support FIPS.

Notebooks and CPU environments

When you open a notebook in edit mode in a CPU runtime environment, exactly one interactive session connects to a Jupyter kernel for the notebook language and the environment runtime that you select. The runtime is started per user and not per notebook. This means that if you open a second notebook with the same environment template in the same project, a second kernel is started in the same runtime. Runtime resources are shared. For more information, see Runtime scope.
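
As an illustration, the following sketch is something you could run in two notebooks that use the same environment template in the same project. It assumes a Linux-based runtime container where resource limits are exposed through cgroup files; the exact paths can vary. Each kernel reports a different process ID, but the same memory limit, because both kernels run in the one shared runtime.

    import os

    def runtime_memory_limit():
        """Return the runtime's memory limit from cgroup files (v2 first, then v1)."""
        for path in ("/sys/fs/cgroup/memory.max",
                     "/sys/fs/cgroup/memory/memory.limit_in_bytes"):
            try:
                with open(path) as f:
                    return f.read().strip()
            except OSError:
                continue
        return "unknown"

    # Each notebook kernel is a separate process in the shared runtime ...
    print("Kernel process ID:", os.getpid())
    # ... but the memory limit belongs to the runtime and is shared by all of its kernels.
    print("Runtime memory limit:", runtime_memory_limit())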

If necessary, you can restart or reconnect to the kernel. When you restart a kernel, the kernel is stopped and then started again in the same session, but all execution results are lost. When you reconnect to a kernel after losing a connection, the notebook is connected to the same kernel session, and all previously saved execution results are still available.

Other environment options for notebooks

You can create notebook environment templates and customize the software configuration. See Creating environment templates.

If you are coding Python notebooks or scripts in the JupyterLab IDE, you can use a JupyterLab environment. See JupyterLab environment templates.

If you have the Execution Engine for Apache Hadoop installed, you can create Hadoop environment templates to run notebooks on your Hadoop cluster. See Hadoop environments.

If you have the Analytics Engine powered by Apache Spark service installed, you can choose from default Spark environment templates with multiple hardware configurations for Python and R. See Spark environments.

If you have the Jupyter Notebooks with Python with GPU service installed, you can create an environment template to run notebooks on GPU clusters. See GPU environments.

File system in Jupyter notebook environments

Runtimes are started per user, but standard data storage that is mounted in the runtimes is shared with other project members. This means that other project members can access the files that you upload to your notebook environment. If your team is working on a large data set, this can help you avoid duplicating effort and using up extra resources.

You must be mindful of the size of the data files that you upload to your notebook environment. Very large files might require more storage (disk space) than is available on the node on which the runtime is started.

Do not confuse storage space and the memory size of your environment. Selecting a larger environment gives you more memory and CPU, but not more storage space.
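
For example, a minimal sketch that checks the remaining storage space from inside a notebook, using only the Python standard library. The path "/" is used for illustration; the runtime's working directory might be mounted elsewhere. Selecting a larger environment increases memory and CPU, but these numbers stay the same.

    import shutil

    # Free, used, and total space of the file system that backs the runtime.
    total, used, free = shutil.disk_usage("/")
    print(f"Storage: {free / 1e9:.1f} GB free of {total / 1e9:.1f} GB total")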

If your data files are large, consider switching to using Spark or Hadoop to process these files. With Spark or Hadoop, the processing workload is spread across multiple nodes.
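
As an illustration, a minimal PySpark sketch that reads a large CSV file so that the rows are partitioned and processed across the worker nodes of the Spark cluster. It assumes a Spark environment where a SparkSession is available or can be created; the file path and column name are hypothetical.

    from pyspark.sql import SparkSession

    # In a Spark environment a SparkSession usually already exists; getOrCreate() reuses it.
    spark = SparkSession.builder.getOrCreate()

    # Hypothetical path to a large CSV file on shared storage.
    df = spark.read.csv("/project_data/data_asset/large_file.csv",
                        header=True, inferSchema=True)

    # The aggregation runs in parallel across the partitions on the Spark worker nodes.
    df.groupBy("category").count().show()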

The default file storage limit is 100Gi per Cloud Pak for Data instance. Depending on the storage class that your cluster is using, it might be possible to increase this limit. If you need more space for your data, contact your cluster administrator.

The file system of each runtime is non-persistent. If you stop a runtime, the files that you uploaded are deleted. To make sure that your data is not deleted, use external, persistent storage. Persistent file systems that you reference in your notebook are not destroyed when the environment is stopped.
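
For example, a minimal sketch that copies a result file from the non-persistent runtime file system to persistent storage before the runtime is stopped. The mount point /mnts/my-persistent-volume and the file names are hypothetical; use whatever external or persistent storage is set up for your project.

    import shutil
    from pathlib import Path

    # A file written to the runtime's local file system; it is deleted when the runtime stops.
    local_result = Path("results.csv")
    local_result.write_text("id,score\n1,0.9\n")

    # Hypothetical persistent mount point; files copied here survive a runtime stop.
    persistent_dir = Path("/mnts/my-persistent-volume")
    if persistent_dir.exists():
        shutil.copy(local_result, persistent_dir / local_result.name)
        print("Copied to:", persistent_dir / local_result.name)
    else:
        print("Persistent storage not found; check with your administrator which path to use.")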

Runtime scope

Environment runtimes are always scoped to an environment template and a user within a project.

For example, if you associate each of your notebooks with its own environment, each notebook gets its own runtime. However, if you open a notebook in an environment that you also selected for another notebook, and that notebook has an active runtime, both notebook kernels are active in the same runtime. In this case, both notebooks use the compute and data resources available in the runtime that they share.

If you want to avoid sharing runtimes but want to use the same runtime configuration for multiple notebooks in a project, you can create custom environment templates with the same specifications. See Creating environment templates.

Parent topic: Environments