Environments (Watson Studio and Watson Knowledge Catalog)
When you run operational assets in projects, the runtime environment details are specified by environment definitions.
Environment definitions specify the hardware and software configurations for the environment runtimes:
- Hardware resources include the amount of processing power and available RAM.
- Software resources include the Python, R, or Scala programming languages, a set of pre-installed libraries, and optional libraries or packages that you can specify.
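From inside a running environment, you can confirm the software and hardware resources the runtime actually provides. The following is a minimal, generic Python sketch (it uses only the standard library and is not a Watson Studio API); the package names checked at the end are illustrative examples:

```python
# Inspect the resources visible to a running Python environment runtime.
import os
import platform

# Software: the Python interpreter version supplied by the environment.
print("Python:", platform.python_version())

# Hardware: logical CPUs visible to the process. In a containerized
# runtime, the CPU/RAM limits set by the environment definition may be
# lower than what the host machine reports.
print("Logical CPUs:", os.cpu_count())

# Pre-installed libraries: probe a few example packages for versions.
for name in ("numpy", "pandas"):  # illustrative package names
    try:
        module = __import__(name)
        print(name, getattr(module, "__version__", "unknown"))
    except ImportError:
        print(name, "not installed")
```

Running this in a notebook is a quick way to verify that the environment definition you selected is the one your code is actually using.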
Environment definitions can be defined by:
- The default environment definitions that are included with Watson Studio.
- Custom environment definitions that you create.
You need to specify an environment definition:
- In projects, to:
  - Run operational assets in tools like the notebook editor, Data Refinery, model builder, or the flow editor.
  - Create jobs to run Data Refinery flows, SPSS Modeler flows, notebooks, or Python and R scripts.
  - Launch IDEs like RStudio or JupyterLab in Watson Studio in which to run operational assets like notebooks, scripts, or Shiny apps.
- In deployment spaces, to:
  - Create or run Data Refinery flow jobs.
All default and custom environment definitions are listed in the environment definitions list on the Environments page. Clicking an environment definition displays its details. An environment runtime is an instantiation of an environment definition. When a runtime becomes active, it is listed in the active environment runtimes list on the Environments page.
Note that GPU and Execution Engine for Apache Hadoop environments are not available by default:
- For Python with GPU environments, the Jupyter Notebooks with Python 3.7 for GPU service must be installed.
- For Execution Engine for Apache Hadoop environments, the Execution Engine for Apache Hadoop service must be installed on the IBM Cloud Pak for Data platform.
After the services are installed, you must create your own environment definitions to use these environments.
The following table lists the default environment definitions or compute power by operational asset type.
* Indicates that Python 3.6 is deprecated. Start running Python notebooks in environments with Python 3.7.
+ Indicates that Spark 2.4 is deprecated.
| Operational asset | Programming language | Tool | Environment definition type | Available environment definitions / compute power |
|---|---|---|---|---|
| Jupyter notebook | Python | notebook editor | Anaconda Python distribution | Default Python 3.7<br>Default Python 3.6 * |
| Jupyter notebook | Python | notebook editor | Spark | Default Spark 3.0 & Python 3.7<br>Default Spark 2.4 & Python 3.7 +<br>Default Spark 2.4 & Python 3.6 * + |
| Jupyter notebook | Python | notebook editor | Spark | Hadoop cluster |
| Jupyter notebook | R | notebook editor | Anaconda R distribution | Default R 3.6 |
| Jupyter notebook | R | notebook editor | Spark | Default Spark 3.0 & R 3.6<br>Default Spark 2.4 & R 3.6 + |
| Jupyter notebook | Scala | notebook editor | Spark | Default Spark 3.0 & Scala 2.12<br>Default Spark 2.4 & Scala 2.11 + |
| Jupyter notebook | Python | JupyterLab | Anaconda Python distribution | Default JupyterLab with Python 3.7<br>Default JupyterLab with Python 3.6 * |
| Script | R | RStudio | Anaconda R distribution | Default RStudio |
| Shiny app | R | RStudio | Anaconda R distribution | Default RStudio |
| Data Refinery flow | R | Data Refinery | Spark | Default Spark 2.4 & R 3.6 + |
| Data Refinery flow | R | Data Refinery | Spark | Hadoop cluster |
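The footnotes above mark Python 3.6 environments as deprecated. A notebook can guard against running on a deprecated interpreter with a small version check; this is a generic Python sketch (the function name and warning text are illustrative, not a Watson Studio feature):

```python
# Warn when a notebook runs on the deprecated Python 3.6 runtime.
import sys
import warnings

DEPRECATED_PYTHON = (3, 6)  # deprecated minor version from the table above

def check_python_version(version_info=sys.version_info):
    """Return False and warn if the runtime uses the deprecated version."""
    if tuple(version_info[:2]) == DEPRECATED_PYTHON:
        warnings.warn(
            "Python 3.6 environments are deprecated; "
            "switch to a Python 3.7 environment definition."
        )
        return False
    return True

check_python_version()
```

Placing a check like this in a notebook's first cell makes the deprecation visible before any long-running job starts.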
Learn more
- Environment definitions for the notebook editor
- Environment definitions for JupyterLab
- Spark environment definitions
- GPU environment definitions
- Environment definitions for RStudio
- Environment definitions for Data Refinery
- Refinery data on the Hadoop cluster
- Creating environment definitions
- Customizing environment definitions
- Stopping active runtimes when no longer needed