Examples of environment template customizations
The following examples show how to add custom libraries through conda or pip when you create an environment template. The examples use the provided templates for Python and R.
- You can use mamba in place of conda in the following examples. Remember to select the checkbox to install from mamba if you add channels or packages from mamba to the existing environment template.
- The process of creating custom Watson Machine Learning deployment runtimes might be different. If you want to create custom Watson Machine Learning deployment runtimes, see Customizing Watson Machine Learning deployment runtimes.
Examples exist for:
- Adding conda packages
- Adding pip packages
- Combining conda and pip packages
- Customizing dependencies that are installed from pip in an air-gapped system
- Adding complex packages with internal dependencies
- Adding conda packages for R notebooks
- Setting environment variables
Hints and tips:
- Best practices
Adding conda packages
To get the latest version of pandas-profiling:
dependencies:
  - pandas-profiling
This is equivalent to running conda install pandas-profiling in a notebook.
Adding pip packages
You can also customize an environment using pip if a particular package is not available in conda channels:
dependencies:
  - pip:
    - ibm_watsonx_ai
This is equivalent to running pip install ibm_watsonx_ai in a notebook.
This customization does more than install the specified pip package. By default, conda also looks for a new version of pip itself and installs it. Checking all the implicit dependencies in conda often takes several minutes and gigabytes of memory. The following customization shortcuts the installation of pip:
channels:
  - empty
  - nodefaults
dependencies:
  - pip:
    - ibm_watsonx_ai
The conda channel empty does not provide any packages. In particular, it provides no pip package, so conda doesn't try to install pip and uses the preinstalled version instead.
Note that the keyword nodefaults in the list of channels needs at least one other channel in the list. Otherwise conda will silently ignore the keyword and use the default channels.
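The rule about nodefaults can be checked before you save a customization. The following is a minimal sketch that models only the rule stated above; the function name nodefaults_takes_effect is hypothetical and is not part of conda.

```python
def nodefaults_takes_effect(channels):
    """Return True if 'nodefaults' is listed together with at least one other channel.

    This models the rule described in the text; it does not call conda.
    """
    other_channels = [c for c in channels if c != "nodefaults"]
    return "nodefaults" in channels and len(other_channels) > 0


# 'empty' is the other channel, so the default channels are skipped.
print(nodefaults_takes_effect(["empty", "nodefaults"]))
# 'nodefaults' alone would be ignored and the default channels used.
print(nodefaults_takes_effect(["nodefaults"]))
```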
Combining conda and pip packages
You can list multiple packages with one package per line. A single customization can have both conda packages and pip packages.
dependencies:
  - pandas-profiling
  - scikit-learn=0.20
  - pip:
    - ibm_watsonx_ai
    - sklearn-pandas==1.8.0
Note that the required template notation is sensitive to leading spaces. Each item in the list of conda packages must have two leading spaces. Each item in the list of pip packages must have four leading spaces. The version of a conda package must be specified using a single equals symbol (=), while the version of a pip package must be specified using two equals symbols (==).
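Because the notation is easy to get wrong by hand, it can help to generate the text. The following sketch, with the hypothetical helper render_customization, applies the spacing and version rules described above to the package names from the example. It only builds a string and does not validate that the packages exist.

```python
def render_customization(conda_packages, pip_packages):
    """Build the dependencies section as text.

    Both arguments map a package name to a version string,
    or to None when no version is pinned.
    """
    lines = ["dependencies:"]
    for name, version in conda_packages.items():
        # conda entries: two leading spaces, single '=' before the version
        lines.append(f"  - {name}={version}" if version else f"  - {name}")
    if pip_packages:
        lines.append("  - pip:")
        for name, version in pip_packages.items():
            # pip entries: four leading spaces, '==' before the version
            lines.append(f"    - {name}=={version}" if version else f"    - {name}")
    return "\n".join(lines)


text = render_customization(
    {"pandas-profiling": None, "scikit-learn": "0.20"},
    {"ibm_watsonx_ai": None, "sklearn-pandas": "1.8.0"},
)
print(text)
```

The printed text can be pasted into the customization field of the environment template.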
Customizing dependencies that are installed from pip in an air-gapped system
If you want to customize an environment in an air-gapped system that has no access to a repository server either locally or on the internet, you can store the pip package in the project and specify the dependency by using the prefix file:/.
The custom channels: configuration can point to an empty local channel to avoid conda trying to fetch pip from an external repository.
Example customization:
channels:
  - file:///project_data/data_asset/empty_conda_channel
  - nodefaults
dependencies:
  - pip:
    - file:///project_data/data_asset/your-package-0.1.zip
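If you are unsure how to write the file: reference for a stored package, the Python standard library can derive it from an absolute path. This is a small sketch that uses the sample path from the example above; substitute the path of your own data asset.

```python
from pathlib import PurePosixPath

# Sample path from the example above; replace it with your own package file.
package_path = PurePosixPath("/project_data/data_asset/your-package-0.1.zip")

# as_uri() requires an absolute path and returns the file: form of it.
package_uri = package_path.as_uri()
print(package_uri)  # file:///project_data/data_asset/your-package-0.1.zip
```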
If needed, you can set up an empty conda channel by running the following commands in a Python notebook cell:
channel_dir="/project_data/data_asset/empty_conda_channel"
!mkdir -p $channel_dir/noarch
with open(channel_dir+"/noarch/repodata.json","w") as f:
    f.write('{ "channeldata_version": 1, "packages": {}, "subdirs": ["noarch"] }')
!bzip2 -k $channel_dir/noarch/repodata.json
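The same channel layout can also be produced without notebook shell commands, which might be convenient in a plain Python script. The following is a sketch under that assumption: the helper create_empty_conda_channel is hypothetical, and the demonstration writes to a temporary directory rather than to /project_data so that it can run anywhere.

```python
import bz2
import json
import tempfile
from pathlib import Path


def create_empty_conda_channel(channel_dir):
    """Write noarch/repodata.json and noarch/repodata.json.bz2 under channel_dir."""
    noarch = Path(channel_dir) / "noarch"
    noarch.mkdir(parents=True, exist_ok=True)
    repodata = json.dumps(
        {"channeldata_version": 1, "packages": {}, "subdirs": ["noarch"]}
    ).encode("utf-8")
    (noarch / "repodata.json").write_bytes(repodata)
    # Same role as `bzip2 -k`: keep the .json file and add a compressed copy.
    (noarch / "repodata.json.bz2").write_bytes(bz2.compress(repodata))
    return noarch


with tempfile.TemporaryDirectory() as tmp:
    noarch_dir = create_empty_conda_channel(Path(tmp) / "empty_conda_channel")
    created_files = sorted(p.name for p in noarch_dir.iterdir())
    print(created_files)  # ['repodata.json', 'repodata.json.bz2']
```

In a project, you would pass the channel directory from the customization, for example /project_data/data_asset/empty_conda_channel.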
If your platform administrator uploaded the packages to a directory in a shared volume:
- Test that you can access this package (for example, the conda seawater package) from a notebook cell:
  !conda search -c file:///cc-home/_global_/config/conda/custom-channel --override-channels
  !conda install seawater -c file:///cc-home/_global_/config/conda/custom-channel/custom_channel
- Create an environment template in your project and add a customization to access the package. Note that you need to use nodefaults and not defaults for conda and mamba channels:
  # Add conda channels below defaults, indented by two spaces and a hyphen.
  channels:
    - nodefaults
    - file:///cc-home/_global_/config/conda/custom-channel/custom_channel
  # Add conda packages here, indented by two spaces and a hyphen.
  dependencies:
    - seawater
  # Add pip packages here, indented by four spaces and a hyphen.
  # Remove the comments on the following lines and replace sample package name with your package name.
  #  - pip:
  #    - a_pip_package==1.0
If you don't want the file channel to be accessible by any user, you can point to a location in a storage volume that can be accessed by certain users only.
Adding complex packages with internal dependencies
When you add many packages or a complex package with many internal dependencies, the conda installation might take a long time or might even stop without returning any error messages. To avoid this:
- Specify the versions of the packages that you want to add. This reduces the search space for conda to resolve dependencies.
- Increase the memory size of the environment.
- Use a specific channel instead of the default conda channels that are defined in the .condarc file. This avoids running lengthy searches through large channels. See Customizing with conda and mamba.
Example of a customization that doesn't use the default conda channels:
# get latest version of the prophet package from the conda-forge channel
channels:
  - conda-forge
  - nodefaults
dependencies:
  - prophet
This customization corresponds to the following command in a notebook:
!conda install -c conda-forge --override-channels prophet -y
Adding conda packages for R notebooks
The following example shows you how to create a customization that adds conda packages to use in an R notebook:
channels:
  - defaults
dependencies:
  - r-plotly
This customization corresponds to the following command in a notebook:
print(system("conda install r-plotly", intern=TRUE))
The names of R packages in conda generally start with the prefix r-. If you use just plotly in your customization, the installation succeeds, but the Python package is installed instead of the R package. If you then try to use the package in your R code, as in library(plotly), an error is returned.
Setting environment variables
You can set environment variables in your environment by adding a variables section to the software customization template as shown in the following example:
variables:
  my_var: my_value
  HTTP_PROXY: https://myproxy:3128
  HTTPS_PROXY: https://myproxy:3128
  NO_PROXY: cluster.local
The example also shows that you can use the variables section to set a proxy server for an environment.
When installing packages, conda does not use the HTTP_PROXY and HTTPS_PROXY variables that are configured within the environment. If you want to configure conda to use a proxy server, ask your platform administrator to configure it for you.
Limitation: You cannot override existing environment variables, for example LD_LIBRARY_PATH, by using this approach. If you want to override existing variables, you can ask your platform administrator to customize the runtime definition and upload it for you.
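To confirm that the variables section took effect, you can read the variables from a notebook that runs in the customized environment. The following sketch uses the variable names from the example above; the helper report_variables is hypothetical. A value of None means that the variable is not set in the current runtime.

```python
import os


def report_variables(names, environ=os.environ):
    """Map each name to its value in environ, or to None if it is not set."""
    return {name: environ.get(name) for name in names}


# Variable names taken from the sample variables section.
for name, value in report_variables(
    ["my_var", "HTTP_PROXY", "HTTPS_PROXY", "NO_PROXY"]
).items():
    print(f"{name} = {value!r}")
```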
Best practices
To avoid problems with missing packages and conflicting dependencies, start by manually installing the packages that you need through a notebook in a test environment. This way you can interactively check if packages can be installed without errors. After you verify that the packages are correctly installed, create a customization for your development or production environment and add the packages to the customization template.
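After a manual test installation succeeds, one way to carry the result over is to pin the versions that were actually installed. The following sketch assumes that the packages were installed with pip in the test notebook and that the distribution names passed in match the installed names. The helper pinned_pip_entries is hypothetical; it uses importlib.metadata from the standard library to look up versions.

```python
from importlib import metadata


def pinned_pip_entries(distribution_names):
    """Return customization lines that pin each installed distribution to its current version."""
    lines = ["dependencies:", "  - pip:"]
    for name in distribution_names:
        try:
            version = metadata.version(name)
        except metadata.PackageNotFoundError:
            # Not installed here; leave a visible marker instead of guessing a version.
            lines.append(f"    # {name}: not installed, nothing to pin")
            continue
        lines.append(f"    - {name}=={version}")
    return "\n".join(lines)


# Replace the list with the packages that you verified in the test environment.
print(pinned_pip_entries(["pip"]))
```

Review the generated lines before you add them to the customization template, because pinned versions that work in the test environment are not guaranteed to resolve in another environment.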