Customizing conda settings

Conda channels are the locations (URLs) where libraries and packages are stored and from which they are downloaded when a model is deployed.

The following formats exist for defining conda channels:

  • defaults: Specifies the default set of public channels, which are searched automatically.
  • https://some.custom/channel: Specifies the full URL of a repository channel, for example https://yoururl.com:port/conda/channel. Use this format to access private repositories.
  • file:///some/local/directory: Specifies a path on a mounted network drive, prefixed with file:///. Use this format to access private repositories on file systems.
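
As a rough illustration of the three formats listed above, the following Python sketch classifies a channel string. The function name classify_channel and the port 8443 are hypothetical and are not part of conda or Cloud Pak for Data. The sketch covers only the three formats described here, so it rejects other forms, such as bare channel names.

```python
from urllib.parse import urlparse


def classify_channel(spec: str) -> str:
    """Return which of the three documented channel formats a string uses.

    Hypothetical helper for illustration only.
    """
    spec = spec.strip()
    if spec == "defaults":
        return "defaults"
    scheme = urlparse(spec).scheme.lower()
    if scheme in ("http", "https"):
        return "url"
    if scheme == "file":
        return "file"
    raise ValueError(f"unrecognized channel format: {spec!r}")


# Example values; the host and port are placeholders.
for channel in (
    "defaults",
    "https://yoururl.com:8443/conda/channel",
    "file:///some/local/directory",
):
    print(channel, "->", classify_channel(channel))
```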

You can change the conda channel configuration to access private repositories by modifying the conda runtime configuration file, .condarc. In this file, you can configure:

  • If and how to use a proxy server to access repositories
  • Channels in which conda searches for packages

If you want to define a separate and secure network with fine-grained access control to library repositories, you can configure conda to use a binary repository manager, for example JFrog Artifactory, for library storage and access.

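You can add conda packages by:

  • Configuring conda to use a proxy server
  • Configuring conda to use a local repository
  • Configuring conda to use a file channel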

Required roles:

  • You must be a Cloud Pak for Data cluster administrator to modify the conda configuration file .condarc.
  • You need Admin or Editor permissions on the deployment space to create a custom software specification and package extension.

Before you begin

Make sure that the cc-home storage volume exists and that the MY_TOKEN environment variable is set. For instructions, see Setting up a storage volume to store customizations for common core services.
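
The curl commands in the following sections address files in the cc-home volume by percent-encoding the file path into the request URL and by passing the token in a ZenApiKey authorization header. The Python sketch below shows how those two pieces can be assembled. The helper names volume_file_url and auth_headers, and the fallback host cpd.example.com, are assumptions made for illustration.

```python
import os
from urllib.parse import quote


def volume_file_url(base_url: str, path: str, volume: str = "cc-home") -> str:
    """Build the volume file URL, percent-encoding every '/' in the path."""
    encoded = quote(path, safe="")  # safe="" also encodes '/' as %2F
    return f"{base_url.rstrip('/')}/zen-volumes/{volume}/v1/volumes/files/{encoded}"


def auth_headers(token: str) -> dict:
    """Return the authorization header used by the curl commands."""
    return {"Authorization": f"ZenApiKey {token}"}


# CPD_URL and MY_TOKEN are read from the environment; the fallback host is a placeholder.
base_url = os.environ.get("CPD_URL", "https://cpd.example.com")
token = os.environ.get("MY_TOKEN")
if not token:
    print("MY_TOKEN is not set; export it before calling the volume API.")
print(volume_file_url(base_url, "/_global_/config/conda/.condarc"))
```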

Configuring conda to use a proxy server

You can configure conda to use a proxy server as an intermediary to the public conda repositories. You can use a company proxy, or create a remote repository in a binary repository manager, which acts as a proxy to public conda resources. You define the proxy in the conda configuration file .condarc.

Follow these steps to configure conda for use behind a proxy server:

  1. Retrieve the existing .condarc file by running:

    curl -fSsk ${CPD_URL}/zen-volumes/cc-home/v1/volumes/files/%2F_global_%2Fconfig%2Fconda%2F.condarc -H "Authorization: ZenApiKey ${MY_TOKEN}" > .condarc
    
  2. Add the proxy_servers key to the .condarc conda configuration file. Enter the URL of a company proxy server, or of a remote proxy repository that you created in a binary repository manager. You can use either the http or the https protocol. The format is as follows:

    proxy_servers:
        http: http://username:password@corp.com:8080
        https: https://username:password@corp.com:8080
    

    For example:

    proxy_servers:
        https: https://u:a@127.0.0.1:8080
        http: http://u:a@127.0.0.1:8080
    
  3. Copy the modified .condarc file to /cc-home/_global_/config/conda/:

    curl -k -X PUT \
    "${CPD_URL}/zen-volumes/cc-home/v1/volumes/files/%2F_global_%2Fconfig%2Fconda" \
    -H "Authorization: ZenApiKey ${MY_TOKEN}" \
    -H "content-type: multipart/form-data" \
    -F upFile=@.condarc
    

    The proxy server is available for running notebooks and jobs.

  4. If you need to create a custom software specification and package extension by using the Watson Machine Learning Python client or REST APIs, create a package extension with a conda.yaml specification as shown in the example and associate the package extension with the custom software specification.

    Note:

    This step is not required when you have a custom environment that is created in Watson Studio and you have a model, Python function, or Python script associated with the same custom environment. In this case, the package extensions and software specification are created together with the custom environment. The name of the custom software specification is the same as the name of the custom environment.

    Note:

    To run this script, you must generate and export the token as the ${MY_TOKEN} environment variable. For more information, see Generating an API authorization token.
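
A proxy user name or password that contains reserved characters such as @ or : generally has to be percent-encoded before it is placed in a proxy URL. The following Python sketch renders a proxy_servers block in the format shown in step 2 with the credentials encoded. The helper names proxy_url and proxy_servers_block and the sample credentials are hypothetical. Treat this as a starting point, not as the required way to edit .condarc.

```python
from urllib.parse import quote


def proxy_url(scheme: str, user: str, password: str, host: str, port: int) -> str:
    """Build a proxy URL with percent-encoded credentials."""
    userinfo = f"{quote(user, safe='')}:{quote(password, safe='')}"
    return f"{scheme}://{userinfo}@{host}:{port}"


def proxy_servers_block(user: str, password: str, host: str, port: int) -> str:
    """Render the proxy_servers section as text for the .condarc file."""
    lines = ["proxy_servers:"]
    for scheme in ("http", "https"):
        lines.append(f"    {scheme}: {proxy_url(scheme, user, password, host, port)}")
    return "\n".join(lines) + "\n"


# Sample values only; replace them with your own proxy details.
print(proxy_servers_block("u", "p@ss:word", "127.0.0.1", 8080))
```

You can paste the printed block into the .condarc file that you retrieved in step 1 before you upload the file in step 3.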

Configuring conda to use a local repository

You can configure conda to use a local onsite repository server with fine-grained access control. With a local repository, you can also control which package versions are used, which helps to avoid library dependency conflicts when runtimes are started. You can create local repositories on company servers or in your binary repository manager and add the conda libraries and packages that you select to them.

To configure conda to use a local onsite repository:

  1. Retrieve the existing .condarc file by running:

    curl -fSsk ${CPD_URL}/zen-volumes/cc-home/v1/volumes/files/%2F_global_%2Fconfig%2Fconda%2F.condarc -H "Authorization: ZenApiKey ${MY_TOKEN}" > .condarc
    
  2. In the .condarc conda configuration file (stored at /cc-home/_global_/config/conda/.condarc), add the default channels that map to the local repository URL. Replace <your_local_repository_name> with the host name of the local repository that you connect to through a proxy server.

    channel_alias: http://<your_local_repository_name>:8080/conda/
    
    channels:
        - http://<your_local_repository_name>:8080/conda/anaconda
        - http://<your_local_repository_name>:8080/conda/wakari
        - http://<your_local_repository_name>:8080/conda/r-channel
    
  3. Upload the modified .condarc file by running:

    curl -k -X PUT \
    "${CPD_URL}/zen-volumes/cc-home/v1/volumes/files/%2F_global_%2Fconfig%2Fconda" \
    -H "Authorization: ZenApiKey ${MY_TOKEN}" \
    -H "content-type: multipart/form-data" \
    -F upFile=@.condarc
    
  4. If you need to create custom software specifications and package extensions by using the Watson Machine Learning Python client or REST APIs, create a package extension with a conda.yaml specification as shown in the example and associate the package extension with the custom software specification.

    Note:

    This step is not required when you have a custom environment that is created in Watson Studio and you have a model, Python function, or Python script associated with the same custom environment. In this case, the package extensions and software specification are created together with the custom environment. The name of the custom software specification is the same as the name of the custom environment.
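
To make the conda.yaml specification in step 4 more concrete, the following Python sketch renders a minimal conda environment file with a name, an optional list of channels, and a list of dependencies. The helper render_conda_yaml, the environment name custom-packages, and the host repo.example.com are hypothetical. Check which fields your package extension actually needs.

```python
def render_conda_yaml(name, dependencies, channels=None):
    """Render a minimal conda environment specification as YAML text."""
    lines = [f"name: {name}"]
    if channels:
        lines.append("channels:")
        lines.extend(f"  - {channel}" for channel in channels)
    lines.append("dependencies:")
    lines.extend(f"  - {dependency}" for dependency in dependencies)
    return "\n".join(lines) + "\n"


# Placeholder values: adjust the name, channel, and packages to your setup.
spec = render_conda_yaml(
    "custom-packages",
    ["seawater"],
    channels=["http://repo.example.com:8080/conda/anaconda"],
)
print(spec)
```

Save the printed text as conda.yaml if you want to use it as the basis for a package extension.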

Configuring conda to use a file channel

You can make conda packages available through a file location by uploading them to the shared user_home directory, from where the libraries can be accessed through a file:// URL.

To create a custom file channel:

  1. Build a conda package that bundles the software files into a single file that can be easily installed and managed. See the conda documentation for details about building packages.

  2. Upload a compressed archive file to a channel folder:

    1. Create the custom channel for your custom packages.
    2. Create a compressed archive of the custom channel folder as a custom-channel.tgz file.
    3. Upload the compressed archive file:

        curl -k -X PUT \
        "${CPD_URL}/zen-volumes/cc-home/v1/volumes/files/%2F_global_%2Fconfig%2Fconda%2Fcustom-channel?extract=true" \
        -H "Authorization: ZenApiKey ${MY_TOKEN}" \
        -H "content-type: multipart/form-data" \
        -F upFile=@custom-channel.tgz
    
  3. Test that you can access a package in the channel (in this example, the conda seawater package) from a notebook cell:

    !conda search -c file:///cc-home/_global_/config/conda/custom-channel --override-channels
    !conda install seawater -c file:///cc-home/_global_/config/conda/custom-channel/custom_channel
    
  4. Create a package extension with a conda.yaml specification as shown in the example and associate the package extension with the custom software specification.
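
The archive in step 2 can be created with any tar tool. As one possibility, the following Python sketch uses the standard tarfile module to pack a channel folder into custom-channel.tgz. The function archive_channel, the folder name custom_channel, and the placeholder repodata.json file are assumptions for illustration. A real channel folder contains the packages and index files that you built in step 1.

```python
import tarfile
import tempfile
from pathlib import Path


def archive_channel(channel_dir, archive_path="custom-channel.tgz"):
    """Pack a channel folder, including its top-level folder, into a .tgz file."""
    channel_dir = Path(channel_dir)
    if not channel_dir.is_dir():
        raise NotADirectoryError(str(channel_dir))
    with tarfile.open(archive_path, "w:gz") as tar:
        # arcname keeps only the folder name, not the full local path.
        tar.add(channel_dir, arcname=channel_dir.name)
    return Path(archive_path)


# Demonstration with a throwaway folder; the file is a placeholder, not a real index.
with tempfile.TemporaryDirectory() as tmp:
    channel = Path(tmp) / "custom_channel"
    (channel / "noarch").mkdir(parents=True)
    (channel / "noarch" / "repodata.json").write_text("{}")
    archive = archive_channel(channel, Path(tmp) / "custom-channel.tgz")
    with tarfile.open(archive) as tar:
        print(tar.getnames())
```

Because the archive keeps the top-level folder name, extracting it places the channel content in a subfolder with that name.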

Note:

If you don't want the file channel to be accessible to every user, you can point to a location in an IBM Cloud Pak for Data storage volume that only certain users can access.
