Administering Anaconda Repository for Cloud Pak for Data

In Anaconda Repository for IBM Cloud Pak for Data, you can create channels, which provide locations in Anaconda Repository for the software packages and libraries that you mirrored, uploaded, or copied.

Service: Anaconda Repository for IBM Cloud Pak for Data is not available by default. An administrator must install this service.

Required role: You must be a Cloud Pak for Data cluster administrator and have root access on the Linux system where Anaconda Repository for IBM Cloud Pak for Data is installed.

Required access parameters: To use Anaconda Repository for IBM Cloud Pak for Data, you need:

First, create one or more channels. Then mirror or upload packages from Anaconda Repository to the location that each channel points to. Lastly, update the conda configuration in Cloud Pak for Data so that the channels you created can be used in environment runtimes started in Watson Studio.

Creating channels and mirroring packages

Use Anaconda Repository for IBM Cloud Pak for Data to create channels and then mirror or upload packages.

  1. Log in to the Anaconda user management UI (Keycloak) with the link and administrator credentials that were generated and printed to the screen during installation. Use this UI to add new users and manage user access.
  2. Make sure that your user ID in Keycloak has the admin role. Without the admin role, you can't create channels in Anaconda TE. See Setting admin role.
  3. Navigate to the hostname of the Anaconda TE server to open the Anaconda TE landing page. Log in with the default user credentials that were generated and printed to the screen during installation, or with new user credentials created in Keycloak.
  4. Create a channel. See Creating a channel.
  5. In your channel, begin adding packages. You can create a local copy of an entire Anaconda repository or upload individual packages.

    • To create a local copy of a repository, see Creating mirrors. For example, to mirror the conda-forge repository, enter External source channel: https://conda.anaconda.org/conda-forge/ and Type: conda in the UI.

      Note that replicating a conda channel might take a while, even if the channel is passively replicated.

    • To upload individual packages, see Uploading to a channel.
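After a mirror or upload finishes, one way to check that a channel is serving packages is to request its package index. The following is a minimal sketch, not part of the product: it assumes the channel is exposed under /api/repo/<channel>, as in the configuration examples later in this topic, and that it follows the standard conda channel layout with one repodata.json file per platform subdirectory. The function name channel_index_url and the host anaconda.example.com are hypothetical.

```shell
# Hypothetical helper: print the URL of a channel's package index.
# Assumes the standard conda layout <channel>/<subdir>/repodata.json.
channel_index_url() {
    base_url=$1            # for example https://anaconda.example.com (placeholder)
    channel=$2             # channel created in Anaconda TE, for example myDataScience
    subdir=${3:-noarch}    # conda platform subdirectory; defaults to noarch
    # ${base_url%/} removes one trailing slash so the path is not doubled
    printf '%s/api/repo/%s/%s/repodata.json\n' "${base_url%/}" "$channel" "$subdir"
}

# Example check (requires network access to your server, so it is left commented out):
# curl -fsS "$(channel_index_url https://anaconda.example.com myDataScience linux-64)" >/dev/null \
#     && echo "channel index reachable"
```

If the request fails while replication is still running, the index might simply not be available yet.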

Configuring conda

After you have created channels and have mirrored, uploaded, or copied packages, you need to configure conda in Cloud Pak for Data to access packages from your Anaconda Repository for IBM Cloud Pak for Data installation.

  1. Modify the conda configuration .condarc file for the Cloud Pak for Data cluster to use the channel that you created in Anaconda TE:

    1. In a terminal window, log in to the OpenShift cluster with oc login and change to the project (namespace) where Cloud Pak for Data is installed by entering oc project <projectname>.
    2. Determine the nginx pod that mounts the /user-home/_global_/config/conda/ directory:
      ibm_nginx_pod=$(oc get pods -l component=ibm-nginx -o jsonpath='{.items[0].metadata.name}')
      
    3. Copy an existing .condarc file to your local user home directory or create a new .condarc file under /Users/<username>/.

      1. Check if a .condarc file exists on the server in /user-home/_global_/config/conda/:
        oc exec ${ibm_nginx_pod} -- ls -la /user-home/_global_/config/conda/
        
      2. If no .condarc file exists, create a .condarc file in your local file system in /Users/<username>/.
      3. If a .condarc file exists in /user-home/_global_/config/conda/, copy the file to your local file system in /Users/<username>/:
        oc cp ${ibm_nginx_pod}:/user-home/_global_/config/conda/.condarc /Users/<username>/.condarc
        
    4. Add the channel that points to the local repository URL in Anaconda TE to the .condarc file that you copied or created in /Users/<username>/. You can connect to the Anaconda repository through a proxy server to ensure that all network requests are made via the Anaconda Teams server. See Configuring conda to use a proxy server.

      The following example assumes you created the channel myDataScience:

       channel_alias: http://<AnacondaRepositoryforCPD_URL>/api/repo
       channels:  
           - myDataScience
       default_channels:  
           - http://<AnacondaRepositoryforCPD_URL>/api/repo/myDataScience
      
    5. For SSL configuration:

      1. Copy the root CA certificate that was used to sign the Anaconda TE server certificate to /user-home/_global_/config/conda/<certificate_name.crt>:
        oc cp /Users/<username>/<certificate_name.crt> ${ibm_nginx_pod}:/user-home/_global_/config/conda/<certificate_name.crt>
        
      2. Set ssl_verify in your local copy of the .condarc file to the path to the certificate:
        ssl_verify: /user-home/_global_/config/conda/<certificate_name.crt>
        
      3. Add the channel that points to the local repository URL in Anaconda TE to your local copy of the .condarc file located in /Users/<username>/:

         channel_alias: https://<AnacondaRepositoryforCPD_URL>/api/repo
         channels:  
             - myDataScience
         default_channels:  
             - https://<AnacondaRepositoryforCPD_URL>/api/repo/myDataScience
        
  2. Copy the locally modified .condarc file in /Users/<username>/ back to the cluster in /user-home/_global_/config/conda/:

     oc cp /Users/<username>/.condarc ${ibm_nginx_pod}:/user-home/_global_/config/conda/.condarc
    

    Important: If a .condarc file exists in /user-home/_global_/config/conda/, all notebook runtimes and jobs started in Watson Studio use the configuration in this file.
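When you create a new .condarc file rather than edit an existing one, the entries shown in the examples above can be generated instead of typed by hand. The following is a minimal sketch under stated assumptions: the function name render_condarc is hypothetical, it handles a single channel only, and it writes ssl_verify only when a certificate path is passed. It does not merge with an existing file, so use it only for the case where no .condarc file exists yet.

```shell
# Hypothetical helper: print a .condarc for one Anaconda TE channel.
# Arguments: repository base URL, channel name, optional CA certificate path.
render_condarc() {
    repo_url=${1%/}    # base URL without a trailing slash
    channel=$2         # for example myDataScience
    ca_cert=${3:-}     # path of the CA certificate as seen on the cluster; may be empty
    printf 'channel_alias: %s/api/repo\n' "$repo_url"
    printf 'channels:\n  - %s\n' "$channel"
    printf 'default_channels:\n  - %s/api/repo/%s\n' "$repo_url" "$channel"
    # Add ssl_verify only for the SSL configuration described in step 5
    if [ -n "$ca_cert" ]; then
        printf 'ssl_verify: %s\n' "$ca_cert"
    fi
}
```

For example, redirect the output of render_condarc, called with your repository URL, the channel name myDataScience, and the certificate path under /user-home/_global_/config/conda/, to /Users/<username>/.condarc. Then copy the file to the cluster with the oc cp command from step 2.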

Next step

Provide the names of the conda channels to users with Admin or Editor permissions on Watson Studio projects so that they can create environment definitions with software customizations that use Anaconda Repository for IBM Cloud Pak for Data.

To create a software customization that uses Anaconda Repository for IBM Cloud Pak for Data in an environment definition in Watson Studio, see Customizing the environment definition.

Learn more

To learn more about using Anaconda Repository for IBM Cloud Pak for Data to access open source packages, see: