Customizing with pip
You can add custom packages from the Python Package Index (PyPI) without access to the public network.
You can use these methods to add custom packages:
- Configuring pip to use a proxy server
- Using pip from IBM Cloud Pak for Data storage
- Using pip from project storage
Prerequisites
- You must be a Cloud Pak for Data cluster administrator to create a pip configuration file (pip.conf).
- You need Admin or Editor permissions on the project to create an environment template and add a software configuration.
- For changes to take effect, after you copy pip.conf to the cluster, restart the running runtimes.
Before you begin
Set the CPD_URL and TOKEN environment variables and make sure that the cc-home storage volume exists.
For instructions, see Setting up a storage volume to store customizations for common core services.
You can check if the cc-home storage volume is set up correctly by running this code in a Jupyter Python notebook in Watson Studio:
import os
from ibm_watson_studio_lib import access_project_or_space
wslib = access_project_or_space()
token = wslib.auth.get_current_token()
cpd_url = os.environ["RUNTIME_ENV_APSX_URL"]
%set_env CPD_URL={cpd_url}
%set_env TOKEN={token}
!curl -k ${CPD_URL}/zen-volumes/cc-home/v1/volumes/directories/%2F_global_%2Fconfig%2Fconda -H "Authorization: Bearer ${TOKEN}"
The return message must include "status":"200". If you get an error, check the setup instructions again.
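The %2F sequences in the request URL are the percent-encoded slashes of the volume path /_global_/config/conda. As a minimal sketch, the same URL can be built with the Python standard library. The helper name volume_dir_url and the host cpd.example.com are hypothetical and used only for illustration.

```python
from urllib.parse import quote


def volume_dir_url(cpd_url: str, path: str) -> str:
    """Build a cc-home directory URL for a path inside the volume.

    Hypothetical helper: it only mirrors the URL shape of the curl
    command shown above.
    """
    # safe="" makes quote() encode "/" as %2F, matching the curl command.
    encoded = quote(path, safe="")
    base = cpd_url.rstrip("/")
    return f"{base}/zen-volumes/cc-home/v1/volumes/directories/{encoded}"


url = volume_dir_url("https://cpd.example.com", "/_global_/config/conda")
print(url)
```

In a notebook, you would pass the value of the RUNTIME_ENV_APSX_URL environment variable instead of the example host.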
You can set the configuration that is used by pip in the /cc-home/_global_/config/conda/pip.conf global file. These settings apply to all Watson Studio runtimes and all users. The /cc-home/_global_/config/conda/pip.conf file is read-only. The following instructions show how to modify the file by using REST API commands.
After making the modifications, run python3 -m pip config debug to check if the modifications in pip.conf are applied correctly.
An update to pip.conf is activated when a runtime is started. If you want to apply the changes to the global pip.conf file, you must restart the existing runtime.
Using a proxy server or internal package index with pip
You can configure pip to work behind a proxy server or with an internal package index by creating a cluster-wide pip configuration file called pip.conf, in which you specify the proxy server (including the proxy user) or your own package index.
- Run the following commands in a notebook to test if the connection is working.
  For a proxy server, run this command:
      !python -m pip install langdetect --proxy https://www.example.com:<port number>
  For an internal index, run this command:
      !pip install <some_package> --index-url=http://www.example.com/root/pypi/+simple/ --trusted-host=www.example.com
  If the connection is not working, resolve any networking or firewall issues before proceeding with the next step.
- Retrieve any existing pip.conf file by running:
      curl -vk ${CPD_URL}/zen-volumes/cc-home/v1/volumes/files/%2F_global_%2Fconfig%2Fconda%2Fpip.conf -H "Authorization: ZenApiKey ${MY_TOKEN}" -o pip.conf
  Note: To run this script, you must generate and export the token as the ${MY_TOKEN} environment variable. For details, see Generating an API authorization token.
- Configure a proxy server or an internal repository server.
  To configure a proxy server, set pip.conf as follows:
      [global]
      proxy=https://<user>:<password>@<proxy name>:<port>
  To always use an internal repository server, set pip.conf as follows:
      [global]
      index-url=https://www.example.com/root/pypi/+simple/
      trusted-host=www.example.com
- Copy the pip.conf file to the shared file system:
      curl -k -X PUT \
        "${CPD_URL}/zen-volumes/cc-home/v1/volumes/files/%2F_global_%2Fconfig%2Fconda" \
        -H "Authorization: ZenApiKey ${MY_TOKEN}" \
        -H "content-type: multipart/form-data" \
        -F upFile=@pip.conf
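If you prefer to generate pip.conf rather than edit it by hand, the following is a minimal sketch that uses the Python standard library. The helper name build_pip_conf is hypothetical. The sketch passes trusted-host as a bare host name, which is the form that pip documents for that option.

```python
import configparser
import io


def build_pip_conf(proxy=None, index_url=None, trusted_host=None):
    """Return pip.conf text with a [global] section (hypothetical helper)."""
    # RawConfigParser performs no "%" interpolation, so values are written verbatim.
    config = configparser.RawConfigParser()
    config.add_section("global")
    if proxy:
        config.set("global", "proxy", proxy)
    if index_url:
        config.set("global", "index-url", index_url)
    if trusted_host:
        config.set("global", "trusted-host", trusted_host)
    buffer = io.StringIO()
    config.write(buffer)
    return buffer.getvalue()


# Example values taken from the internal repository configuration above.
conf_text = build_pip_conf(
    index_url="https://www.example.com/root/pypi/+simple/",
    trusted_host="www.example.com",
)
print(conf_text)
```

Writing conf_text to a local file named pip.conf produces a file that can be uploaded with the curl command in the last step.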
Installing pip packages from IBM Cloud Pak for Data storage
You can add a custom Python distribution package to IBM Cloud Pak for Data storage and then access these packages directly from within a notebook. Alternatively, you can add a configuration to your environment runtime with the file path to the package where it can be picked up by the runtime builds.
- Create a Python project with a setup.py build script.
- Generate a distribution package.
- Upload the zipped archive file to /cc-home/_global_/config/conda.
      curl -k -X PUT \
        "${CPD_URL}/zen-volumes/cc-home/v1/volumes/files/%2F_global_%2Fconfig%2Fconda" \
        -H "Authorization: ZenApiKey ${MY_TOKEN}" \
        -H "content-type: multipart/form-data" \
        -F upFile=@<package_name>
  Note: To run this script, you must generate and export the token as the ${MY_TOKEN} environment variable. For details, see Generating an API authorization token.
- Create an environment template in the project and add a customization for pip:
      # Add conda channels below defaults, indented by two spaces and a hyphen.
      channels:
        - nodefaults
      # Add conda packages here, indented by two spaces and a hyphen.
      dependencies:
        # Add pip packages here, indented by four spaces and a hyphen.
        # Remove the comments on the following lines and replace sample package name with your package name.
        - pip:
          - file:///cc-home/_global_/config/conda/Archive.zip
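The first two steps do not prescribe how the project or the archive is produced. As one possible sketch, the following script writes a tiny project with a setup.py file and zips it so that setup.py sits at the root of the archive. The package name mypkg, its contents, and the helper make_archive are hypothetical, and zipping the project directory directly is an assumption. In practice, you might generate the archive with your usual build tooling instead.

```python
import tempfile
import zipfile
from pathlib import Path

# Minimal build script for a hypothetical single-module package.
SETUP_PY = '''\
from setuptools import setup

setup(
    name="mypkg",
    version="0.1.0",
    py_modules=["mypkg"],
)
'''

MODULE = '''\
def hello():
    return "hello from mypkg"
'''


def make_archive(target_dir: Path) -> Path:
    """Write the sample project and zip it with setup.py at the archive root."""
    project = target_dir / "mypkg_project"
    project.mkdir()
    (project / "setup.py").write_text(SETUP_PY)
    (project / "mypkg.py").write_text(MODULE)
    archive = target_dir / "Archive.zip"
    with zipfile.ZipFile(archive, "w", zipfile.ZIP_DEFLATED) as zf:
        for path in sorted(project.iterdir()):
            # arcname drops the directory prefix so files sit at the root.
            zf.write(path, arcname=path.name)
    return archive


workdir = Path(tempfile.mkdtemp())
archive_path = make_archive(workdir)
with zipfile.ZipFile(archive_path) as zf:
    names = sorted(zf.namelist())
print(archive_path, names)
```

The resulting Archive.zip is the kind of file that the upload step sends to /cc-home/_global_/config/conda.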
Installing pip packages from project storage
You can add a custom Python distribution package to your project in Watson Studio and then add a configuration to your environment template with the file path to the data assets folder in the project.
- Create a Python project with a setup.py build script.
- Generate a distribution package.
- Upload the zipped distribution file to your project as a data asset.
- Create an environment template in your project and add a customization:
      # Add conda channels below defaults, indented by two spaces and a hyphen.
      channels:
        - nodefaults
      # Add conda packages here, indented by two spaces and a hyphen.
      dependencies:
        # Add pip packages here, indented by four spaces and a hyphen.
        # Remove the comments on the following lines and replace sample package name with your package name.
        - pip:
          - file:///project_data/data_asset/Archive.zip
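Because the customization is sensitive to indentation, it can help to generate the text rather than type it. The following is a minimal sketch with a hypothetical helper, pip_customization, that produces the customization for a single archive. It assumes the same layout as the template above, without the comments.

```python
def pip_customization(archive_uri: str) -> str:
    """Return customization text that installs one pip package from a file URI."""
    if not archive_uri.startswith("file:///"):
        raise ValueError("expected an absolute file:/// URI")
    lines = [
        "channels:",
        "  - nodefaults",
        "dependencies:",
        "  - pip:",           # pip entry: two spaces and a hyphen
        f"    - {archive_uri}",  # pip packages: four spaces and a hyphen
    ]
    return "\n".join(lines) + "\n"


customization = pip_customization("file:///project_data/data_asset/Archive.zip")
print(customization)
```

The same helper works for the Cloud Pak for Data storage path, for example file:///cc-home/_global_/config/conda/Archive.zip.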
If your environment is air-gapped and you do not plan to use any conda packages from external sources, create a .condarc file in the same folder with the following content:
offline: True
This ensures that conda filters out all channels that do not use the file:// protocol.
You can also include pip configuration options in a conda environment as pip dependencies. An example is:
dependencies:
  - pip:
    - --index-url https://www.example.com/artifactory/token
    - --trusted-host www.example.com
    - langdetect
You can:
- Have a global pip.conf file
- Specify pip options in an environment customization
- Use both methods