Customizing with pip
You can add custom packages from the Python Package Index (PyPI) without access to the public network.
You can use these methods to add custom packages:
- Configuring pip to use a proxy server
- Using pip from IBM Cloud Pak for Data storage
- Using pip from Project storage
Prerequisites
- You must be a Cloud Pak for Data cluster administrator to create a pip configuration file, pip.conf.
- You need Admin or Editor permissions on the project to create an environment template and add a software configuration.
- For changes to take effect, after you copy pip.conf to the cluster, restart the running runtimes.
Before you begin
Set the CPD_URL and TOKEN environment variables and make sure that the cc-home storage volume exists. For instructions, see Setting up a storage volume to store customizations for common core services.
You can check if the cc-home storage volume is set up correctly by running this code in a Jupyter Python notebook in Watson Studio:
import os
from ibm_watson_studio_lib import access_project_or_space
wslib = access_project_or_space()
token = wslib.auth.get_current_token()
cpd_url = os.environ["RUNTIME_ENV_APSX_URL"]
%set_env CPD_URL={cpd_url}
%set_env TOKEN={token}
!curl -k ${CPD_URL}/zen-volumes/cc-home/v1/volumes/directories/%2F_global_%2Fconfig%2Fconda -H "Authorization: Bearer ${TOKEN}"
The return message must include "status":"200". If you get an error, check the setup instructions again.
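The directory argument in the curl call above is the volume path /_global_/config/conda with every slash percent-encoded as %2F. As a minimal sketch of how such URLs could be assembled in Python, the helper below encodes a path and appends it to the zen-volumes endpoint pattern used in this topic. The name build_volume_url is hypothetical (it is not part of any IBM library), and https://cpd.example.com is a placeholder host.

```python
from urllib.parse import quote

# Endpoint prefix as it appears in the curl commands in this topic.
VOLUME_API = "/zen-volumes/cc-home/v1/volumes"


def build_volume_url(cpd_url: str, path: str, kind: str = "files") -> str:
    """Return the volume URL for a path on the cc-home storage volume.

    kind is "files" or "directories", matching the two endpoint
    variants shown in this topic. This helper is illustrative only.
    """
    if kind not in ("files", "directories"):
        raise ValueError("kind must be 'files' or 'directories'")
    # safe="" makes quote() encode "/" as %2F as well.
    encoded_path = quote(path, safe="")
    return f"{cpd_url.rstrip('/')}{VOLUME_API}/{kind}/{encoded_path}"


# Placeholder host; in a notebook this would come from CPD_URL.
print(build_volume_url("https://cpd.example.com",
                       "/_global_/config/conda",
                       kind="directories"))
```

The same helper with kind="files" and the path /_global_/config/conda/pip.conf yields the URL form used later in this topic to retrieve pip.conf.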
You can set the configuration that is used by pip in the /cc-home/_global_/config/conda/pip.conf global file. These settings apply to all Watson Studio runtimes and all users. The /cc-home/_global_/config/conda/pip.conf file is read-only. The following instructions show how to modify the file by using REST API commands.
After making the modifications, run python3 -m pip config debug to check if the modifications in pip.conf are applied correctly.
An update to pip.conf is activated when a runtime is started. To apply changes in the global pip.conf file to a runtime that is already running, you must restart that runtime.
Using a proxy server or internal package index with pip
You can configure pip for use behind a proxy server by creating your own cluster-wide pip configuration file called pip.conf, in which you can specify your own package index or a proxy user.
- Run the following commands in a notebook to test if the connection is working.
  For a proxy server, run this command:
    !python -m pip install langdetect --proxy https://www.example.com:<port number>
  For an internal index, run this command:
    !pip install <some_package> --index-url=http://www.example.com/root/pypi/+simple/ --trusted-host=www.example.com
  If the connection is not working, resolve any networking or firewall issues before proceeding with the next step.
- Retrieve any existing pip.conf files by running:
    curl -vk ${CPD_URL}/zen-volumes/cc-home/v1/volumes/files/%2F_global_%2Fconfig%2Fconda%2Fpip.conf -H "Authorization: ZenApiKey ${MY_TOKEN}" -o pip.conf
  Note: To run this command, you must generate and export the token as the ${MY_TOKEN} environment variable. For details, see Generating an API authorization token.
- Configure a proxy server or an internal repository server.
  To configure a proxy server, set pip.conf as follows:
    [global]
    proxy=https://<user>:<password>@<proxy name>:<port>
  To always use an internal repository server, set pip.conf as follows:
    [global]
    index-url=https://www.example.com/root/pypi/+simple/
    trusted-host=www.example.com
- Copy the pip.conf file to the shared file system:
    curl -k -X PUT \
      "${CPD_URL}/zen-volumes/cc-home/v1/volumes/files/%2F_global_%2Fconfig%2Fconda" \
      -H "Authorization: ZenApiKey ${MY_TOKEN}" \
      -H "content-type: multipart/form-data" \
      -F upFile=@pip.conf
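The pip.conf file is a plain INI file with a [global] section, so it can also be generated rather than edited by hand. The following is a minimal sketch that uses only the Python standard library. The function name render_pip_conf and the example hosts are hypothetical. pip documents trusted-host as a host or host:port pair, so the sketch derives that value from the index URL instead of repeating the full URL.

```python
import configparser
import io
from urllib.parse import urlparse


def render_pip_conf(index_url=None, proxy=None):
    """Return pip.conf text with a [global] section (illustrative helper)."""
    # interpolation=None keeps "%" characters (for example in an
    # encoded proxy password) from being treated as placeholders.
    config = configparser.ConfigParser(interpolation=None)
    config["global"] = {}
    if proxy:
        config["global"]["proxy"] = proxy
    if index_url:
        parsed = urlparse(index_url)
        host = parsed.hostname or ""
        if parsed.port:
            host = f"{host}:{parsed.port}"
        config["global"]["index-url"] = index_url
        config["global"]["trusted-host"] = host
    buffer = io.StringIO()
    # Write "key=value" without spaces, as in the examples in this topic.
    config.write(buffer, space_around_delimiters=False)
    return buffer.getvalue()


# Example with the placeholder index URL used in this topic.
print(render_pip_conf(index_url="https://www.example.com/root/pypi/+simple/"))
```

The returned text can be saved as pip.conf and uploaded with the curl command from the last step.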
Installing pip packages from IBM Cloud Pak for Data storage
You can add custom Python distribution packages to IBM Cloud Pak for Data storage and then access these packages directly from within a notebook. Alternatively, you can add a configuration to your environment runtime with the file path to the package, where it can be picked up by the runtime builds.
- Create a Python project with a setup.py build script.
- Generate a distribution package.
- Upload the zipped archive file to /cc-home/_global_/config/conda.
    curl -k -X PUT \
      "${CPD_URL}/zen-volumes/cc-home/v1/volumes/files/%2F_global_%2Fconfig%2Fconda" \
      -H "Authorization: ZenApiKey ${MY_TOKEN}" \
      -H "content-type: multipart/form-data" \
      -F upFile=@<package_name>
  Note: To run this command, you must generate and export the token as the ${MY_TOKEN} environment variable. For details, see Generating an API authorization token.
- Create an environment template in the project and add a customization for pip:
    # Add conda channels below defaults, indented by two spaces and a hyphen.
    channels:
      - nodefaults

    # Add conda packages here, indented by two spaces and a hyphen.
    dependencies:
      # Add pip packages here, indented by four spaces and a hyphen.
      # Remove the comments on the following lines and replace sample package name with your package name.
      - pip:
        - file:///cc-home/_global_/config/conda/Archive.zip
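As a rough illustration of the first two steps, the sketch below lays out a minimal project with a setup.py build script and zips it into Archive.zip by using only the Python standard library. The package name my_package, the module my_module, and the helper make_archive are all hypothetical. A real project would usually generate its distribution with its own build tooling, so treat this only as a picture of what the archive could contain.

```python
import shutil
import tempfile
import zipfile
from pathlib import Path

# Minimal build script; name and version are placeholders.
SETUP_PY = '''from setuptools import setup

setup(
    name="my_package",
    version="0.1.0",
    py_modules=["my_module"],
)
'''


def make_archive(workdir: Path) -> Path:
    """Create a tiny project under workdir and zip it as Archive.zip."""
    project = workdir / "src" / "my_package-0.1.0"
    project.mkdir(parents=True)
    (project / "setup.py").write_text(SETUP_PY, encoding="utf-8")
    (project / "my_module.py").write_text(
        "def hello():\n    return 'hello'\n", encoding="utf-8"
    )
    # Zip the project folder itself so that the archive has a single
    # top-level directory containing setup.py.
    archive = shutil.make_archive(
        str(workdir / "Archive"), "zip",
        root_dir=str(project.parent), base_dir=project.name,
    )
    return Path(archive)


def list_archive(archive: Path) -> list:
    """Return the member names of the archive, for a quick check."""
    with zipfile.ZipFile(archive) as zf:
        return zf.namelist()
```

Calling make_archive(Path(tempfile.mkdtemp())) returns the path of the generated Archive.zip, which can then be uploaded as described in the third step.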
Installing pip packages from project storage
You can add a custom Python distribution package to your project in Watson Studio and then add a configuration to your environment template with the file path to the data assets folder in the project.
- Create a Python project with a setup.py build script.
- Generate a distribution package.
- Upload the zipped distribution file to your project as a data asset.
- Create an environment template in your project and add a customization:
    # Add conda channels below defaults, indented by two spaces and a hyphen.
    channels:
      - nodefaults

    # Add conda packages here, indented by two spaces and a hyphen.
    dependencies:
      # Add pip packages here, indented by four spaces and a hyphen.
      # Remove the comments on the following lines and replace sample package name with your package name.
      - pip:
        - file:///project_data/data_asset/Archive.zip
If your environment is air-gapped and you do not plan to use any conda packages from external sources, create a .condarc file in the same folder with the following content:
    offline: True
This ensures that conda filters out all channels that do not use the file:// protocol.
You can also include pip configuration options in a conda environment as pip dependencies. For example:
    dependencies:
      - pip:
        - --index-url https://www.example.com/artifactory/token
        - --trusted-host www.example.com
        - langdetect
You can:
- Have a global pip.conf file
- Specify pip options in an environment customization
- Use both methods