IBM Cloud Pak® for Data Version 4.8 will reach end of support (EOS) on 31 July 2025. For more information, see the Discontinuance of service announcement for IBM Cloud Pak for Data Version 4.X.
Upgrade to IBM Software Hub Version 5.1 before IBM Cloud Pak for Data Version 4.8 reaches end of support. For more information, see Upgrading from IBM Cloud Pak for Data Version 4.8 to IBM Software Hub Version 5.1.
Post-installation tasks for the Watson Studio service
To finish setting up the Watson Studio service after installation, complete the appropriate tasks.
Optional tasks
You can perform the following optional tasks to enhance Watson Studio. You must have the appropriate permissions on the OpenShift cluster.
| Task | User role |
|---|---|
| Set the scaling for the service | Project administrator |
| Set a new limit for the number of projects each user can create | Project administrator |
| Set the time zone for the master node | System administrator |
| Enable Visual Studio Code support | Project and System administrator |
| Install pre-trained NLP models for Python-based notebook runtimes | Project administrator |
| Use Livy to connect to a Spark cluster | System administrator |
To set a new limit for the number of projects each user can create
By default, each user can create up to 200 projects. You can increase or decrease this limit by specifying a new limit value, as a number or a string, in the Common Core Services (ccs) custom resource. Replace <project_limit> with the new limit value, and run:
oc patch ccs ccs-cr -n ${PROJECT_CPD_INST_OPERANDS} --type merge --patch '{"spec":{"projects_created_per_user_limit": <project_limit>}}'
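For example, to raise the limit to 500 projects per user (an illustrative value), you might run:
oc patch ccs ccs-cr -n ${PROJECT_CPD_INST_OPERANDS} --type merge --patch '{"spec":{"projects_created_per_user_limit": 500}}'
You can then read the value back to confirm that the patch was applied:
oc get ccs ccs-cr -n ${PROJECT_CPD_INST_OPERANDS} -o jsonpath='{.spec.projects_created_per_user_limit}'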
To set the time zone after installing your service
If the service is installed from a remote machine that runs in a different time zone than the master node, the time zone of the master node is overwritten by the time zone of the installer node. This discrepancy causes scheduled jobs to run at the wrong time.
- Edit the timezone configmap and change the time zone string to the cluster time zone.
- Modify `data.masterTimezone` in the configmap: run the command `oc edit configmap timezone` and set the value to the tz database code that corresponds to the master node time zone. Note: If you are using Red Hat OpenShift Container Platform 4.x, use the Coordinated Universal Time (UTC) time zone as the value of `data.masterTimezone` in the configmap.
- If there is a pre-existing schedule, go to the Job details page in the UI and edit the schedule once to pick up the updated time zone.
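If you prefer a non-interactive change, a patch along the following lines can set the value. This is a sketch that assumes the configmap is named timezone and is located in your instance project, and it uses UTC as recommended in the note above for OpenShift Container Platform 4.x:
oc patch configmap timezone -n ${PROJECT_CPD_INST_OPERANDS} --type merge --patch '{"data":{"masterTimezone":"UTC"}}'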
To enable Visual Studio Code support
VS Code support is not enabled by default because it has additional storage requirements. You can enable VS Code support by patching the ws custom resource:
oc patch -n ${PROJECT_CPD_INST_OPERANDS} ws/ws-cr --type=merge --patch '{"spec":{"tools":{"enable_vscode":true,"storage_size":"5Gi"}}}'
To check if patching is finished, run this command:
cpd-cli manage get-cr-status \
--cpd_instance_ns=${PROJECT_CPD_INST_OPERANDS} \
--components=ws
It takes approximately 5 to 10 minutes for all required changes to be applied.
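As an additional check, you can read the tools settings back from the ws custom resource. The following example assumes the fields that are shown in the patch above:
oc get ws ws-cr -n ${PROJECT_CPD_INST_OPERANDS} -o jsonpath='{.spec.tools}'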
Updating existing custom runtime definitions (earlier than 4.7)
If you are using custom images or a custom runtime definition for JupyterLab, update all manually created runtime definitions by adding the following entries to the `volumes` field:
{
  "volume": "tools-data",
  "mountPath": "/tools-data",
  "claimName": "tools-data-pvc",
  "subPath": "users/$user_id",
  "optional": true
},
{
  "volume": "axshelld",
  "type": "secret",
  "mountPath": "/etc/cp4d/keys",
  "secret": {
    "defaultMode": 420,
    "secretName": "ax-shelld-secret"
  },
  "optional": true
}
Update the runtime definitions using the Cloud Pak for Data API:
- To generate the required platform access token, see Generating an API authorization token.
- List the IDs and names of all available runtime definitions on the cluster:
curl -k -X GET -H "Authorization: ZenApiKey ${TOKEN}" "$cpd_url/v2/runtime_definitions" | jq -r '.resources[] | .metadata.guid + " " + .entity.name'
- Find the ID and name of each runtime definition that needs to be changed, then patch the content of the runtime definition and store it as a JSON file:
myRuntimeDefinition=runtime-22.1-py3.9
myRuntimeDefinitionID=0d76d114-c66d-5216-8442-2461b84af0da
curl -k -X GET -H "Authorization: ZenApiKey ${TOKEN}" "$cpd_url/v2/runtime_definitions/$myRuntimeDefinitionID?include=launch_configuration" | jq '.entity.launch_configuration.volumes |= . + [{"volume": "tools-data","mountPath": "/tools-data","claimName": "tools-data-pvc","subPath": "users/$user_id","optional": true}, {"volume": "axshelld","type": "secret","mountPath": "/etc/cp4d/keys","secret": {"defaultMode": 420,"secretName": "ax-shelld-secret"},"optional": true}]' > $myRuntimeDefinition.json
- Update the runtime definition by using the Cloud Pak for Data API:
curl -k -X PUT -H "Authorization: ZenApiKey ${TOKEN}" -H "Content-Type:application/json" "$cpd_url/v2/runtime_definitions/$myRuntimeDefinitionID" -d @./$myRuntimeDefinition.json
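To confirm that the update was applied, you can retrieve the runtime definition again and inspect its volumes, for example:
curl -k -X GET -H "Authorization: ZenApiKey ${TOKEN}" "$cpd_url/v2/runtime_definitions/$myRuntimeDefinitionID?include=launch_configuration" | jq '.entity.launch_configuration.volumes'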
Installing pre-trained NLP models for Python-based notebook runtimes
Runtime 23.1 on Python 3.10 is installed by default with Watson Studio on Cloud Pak for Data 4.8.
You can optionally install the pre-trained NLP models for the Watson Natural Language Processing library by running the following command:
oc patch -n ${PROJECT_CPD_INST_OPERANDS} NotebookRuntime ibm-cpd-ws-runtime-231-py --type=merge --patch '{"spec":{"install_nlp_models":true}}'
Pre-trained NLP models are also supported with these optional runtimes:
- ibm-cpd-ws-runtime-231-pygpu
- ibm-cpd-ws-runtime-222-py
- ibm-cpd-ws-runtime-222-pygpu
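For example, to install the models for one of these optional runtimes, you can apply the same patch to its NotebookRuntime resource (shown here for the Runtime 22.2 on Python runtime):
oc patch -n ${PROJECT_CPD_INST_OPERANDS} NotebookRuntime ibm-cpd-ws-runtime-222-py --type=merge --patch '{"spec":{"install_nlp_models":true}}'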
The pre-trained NLP models are large and take time to install. Use the following command to check the status of the notebook runtimes:
oc get -n ${PROJECT_CPD_INST_OPERANDS} NotebookRuntime
The pre-trained NLP models are available only when the status column for the notebook runtimes changes to Completed.
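Because the installation can take a while, you can also watch the runtimes until their status changes, for example:
oc get -n ${PROJECT_CPD_INST_OPERANDS} NotebookRuntime -w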
Using Livy to connect to a Spark cluster
If you need to use Livy to connect to a Spark cluster that is FIPS-enabled, you must load the digest package before you load the sparklyr package. To load the packages, run the following commands:
library(digest, lib.loc='/opt/not-FIPS-compliant/R/library')
library(sparklyr)
Parent topic: IBM Watson Studio