IBM Support

How to migrate notebooks to a new project by using the CPDCTL command tool

Question & Answer


Question

We ran into this situation while upgrading to a newer version of CP4D. To consolidate project assets across projects that each contain numerous notebooks, we exported the notebooks as ipynb files and placed them in specific directories. We then created new projects and need to add the notebooks to them. How can we automate importing those notebooks into a new project by using the CPDCTL command tool?

Cause

Because a large number of notebooks must be added to the new project, importing the notebook files one by one in the CP4D UI is not practical. The CPDCTL command tool is needed to automate the process.

Answer

According to the CPDCTL documentation, importing a notebook into an existing project requires two steps:
1. Upload the notebook content in ipynb format.
cpdctl asset file upload --path <your remote path to the notebook content> --file <your local path to the notebook content> --project-id <your project id>
2. Create a notebook that references the uploaded content through the --file-reference attribute.
cpdctl notebook create --name <your notebook name> --project-id <your project id> --file-reference <your remote path to the notebook content> --runtime '{"environment": "<your environment id>"}'
To automate importing a large number of notebook files in the bash shell, run the cpdctl commands within a loop.
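cd <notebook-folder>
# Upload each exported notebook, then create a notebook asset that references it.
for f in *.ipynb; do
   cpdctl asset file upload --path "notebook/$f" --file "$f" --project-id <project_id> --file-content-type notebook
   # ${f%_*} removes the last underscore and everything after it to form the notebook name.
   cpdctl notebook create --file-reference "notebook/$f" --name "${f%_*}" --project-id <project_id>
done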
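
The loop above can also be wrapped in a small function that quotes file names, skips directories that contain no notebooks, and reports failures. This is a minimal sketch, not something taken from the CPDCTL documentation: the function names notebook_name and import_notebooks and the CPDCTL variable are hypothetical, while the cpdctl subcommands and flags are the ones used in the loop above. Setting CPDCTL to echo prints the commands instead of running them, which works as a dry run.

```shell
#!/usr/bin/env bash
# Sketch only: helper names and the CPDCTL variable are illustrative assumptions.

# Command to run; set CPDCTL=echo beforehand for a dry run.
CPDCTL="${CPDCTL:-cpdctl}"

# Derive the notebook name from an exported file name by removing the last
# underscore and everything after it (e.g. Spark1_2dWTgHPIN.ipynb -> Spark1).
notebook_name() {
    local file="$1"
    printf '%s\n' "${file%_*}"
}

# Upload every *.ipynb file in a directory and create a notebook for each one.
# Usage: import_notebooks <notebook-folder> <project_id>
import_notebooks() {
    local dir="$1" project_id="$2" failed=0 path f
    for path in "$dir"/*.ipynb; do
        [ -e "$path" ] || continue   # the glob matched nothing
        f="$(basename "$path")"
        if ! "$CPDCTL" asset file upload --path "notebook/$f" --file "$path" \
                --project-id "$project_id" --file-content-type notebook; then
            echo "upload failed: $f" >&2
            failed=$((failed + 1))
            continue
        fi
        if ! "$CPDCTL" notebook create --file-reference "notebook/$f" \
                --name "$(notebook_name "$f")" --project-id "$project_id"; then
            echo "create failed: $f" >&2
            failed=$((failed + 1))
        fi
    done
    # Return success only if every notebook was uploaded and created.
    [ "$failed" -eq 0 ]
}
```

The function returns a non-zero status if any upload or create step failed, so it can be used in a larger migration script. Note that a file name without an underscore is passed through unchanged by notebook_name, including its .ipynb extension.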

[REFERENCE]
The following is a sample step-by-step procedure for exporting and importing notebooks by using the CPDCTL command tool.
1. Download CPDCTL from GitHub: https://github.com/IBM/cpdctl/releases/
# wget https://github.com/IBM/cpdctl/releases/download/v1.4.42/cpdctl_linux_amd64.tar.gz
…
2023-09-24 10:21:48 (32.1 MB/s) - ‘cpdctl_linux_amd64.tar.gz’ saved [11755128/11755128]
# ls
cpdctl_linux_amd64.tar.gz
2. Extract the downloaded file and copy the cpdctl binary to the /usr/bin directory.
# tar xvzf cpdctl_linux_amd64.tar.gz
cpdctl
LICENSE
NOTICES
# ls
cpdctl  cpdctl_linux_amd64.tar.gz  LICENSE  NOTICES
# cp cpdctl /usr/bin
3. Set the connection to the CP4D cluster.
# cpdctl config user set cpd_user --username cpduser02 --password MySPSS1234
# cpdctl config profile set cpd --url https://cpd-zen.apps.cpd4s.cp.fyre.ibm.com/ --user cpd_user
# cpdctl config profile list
Name   Type      User       URL                                           IBM Cloud CLI config   Region   Current
cpd    private   cpd_user   https://cpd-zen.apps.cpd4s.cp.fyre.ibm.com/      
4. List projects on the CP4D cluster
# cpdctl config profile use cpd
Switched to profile "cpd".
# cpdctl project list
...
ID                                     Name                   Created                    Description
f04dde47-cff9-4b48-a71a-bc557e3a1b04   TESTCASE               2022-11-21T01:05:38.984Z
de464937-fc47-47ef-b95d-94c3eeb56d87   TESTCASE2              2023-01-19T08:46:35.118Z
5. Export a project
# cpdctl asset export start --project-id f04dde47-cff9-4b48-a71a-bc557e3a1b04 --assets "{\"all_assets\": true}" --name "TESTCASE"
...
ID:        7ed5222b-d002-4fdd-b0e2-56e0de51bc12
Name:      TESTCASE
Created:   2023-09-24T01:54:09.101Z
State:     completed
6. Download the exported project.
# cpdctl asset export download --export-id 7ed5222b-d002-4fdd-b0e2-56e0de51bc12 --project-id f04dde47-cff9-4b48-a71a-bc557e3a1b04 --output-file TESTCASE.zip
...
OK
Output written to TESTCASE.zip
# ls
TESTCASE.zip
7. Check the notebook files in the downloaded project.
# unzip TESTCASE.zip
Archive:  TESTCASE.zip
  inflating: deflate.log
  inflating: project.json
  inflating: assettypes/ibm_logical_model_attribute.json
  inflating: assettypes/data_lineage.json
  inflating: assettypes/wml_training_definition.json
  inflating: assettypes/do_solve_asset.json
  inflating: assettypes/omrs_relationship.json
  inflating: assettypes/data_transformation.json
…
# ls
assets  assettypes  deflate.log  project.json  TESTCASE.zip
# cd assets; ls
data_asset  data_flow  machine-learning-stream  notebook  package_extension  wml_model
# cd notebook; ls
DeleteAsset_I8B0TDh7S.ipynb                Project-backup_9VnYi3sKr.ipynb                          sql_server_test_Gw-YdLGEy.ipynb  TestVolumeStorage_ejyGoPLje.ipynb
DE_Pyspark_code_9Sadb50uk_ym9u8KM8Z.ipynb  RPlotty_W2ckZ2UXE.ipynb                                 Test0001_oGKr7AJ4s.ipynb         TimeZoneTest_jc2aUMCBu.ipynb
MinioBoto3_3VfeWEp6U.ipynb                 Spark1_2dWTgHPIN.ipynb                                  TestHugeCSV_raRF4QiNT.ipynb      UnicodeTest_YAg0yKcIM.ipynb
NZTest_kSH4N9Z3U.ipynb                     Spark2_eo2y3XPmC.ipynb                                  TestMSIL_J7-wuKwTr.ipynb
PlotlyTest_gWTYDsos3.ipynb                 Spark-Flight-write-batch_size-gRPCerr1_G6yxsKJB5.ipynb  TestProphet_twAyWfmoM.ipynb
8. Create a new project (TESTUPLOAD in this example).
9. Check the ID of the newly created project.
# cpdctl project list
...
ID                                     Name                         Created                    Description
f04dde47-cff9-4b48-a71a-bc557e3a1b04   TESTCASE                     2022-11-21T01:05:38.984Z
de464937-fc47-47ef-b95d-94c3eeb56d87   TESTCASE2                    2023-01-19T08:46:35.118Z
63053ff1-d128-48e8-baeb-f3fa40117502   TESTUPLOAD                   2023-10-24T01:37:09.646Z
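
To avoid copying the project ID by hand, the ID can be pulled out of the table that cpdctl project list prints. The following is a minimal sketch, assuming the output keeps the column layout shown above (ID in the first column, name in the second) and that project names contain no spaces. project_id_by_name is a hypothetical helper, not a cpdctl command.

```shell
# Sketch only: project_id_by_name is an illustrative helper name.
# Reads the table printed by `cpdctl project list` on standard input and
# prints the ID (column 1) of the row whose name (column 2) matches exactly.
project_id_by_name() {
    awk -v name="$1" '$2 == name { print $1 }'
}

# Example use on a connected cluster:
#   cpdctl project list | project_id_by_name TESTUPLOAD
```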
10. Upload the notebook (ipynb) files to the newly created TESTUPLOAD project.
# for f in *; do
>    cpdctl asset file upload --path notebook/$f --file $f --project-id 63053ff1-d128-48e8-baeb-f3fa40117502 --file-content-type notebook
>    cpdctl notebook create --file-reference notebook/$f --name ${f%_*} --project-id 63053ff1-d128-48e8-baeb-f3fa40117502
> done
...
OK
...

ID:        ef0a438c-c8ff-4574-b8a6-33348c8bb091
Name:      DE_Pyspark_code_9Sadb50uk
Created:   2023-10-24T07:18:12Z
Type:      notebook
...
OK
...

ID:        8603fe21-d982-4f40-af1f-a232270a231c
Name:      DeleteAsset
Created:   2023-10-24T07:18:14Z
Type:      notebook
...
OK
...
11. Confirm that all of the notebook files were uploaded to the project.
12. Change the notebook's environment.
13. Confirm you can edit the notebook and run it successfully.

[NOTES]
As described in the CPDCTL documentation, you can assign the notebook runtime when you create the notebook by setting the --runtime parameter on the cpdctl notebook create command.
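
As an illustration, the value for --runtime can be built from an environment ID in the JSON shape shown in the Answer section. This is a sketch under assumptions: the helper name runtime_json is hypothetical, and the environment ID must be replaced with a real one from your cluster.

```shell
# Sketch only: runtime_json is an illustrative helper name.
# Build the JSON value for --runtime from an environment ID.
runtime_json() {
    printf '{"environment": "%s"}' "$1"
}

# Create one notebook with its runtime assigned (placeholders left as-is):
#   cpdctl notebook create --name <your notebook name> --project-id <your project id> \
#       --file-reference <your remote path to the notebook content> \
#       --runtime "$(runtime_json '<your environment id>')"
```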


Document Information

Modified date:
27 October 2023

UID

ibm17060753