Import data into Cloud Pak for Data to be used by Watson Machine Learning Accelerator
Add data files to your WML Accelerator cluster in Cloud Pak for Data.
About this task
To add files, create a temporary pod using the wmla_pod_working.yaml file.
Procedure
- Obtain the wmla_pod_working.yaml file from https://github.com/IBM/wmla-assets/blob/master/dli-learning-path/movie-recommendation-use-case/wmla_pod_working.yaml
- Switch to the WML Accelerator namespace.
- Create a temporary pod using the wmla_pod_working.yaml file. This file creates a pod named wmla-working-pod.
  oc create -f wmla_pod_working.yaml
- Verify that the wmla-working-pod pod is in the Running state.
  oc get po | grep wmla-working-pod
- Log on to the pod.
  oc exec -it wmla-working-pod bash
- Source and activate the conda environment.
  bash-4.2# source /opt/anaconda3/bin/activate
- Install wget and unzip.
  - Install wget:
    conda install wget
  - Install unzip:
    conda install unzip
- Go to the dataset directory and download the dataset.
  (base) bash-4.2# cd /gpfs/mydatafs/
  (base) bash-4.2# wget https://github.com/IBM/wmla-assets/raw/master/dli-learning-path/datasets/pytorch-mnist-dataset.zip
  Will not apply HSTS. The HSTS database must be a regular and non-world-writable file.
  ERROR: could not open HSTS store at '/root/.wget-hsts'. HSTS will be disabled.
  --2021-03-30 20:42:25-- https://github.com/IBM/wmla-assets/raw/master/dli-learning-path/datasets/pytorch-mnist-dataset.zip
  Resolving github.com... 140.82.113.4
  Connecting to github.com|140.82.113.4|:443... connected.
  HTTP request sent, awaiting response... 302 Found
  Location: https://raw.githubusercontent.com/IBM/wmla-assets/master/dli-learning-path/datasets/pytorch-mnist-dataset.zip [following]
  --2021-03-30 20:42:25-- https://raw.githubusercontent.com/IBM/wmla-assets/master/dli-learning-path/datasets/pytorch-mnist-dataset.zip
  Resolving raw.githubusercontent.com... 185.199.108.133, 185.199.109.133, 185.199.110.133, ...
  Connecting to raw.githubusercontent.com|185.199.108.133|:443... connected.
  HTTP request sent, awaiting response... 200 OK
  Length: 23006288 (22M) [application/zip]
  Saving to: 'pytorch-mnist-dataset.zip'
  pytorch-mnist-dataset.zip 100%[=======================================================>] 21.94M --.-KB/s in 0.1s
  2021-03-30 20:42:25 (217 MB/s) - 'pytorch-mnist-dataset.zip' saved [23006288/23006288]
- Unzip the dataset.
  (base) bash-4.2# unzip pytorch-mnist-dataset.zip
  Archive: pytorch-mnist-dataset.zip
  (base) bash-4.2# ls -tlr
  total 22572
  -rw-rw-rw-. 1 root root 23006288 Mar 30 20:42 pytorch-mnist-dataset.zip
  drwxr-xr-x. 3 1000820000 1000820000 4096 Mar 30 20:43 pytorch-mnist
From Experiment Builder, enter "pytorch-mnist" in your data path.
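Two checks in the procedure above, confirming that the pod is Running and that the download completed, can also be scripted. The following is a minimal sketch, not part of the product: pod_status, wait_for_running, and check_size are hypothetical helper names, and the sketch assumes that oc get po prints its usual NAME, READY, STATUS columns and that the expected size is the 23006288 bytes reported in the wget output above.

```shell
#!/bin/sh
# Sketch only: helper names below are hypothetical, not part of oc or WML Accelerator.

# Read `oc get po` output on stdin and print the STATUS column for the named pod.
# Assumes the default column order: NAME READY STATUS RESTARTS AGE.
pod_status() {
  awk -v pod="$1" '$1 == pod { print $3 }'
}

# Poll until the pod reports Running; give up after a number of attempts.
# Arguments: pod name, attempts (default 30), delay in seconds (default 5).
# Run this where `oc` is logged in and the WML Accelerator namespace is selected.
wait_for_running() {
  pod="$1"
  attempts="${2:-30}"
  delay="${3:-5}"
  i=0
  while [ "$i" -lt "$attempts" ]; do
    if [ "$(oc get po | pod_status "$pod")" = "Running" ]; then
      return 0
    fi
    sleep "$delay"
    i=$((i + 1))
  done
  return 1
}

# Succeed only if the file has exactly the expected size in bytes.
# Run this inside the pod, in the directory that holds the download.
check_size() {
  actual="$(wc -c < "$1" | tr -d ' ')"
  [ "$actual" -eq "$2" ]
}

# Example usage (commented out because it needs a live cluster and the dataset):
#   wait_for_running wmla-working-pod && echo "pod is Running"
#   check_size /gpfs/mydatafs/pytorch-mnist-dataset.zip 23006288 && echo "size matches"
```

A size check only catches truncated downloads; it does not confirm the file contents.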