Setting up storage and uploading a model that is located in your environment
Follow these steps to upload a foundation model that is located in your own environment to PVC storage.
Prerequisites:
Some of the environment variables that are used in code snippets in this document are global variables that are set at the Cloud Pak for Data installation stage. For more information, see Setting up installation environment variables.
To set up storage and upload a model that is located in your environment:
1. Set up basic environment variables:

```sh
export MODEL_PATH="<Path to the directory where your model is located>"
```
2. Check the model size. If you have Git Large File Storage (`git lfs`) installed in your environment, navigate to the folder that contains your model and then check the model size:

```sh
cd ${MODEL_PATH}
git lfs ls-files -s
```

Example output:

```
root@wmlubntu1:~/falcon-7b# git lfs ls-files -s
1c4b989693 - pytorch_model-00001-of-00002.bin (10 GB)
11822397cd - pytorch_model-00002-of-00002.bin (4.5 GB)
```

If you don't have `git lfs` installed in your environment, you must obtain this information from the model builder or from the repository where the model was originally downloaded. For models that are hosted on Hugging Face, all file sizes are visible when you click the model name and then select the Files and versions tab.
3. Calculate the total size of the model and add a 100% buffer to the result. For example, if the model size is 14.5 GB, the size of the PVC that you must create is 29 GB. Set the buffered model size, as a whole number of GiB, as an environment variable:

```sh
export MODEL_SIZE="<calculated model size>"
```
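The buffer arithmetic can also be scripted. A minimal sketch, which doubles the total size and rounds up to a whole number of GiB (the 14.5 GB figure is the example value from the previous step; `MODEL_FILES_GB` is a hypothetical variable name):

```shell
# Double the total model size (100% buffer) and round up to an integer GiB value.
MODEL_FILES_GB="14.5"   # example total from `git lfs ls-files -s`
MODEL_SIZE=$(awk -v s="$MODEL_FILES_GB" 'BEGIN { printf "%d", (s * 2) + 0.999 }')
echo "${MODEL_SIZE}"    # prints 29 for a 14.5 GB model
```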
4. Check the storage class for your PVC.
5. Create the PersistentVolumeClaim (PVC) storage for your custom foundation model.

First, set the name of the new PVC as an environment variable:

```sh
export PVC_NAME="<The name of the new PVC>"
```

Next, run this code:

```sh
cat <<EOF | oc apply -f -
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: ${PVC_NAME}
  namespace: ${PROJECT_CPD_INST_OPERANDS}
spec:
  storageClassName: ${STG_CLASS_FILE}
  accessModes:
  - ReadWriteMany
  resources:
    requests:
      storage: ${MODEL_SIZE}Gi
EOF
```

To learn more about creating a new PVC or using an existing PVC with storage volumes, see Managing storage volumes.
6. After creating the PVC, wait two minutes and then run this command to verify that the PVC is bound:

```sh
oc get pvc ${PVC_NAME} -n ${PROJECT_CPD_INST_OPERANDS} -o jsonpath='{.status.phase}'
```

Expected result:

```
Bound
```
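Instead of waiting a fixed two minutes, you can poll until the claim binds. A minimal sketch of the poll loop, with a stub `get_phase` function standing in for the `oc get pvc` query:

```shell
# Poll until the PVC phase is "Bound", up to ~5 minutes.
# get_phase is a stub; in practice, replace its body with:
#   oc get pvc "${PVC_NAME}" -n "${PROJECT_CPD_INST_OPERANDS}" -o jsonpath='{.status.phase}'
get_phase() { echo "Bound"; }

phase=""
for _ in $(seq 1 30); do
  phase=$(get_phase)
  if [ "$phase" = "Bound" ]; then
    break
  fi
  sleep 10
done
echo "PVC phase: ${phase}"
```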
7. Create a custom standalone job that copies the model content from its source location to the PVC path.

The following example copies a custom model that is stored in an IBM Cloud Object Storage (COS) bucket.

First, create the COS secret in the same namespace as the PVC:

```sh
oc create secret generic aws-credentials \
  -n ${PROJECT_CPD_INST_OPERANDS} \
  --from-literal=AWS_ACCESS_KEY_ID=<your access key id> \
  --from-literal=AWS_SECRET_ACCESS_KEY=<your secret access key>
```

Then, set up the necessary environment variables:

```sh
export ENDPOINT="<Endpoint for the s3 bucket>"
export BUCKET_NAME="<Name of the bucket where the model is located>"
export MODEL_PATH_IN_BUCKET="<Model path in bucket>"
export COPY_JOB_NAME="<Name of the job that copies the model from COS>"
```

Next, create and run a job that copies the model from the IBM COS bucket to your PVC:

```sh
cat <<EOF | oc apply -f -
apiVersion: batch/v1
kind: Job
metadata:
  name: ${COPY_JOB_NAME}
  namespace: ${PROJECT_CPD_INST_OPERANDS}
spec:
  template:
    spec:
      containers:
      - name: aws-cli
        image: amazon/aws-cli:latest
        command: ["sh","-c"]
        args:
        - aws --endpoint-url ${ENDPOINT} s3 cp s3://${BUCKET_NAME}/${MODEL_PATH_IN_BUCKET} /model --recursive
        env:
        - name: AWS_ACCESS_KEY_ID
          valueFrom:
            secretKeyRef:
              name: aws-credentials
              key: AWS_ACCESS_KEY_ID
        - name: AWS_SECRET_ACCESS_KEY
          valueFrom:
            secretKeyRef:
              name: aws-credentials
              key: AWS_SECRET_ACCESS_KEY
        - name: BUCKET_NAME
          value: ${BUCKET_NAME}
        - name: ENDPOINT
          value: ${ENDPOINT}
        - name: MODEL_PATH
          value: ${MODEL_PATH_IN_BUCKET}
        volumeMounts:
        - name: pvc-mount
          mountPath: /model
      restartPolicy: Never
      volumes:
      - name: pvc-mount
        persistentVolumeClaim:
          claimName: ${PVC_NAME}
EOF
```
8. Verify that the job was created:

```sh
oc get job ${COPY_JOB_NAME} -n ${PROJECT_CPD_INST_OPERANDS}
```

Expected output:

```
NAME         COMPLETIONS   DURATION   AGE
<job name>   1/1           xx         xx
```
9. Check the job status:

```sh
oc get job ${COPY_JOB_NAME} -n ${PROJECT_CPD_INST_OPERANDS} -o jsonpath='{.status.conditions[?(@.type=="Complete")].status}'
```

Expected output:

```
True
```
10. Optional: Create and run a job that converts your model to the safetensors and fast-tokenizer formats. You can skip this step if you are sure that your model already meets these criteria. The fastest way to check is to open your model repository and verify whether it contains a `tokenizer.json` file and `.safetensors` files.

First, set the name and tag of the text generation inference server (TGIS) image as environment variables:

```sh
export TGIS_IMAGE_NAME="quay.io/modh/text-generation-inference"
export TGIS_IMAGE_TAG="rhoai-2.8-58cac74"
```

Then, set up an environment variable for your job name:

```sh
export CONV_JOB_NAME="<Name of the job>"
```

Next, create and run the converter job. The backslashes before `${MODEL_PATH}` prevent your local shell from expanding the variable; it is resolved inside the container, where it is set to `/model`:

```sh
cat <<EOF | oc apply -f -
apiVersion: batch/v1
kind: Job
metadata:
  name: ${CONV_JOB_NAME}
  namespace: ${PROJECT_CPD_INST_OPERANDS}
spec:
  template:
    spec:
      containers:
      - name: models-safetensor
        image: ${TGIS_IMAGE_NAME}:${TGIS_IMAGE_TAG}
        env:
        - name: MODEL_PATH
          value: /model
        command: ["/bin/sh", "-c"]
        args:
        - |
          text-generation-server convert-to-safetensors \${MODEL_PATH}
          text-generation-server convert-to-fast-tokenizer \${MODEL_PATH}
        volumeMounts:
        - mountPath: /model
          name: byom-model
      restartPolicy: Never
      volumes:
      - name: byom-model
        persistentVolumeClaim:
          claimName: ${PVC_NAME}
EOF
```

Verify that the job was created:

```sh
oc get job ${CONV_JOB_NAME} -n ${PROJECT_CPD_INST_OPERANDS}
```

Expected output:

```
NAME         COMPLETIONS   DURATION   AGE
<job name>   1/1           xx         xx
```

Finally, check the job status:

```sh
oc get job ${CONV_JOB_NAME} -n ${PROJECT_CPD_INST_OPERANDS} -o jsonpath='{.status.conditions[?(@.type=="Complete")].status}'
```

Expected output:

```
True
```
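To confirm that the conversion produced the expected artifacts, you can list the model directory from any pod that mounts the PVC. A minimal sketch of the file check, assuming a hypothetical `MODEL_DIR` variable that points at the mounted model path (defaulting to `/model`):

```shell
# Check for the converted artifacts in the model directory.
# MODEL_DIR is a hypothetical variable; inside a pod that mounts the PVC,
# the model path is /model.
MODEL_DIR="${MODEL_DIR:-/model}"
if [ -f "${MODEL_DIR}/tokenizer.json" ] && ls "${MODEL_DIR}"/*.safetensors >/dev/null 2>&1; then
  echo "converted"
else
  echo "missing artifacts"
fi
```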
Parent topic: Setting up storage and uploading the model