Managing deployed models

After you deploy a custom foundation model that is tuned with a PEFT technique, you can programmatically update, scale, or delete the deployment.

Managing PEFT model deployment with REST API

You can use the watsonx.ai REST API to manage custom foundation model deployments and LoRA adapter deployments. The process for retrieving, updating, and deleting a deployment is the same for both deployment types.

Retrieving the list of deployments

To retrieve the list of deployments in a project or deployment space, send a GET request to the /ml/v4/deployments endpoint and provide the project ID or space ID.

To filter all deployments pointing to the custom foundation model, set the value of the type parameter to custom_foundation_model, as shown in the following code sample:

curl -X GET "https://<HOST>/ml/v4/deployments?version=2024-01-29&space_id=<space_id>&type=custom_foundation_model" \
-H "Authorization: Bearer <token>" 

To filter all deployments pointing to the LoRA adapter, set the value of the type parameter to lora_adapter, as shown in the following code sample:

curl -X GET "https://<HOST>/ml/v4/deployments?version=2024-01-29&space_id=<space_id>&type=fine_tune" \
-H "Authorization: Bearer <token>" 
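
The samples above scope the request to a deployment space. Because the endpoint accepts either a project ID or a space ID, the following is a minimal sketch of building the list URL for either scope. The helper name deployments_url is hypothetical, and the sketch assumes that the project scope is passed as the project_id query parameter in the same way that space_id is.

```shell
#!/usr/bin/env bash
# Hypothetical helper: prints the list URL for a project or a space.
# Arguments: host, scope ("space" or "project"), scope ID, deployment type.
deployments_url() {
  local host="$1" scope="$2" id="$3" type="$4"
  case "$scope" in
    space|project) ;;
    *) echo "scope must be 'space' or 'project'" >&2; return 1 ;;
  esac
  # Assumption: the query parameter is named <scope>_id.
  printf 'https://%s/ml/v4/deployments?version=2024-01-29&%s_id=%s&type=%s\n' \
    "$host" "$scope" "$id" "$type"
}

# Usage (not executed here; replace the placeholders first):
# curl -X GET "$(deployments_url "<HOST>" project "<project_id>" custom_foundation_model)" \
#   -H "Authorization: Bearer <token>"
```

The function only builds the URL, so you can check the result before you send a request.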

Updating the deployment

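To update deployment metadata, send a PATCH request to the /ml/v4/deployments/{deployment_id} endpoint. The request body is a list of operations, and each operation names the field to change by its path. You can update paths such as /name, /description, /tags, /custom, /asset, /online/parameters, and /online/parameters/serving_name. The following code sample replaces the deployment name: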

curl -X PATCH "https://<HOST>/ml/v4/deployments/<deployment_id>?version=2024-01-29&space_id=<space_id>" \
-H "Authorization: Bearer <token>" \
-H "content-type: application/json" \
--data '[{
 "op": "replace",
 "path": "/name",
 "value": "<updated deployment name>"
}]'
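
The request body in the sample above is an array, which suggests that several operations can be sent together. The following is a minimal sketch of a request that updates /description and /tags in one call. It assumes that the API accepts more than one operation per request and that /tags takes an array of strings; the function name update_deployment and all values are placeholders.

```shell
#!/usr/bin/env bash
# Two "replace" operations in one request body (placeholder values).
# Assumption: /tags expects an array of strings.
PATCH_BODY='[
  {"op": "replace", "path": "/description", "value": "<updated description>"},
  {"op": "replace", "path": "/tags", "value": ["<tag_1>", "<tag_2>"]}
]'

# Hypothetical wrapper around the PATCH request shown above.
# Arguments: host, deployment ID, space ID, bearer token.
update_deployment() {
  curl -X PATCH "https://$1/ml/v4/deployments/$2?version=2024-01-29&space_id=$3" \
    -H "Authorization: Bearer $4" \
    -H "content-type: application/json" \
    --data "$PATCH_BODY"
}

# Usage (not executed here):
# update_deployment "<HOST>" "<deployment_id>" "<space_id>" "<token>"
```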

Deleting the deployment

Before deleting the custom foundation model deployment, you must delete the associated LoRA or QLoRA adapter deployments. After the LoRA or QLoRA adapter deployment is deleted, you can delete the custom foundation model deployment.

Deleting a LoRA adapter deployment removes only that adapter deployment and does not affect the custom foundation model deployment.

To delete a custom foundation model deployment or a LoRA or QLoRA adapter deployment, send a DELETE request to the /ml/v4/deployments/{deployment_id} REST API endpoint:

curl -vk -X DELETE "https://<HOST>/ml/v4/deployments/<deployment_id>?version=2024-01-29&space_id=<space_id>" -H "Authorization: Bearer <token>"
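
The required order (adapter deployments first, then the custom foundation model deployment) can be sketched as a small script. This is an illustration under stated assumptions, not a supported tool: the function names and the DRY_RUN switch are hypothetical, and you must supply the adapter deployment IDs yourself. With DRY_RUN set to 1, which is the default in this sketch, the script only prints the requests that it would send.

```shell
#!/usr/bin/env bash
# DRY_RUN=1 prints the planned requests instead of sending them.
DRY_RUN="${DRY_RUN:-1}"

# Arguments: host, space ID, bearer token, deployment ID.
delete_deployment() {
  local url="https://$1/ml/v4/deployments/$4?version=2024-01-29&space_id=$2"
  if [ "$DRY_RUN" = "1" ]; then
    echo "DELETE $url"
  else
    # --fail makes curl exit with a nonzero status on an HTTP error response.
    curl --fail -X DELETE "$url" -H "Authorization: Bearer $3"
  fi
}

# Arguments: host, space ID, bearer token, base deployment ID, adapter IDs...
delete_model_and_adapters() {
  local host="$1" space="$2" token="$3" base="$4"
  shift 4
  local adapter
  # Delete every adapter deployment first; stop if one deletion fails.
  for adapter in "$@"; do
    delete_deployment "$host" "$space" "$token" "$adapter" || return 1
  done
  # Only then delete the custom foundation model deployment.
  delete_deployment "$host" "$space" "$token" "$base"
}

# Usage (not executed here):
# delete_model_and_adapters "<HOST>" "<space_id>" "<token>" "<base_id>" "<adapter_id_1>" "<adapter_id_2>"
```

Stopping at the first failed adapter deletion keeps the script from attempting to delete the custom foundation model deployment while an adapter deployment still depends on it.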

Scaling the deployment

To change the number of copies of your deployment, send a PATCH request that replaces the /hardware_spec value and sets num_nodes to the number of copies that you want:

curl -X PATCH "https://<HOST>/ml/v4/deployments/<deployment_id>?version=2024-01-29&space_id=<space_id>" \
-H "Authorization: Bearer <token>" \
-H "content-type: application/json" \
--data '[{
 "op": "replace",
 "path": "/hardware_spec",
 "value": {
     "name": "WX-S",
     "num_nodes": 2
 }
}]'
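
If you scale deployments from a script, it can help to generate the request body from the copy count rather than editing the JSON by hand. The following minimal sketch prints the same body as the sample above for a given hardware specification name and copy count. The function name scale_patch is hypothetical, and the check for a positive integer is a local safeguard, not a documented API rule.

```shell
#!/usr/bin/env bash
# Hypothetical helper: prints the scaling request body.
# Arguments: hardware specification name, number of copies.
scale_patch() {
  case "$2" in
    # Reject empty values, non-digits, and values with a leading zero.
    ''|*[!0-9]*|0*) echo "num_nodes must be a positive integer" >&2; return 1 ;;
  esac
  printf '[{"op": "replace", "path": "/hardware_spec", "value": {"name": "%s", "num_nodes": %s}}]\n' \
    "$1" "$2"
}

# Usage (not executed here):
# curl -X PATCH "https://<HOST>/ml/v4/deployments/<deployment_id>?version=2024-01-29&space_id=<space_id>" \
#   -H "Authorization: Bearer <token>" \
#   -H "content-type: application/json" \
#   --data "$(scale_patch WX-S 3)"
```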
