Managing deployed models
After you deploy your custom foundation models tuned with a PEFT technique, you can manage the deployment by updating details, scaling, or deleting the deployment programmatically.
Managing PEFT model deployment with REST API
You can use the watsonx.ai REST API to manage custom foundation model deployments and LoRA adapter deployments. The process for retrieving, updating, and deleting a deployment is the same for both deployment types.
Retrieving the list of deployments
To retrieve the list of deployments in the specified project or deployment space, send a GET request to the /ml/v4/deployments endpoint and provide a project or space ID.
To filter all deployments pointing to the custom foundation model, set the value of the type parameter to custom_foundation_model, as shown in the following code sample:
curl -X GET "https://<HOST>/ml/v4/deployments?version=2024-01-29&space_id=<space_id>&type=custom_foundation_model" \
-H "Authorization: Bearer <token>"
To filter all deployments pointing to the LoRA adapter, set the value of the type parameter to lora_adapter, as shown in the following code sample:
curl -X GET "https://<HOST>/ml/v4/deployments?version=2024-01-29&space_id=<space_id>&type=lora_adapter" \
-H "Authorization: Bearer <token>"
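Because the two list requests differ only in the type value, the URL can be built by a small helper. The following is a minimal sketch, not part of the watsonx.ai tooling: the function name deployments_url and the variables HOST, SPACE_ID, and TOKEN are hypothetical placeholders, and the version date is copied from the samples above.

```shell
# Hypothetical helper: build the list-deployments URL for one deployment type.
# $1: host name, $2: space ID, $3: value for the type query parameter
deployments_url() {
  printf 'https://%s/ml/v4/deployments?version=2024-01-29&space_id=%s&type=%s\n' \
    "$1" "$2" "$3"
}

# Dry run: print the URL instead of calling the API.
deployments_url "example.com" "my-space-id" "custom_foundation_model"

# Example call against a real host (assumes HOST, SPACE_ID and TOKEN are set):
# curl -X GET "$(deployments_url "$HOST" "$SPACE_ID" custom_foundation_model)" \
#   -H "Authorization: Bearer $TOKEN"
```

The helper does not URL-encode its arguments, so it assumes that the IDs contain no characters that need escaping in a query string.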
Updating the deployment
To update deployment metadata, send a PATCH request with a JSON Patch document. You can update fields by using paths such as /name, /description, /tags, /custom, /asset, /online/parameters, and /online/parameters/serving_name, as shown in the following code sample:
curl -X PATCH "https://<HOST>/ml/v4/deployments/<deployment_id>?version=2024-01-29&space_id=<space_id>" \
-H "Authorization: Bearer <token>" \
-H "content-type: application/json" \
--data '[{
"op": "replace",
"path": "/name",
"value": "<updated deployment name>"
}]'
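The request body is an array of operations, so several string fields could be updated in one request. The following sketch builds such a body in plain shell. It assumes that the endpoint accepts more than one operation per request, as the array form suggests but the sample above does not confirm. The helper names json_escape, replace_op, and patch_body are hypothetical.

```shell
# Escape backslashes and double quotes so a plain string can sit inside JSON.
# Control characters such as newlines are not handled in this sketch.
json_escape() {
  printf '%s' "$1" | sed 's/\\/\\\\/g; s/"/\\"/g'
}

# Print one "replace" operation for a string-valued field.
# $1: path (for example /name), $2: new string value
replace_op() {
  printf '{"op": "replace", "path": "%s", "value": "%s"}' \
    "$(json_escape "$1")" "$(json_escape "$2")"
}

# Join any number of operations into one JSON array.
patch_body() {
  body=""
  for op in "$@"; do
    if [ -n "$body" ]; then body="$body, "; fi
    body="$body$op"
  done
  printf '[%s]' "$body"
}

payload="$(patch_body \
  "$(replace_op /name 'My "tuned" model')" \
  "$(replace_op /description 'Updated description')")"
echo "$payload"

# Example call (assumes HOST, DEPLOYMENT_ID, SPACE_ID and TOKEN are set):
# curl -X PATCH "https://$HOST/ml/v4/deployments/$DEPLOYMENT_ID?version=2024-01-29&space_id=$SPACE_ID" \
#   -H "Authorization: Bearer $TOKEN" -H "content-type: application/json" \
#   --data "$payload"
```

Building the body in a variable keeps quoting problems out of the curl command line. The sketch covers string values only; paths such as /tags take structured values that would need their own formatting.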
Deleting the deployment
Before you delete a custom foundation model deployment, you must delete all associated LoRA or QLoRA adapter deployments.
Deleting a LoRA adapter deployment removes only that adapter deployment and does not affect the custom foundation model deployment.
To delete a custom foundation model deployment or a LoRA or QLoRA adapter deployment, send a DELETE request to the /ml/v4/deployments/{deployment_id} REST API endpoint:
curl -vk -X DELETE "https://<HOST>/ml/v4/deployments/<deployment_id>?version=2024-01-29&space_id=<space_id>" -H "Authorization: Bearer <token>"
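The required deletion order (adapter deployments first, then the custom foundation model deployment) can be made explicit in a script. The following sketch only prints the DELETE URLs in that order; the function name delete_order and all IDs are hypothetical placeholders.

```shell
# Print DELETE URLs in the required order: adapter deployments first,
# the custom foundation model deployment last.
# $1: host name, $2: space ID, $3: base model deployment ID,
# remaining arguments: adapter deployment IDs
delete_order() {
  host="$1"; space_id="$2"; base_id="$3"
  shift 3
  for id in "$@" "$base_id"; do
    printf 'https://%s/ml/v4/deployments/%s?version=2024-01-29&space_id=%s\n' \
      "$host" "$id" "$space_id"
  done
}

# Dry run: show the order without calling the API.
delete_order "example.com" "my-space-id" "base-123" "adapter-1" "adapter-2"

# Example use (assumes HOST, SPACE_ID, TOKEN and the deployment IDs are set):
# delete_order "$HOST" "$SPACE_ID" "$BASE_ID" "$ADAPTER_ID" | while read -r url; do
#   curl -X DELETE "$url" -H "Authorization: Bearer $TOKEN"
# done
```

The loop in the comment sends the requests one after another but does not check that each adapter deployment is fully removed before the next request, so a real script would likely need to confirm each deletion first.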
Scaling the deployment
To scale the deployment, replace the /hardware_spec value and set num_nodes to the number of copies that you want:
curl -X PATCH "https://<HOST>/ml/v4/deployments/<deployment_id>?version=2024-01-29&space_id=<space_id>" \
-H "Authorization: Bearer <token>" \
-H "content-type: application/json" \
--data '[{
"op": "replace",
"path": "/hardware_spec",
"value": {
"name": "WX-S",
"num_nodes": 2
}
}]'
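When the number of copies comes from a variable, checking it before it is placed in the request body avoids sending malformed JSON. The following sketch uses a hypothetical helper named scale_patch, and the hardware specification name WX-S is taken from the sample above.

```shell
# Print the request body that sets the hardware specification.
# $1: hardware specification name, $2: number of copies (num_nodes)
# The name is inserted as-is, so it must not contain quotes or backslashes.
scale_patch() {
  case "$2" in
    ''|*[!0-9]*)
      echo "num_nodes must be a whole number" >&2
      return 1
      ;;
  esac
  printf '[{"op": "replace", "path": "/hardware_spec", "value": {"name": "%s", "num_nodes": %s}}]' \
    "$1" "$2"
}

# Dry run: print the body for two copies.
scale_patch "WX-S" 2
echo

# Example call (assumes HOST, DEPLOYMENT_ID, SPACE_ID and TOKEN are set):
# curl -X PATCH "https://$HOST/ml/v4/deployments/$DEPLOYMENT_ID?version=2024-01-29&space_id=$SPACE_ID" \
#   -H "Authorization: Bearer $TOKEN" -H "content-type: application/json" \
#   --data "$(scale_patch WX-S 2)"
```

The check only confirms that the value is made of digits. Which values and hardware specification names the service accepts is not covered by this sketch.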