Create online deployment API
You can use this API to create an online deployment for a model.
HTTP method and URI path
POST /v3/published_models/${modelId}/deployments
Standard headers
Use the following standard HTTP headers with this request:
Content-Type: application/json
Authorization: <Bearer token>
Required authorizations
The user ID associated with the token that is specified in the request header must be granted one of the following roles:
- sysadm
- mladm
- apiuser (only if the model was created by the user)
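As a minimal sketch of how the URI path and standard headers fit together, the following Python function prepares the POST request without sending it. The `base_url` argument and the function name are hypothetical, and the `Bearer ` prefix on the token is an assumption about how the Authorization value is formatted.

```python
from urllib.parse import quote
from urllib.request import Request


def build_create_request(base_url: str, model_id: str, token: str, body: bytes) -> Request:
    """Prepare (but do not send) the POST request that creates a deployment."""
    # base_url is a hypothetical host, for example "https://mlz.example.com:9999".
    path = f"/v3/published_models/{quote(model_id, safe='')}/deployments"
    headers = {
        "Content-Type": "application/json",
        # Assumption: the token is sent with the usual "Bearer " prefix.
        "Authorization": f"Bearer {token}",
    }
    return Request(base_url.rstrip("/") + path, data=body, headers=headers, method="POST")
```

The returned object can be passed to `urllib.request.urlopen` once the token belongs to a user with one of the roles listed above.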
Query parameter
| Parameter | Type | Required or optional | Description |
|---|---|---|---|
| sync | Boolean | Optional | Whether to create the online deployment in sync or async mode. Valid values are true and false. If this parameter is not set, sync mode is used by default. |
In async mode, the response is returned while the deployment status is still INITIALIZING. You need to check the final status of the created online deployment with the Get deployment detail API. The status of a successfully created online deployment is ACTIVE. In sync mode, the response is returned after the deploy action is completely done.
Request body
| Parameter | Type | Required or optional | Description |
|---|---|---|---|
| servingId | String | Optional | Note: All deployments that use the same servingId must share an identical model schema, including both input and output formats. Additionally, only one deployment with a given servingId can be deployed within a specific scoringGroupId. |
| type | String | Required | Specify the deployment type: online or batch. |
| name | String | Required | A unique name for the deployment. |
| description | String | Optional | Description of the deployment. |
| author | JSON object | Optional | Specify the author information. |
| deploy_info | JSON object | Required | Specify the online deployment information used to create the deployment. deploy_info includes the scoring group ID, version sequence, engine type, and model version href. |
Machine Learning for IBM z/OS® delivers exceptional throughput and performance for inferencing tasks. However, in rare scenarios, an inferencing request may exceed the expected duration. The scoring timeout option serves as a safeguard to automatically cancel such long-running inferencing requests.
Choose a reasonable timeout value. Setting a very low timeout may cause the majority of incoming inferencing requests to be rejected, leading to real-time online scoring failures.
Using a timeout introduces a minor performance overhead. Enable it only if necessary as a safeguard.
```json
{
  "type": "online",
  "name": "CVT-online",
  "description": "This is online deployment created for tests",
  "author": {
    "name": "John Smith",
    "email": "john.smith@example.com"
  },
  "deploy_info": {
    "scoringGroupId": "8da4702c-7636-4e0f-92ad-214b1493d50f",
    "engineType": "spark",
    "artifactVersionHref": "/v3/ml_assets/models/2e4d4282-cb64-4b27-b3c3-2b423cb0cc36/versions/208be2e7-5cb7-4f4a-a08a-5673ff1717fa"
  }
}
```
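The request body above can also be assembled programmatically. The following is a minimal sketch, not a definitive client: the function name and its arguments are hypothetical, and only the field names that appear in the sample body are used. Optional fields are left out unless the caller supplies them.

```python
import json
from typing import Optional


def build_deployment_body(
    name: str,
    scoring_group_id: str,
    engine_type: str,
    artifact_version_href: str,
    description: Optional[str] = None,
    author: Optional[dict] = None,
) -> str:
    """Return a JSON request body for an online deployment."""
    body = {
        "type": "online",  # required
        "name": name,  # required, must be unique
        "deploy_info": {  # required
            "scoringGroupId": scoring_group_id,
            "engineType": engine_type,
            "artifactVersionHref": artifact_version_href,
        },
    }
    # Optional fields are added only when provided.
    if description is not None:
        body["description"] = description
    if author is not None:
        body["author"] = author
    return json.dumps(body)
```

For example, `build_deployment_body("CVT-online", "8da4702c-7636-4e0f-92ad-214b1493d50f", "spark", "/v3/ml_assets/models/.../versions/...")` yields a body with the three required fields only; the href shown here is a placeholder to be replaced with a real model version href.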
Expected response
On completion, the service returns an HTTP response with a status code that indicates whether your request completed successfully. Status code 201 indicates successful completion, and a submission ID is returned.
```json
{
  "metadata": {
    "url": "https://127.0.0.0:9999/v3/published_models/179e5d7e-05a8-4ec9-bf23-381456663565/deployments/17650863-0dc2-4b32-81de-7c6f407085f1",
    "guid": "17650863-0dc2-4b32-81de-7c6f407085f1",
    "modified_at": "2022-12-28T06:50:25.387Z",
    "model_status": [],
    "created_at": "2022-12-28T06:50:09.135Z"
  },
  "entity": {
    "author": {
      "email": "john.smith@example.com",
      "name": "wmlz11"
    },
    "deploy_info": {
      "artifactVersionHref": "/v3/ml_assets/models/179e5d7e-05a8-4ec9-bf23-381456663565/versions/4dd86b2c-c09e-4328-acc1-999a57ab09eb",
      "engineType": "spark",
      "nextFire": "0",
      "scheduleStatus": "",
      "scoringGroupId": "a9955c98-2a95-429c-9413-9c86edbe2017",
      "versionSeq": "1"
    },
    "deployed_version": {
      "guid": "4dd86b2c-c09e-4328-acc1-999a57ab09eb",
      "url": "/v3/ml_assets/models/179e5d7e-05a8-4ec9-bf23-381456663565/versions/4dd86b2c-c09e-4328-acc1-999a57ab09eb"
    },
    "description": "This is online deployment created for tests",
    "model_type": "mllib",
    "name": "CVT-online",
    "published_model": {
      "author": {
        "name": "wmlz11"
      },
      "created_at": "2022-12-23T04:57:16.375Z",
      "description": "",
      "guid": "179e5d7e-05a8-4ec9-bf23-381456663565",
      "name": "churn",
      "url": "https://127.0.0.0:9999/v3/published_models/179e5d7e-05a8-4ec9-bf23-381456663565"
    },
    "runtime_environment": "spark",
    "scoring_url": "https://127.0.0.0:15779/iml/v2/scoring/online/17650863-0dc2-4b32-81de-7c6f407085f1",
    "status": "ACTIVE",
    "type": "online"
  }
}
```
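When the deployment is created in async mode, the status in the response may still be INITIALIZING rather than ACTIVE, so a caller typically polls until the status settles. The following is a minimal polling sketch under stated assumptions: `fetch_status` is a hypothetical callable that wraps the Get deployment detail API (whose request format is not described here) and returns the current status string, and the timeout and interval defaults are arbitrary.

```python
import time
from typing import Callable


def wait_for_final_status(
    fetch_status: Callable[[], str],
    timeout_seconds: float = 300.0,
    interval_seconds: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll until the deployment leaves INITIALIZING, then return its status."""
    deadline = time.monotonic() + timeout_seconds
    while True:
        status = fetch_status()
        if status != "INITIALIZING":
            # The caller decides what to do with the result; ACTIVE means success.
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError("deployment is still INITIALIZING")
        sleep(interval_seconds)
```

The caller should compare the returned value with ACTIVE, because any other final status means the deployment was not created successfully.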
HTTP status codes
For unsuccessful requests, the service returns the status codes that are described in Table 3.
| HTTP status code | Error response | Description |
|---|---|---|
| 400 | empty_deploy_info | No deploy_info is provided. Specify deploy_info. |
| 400 | parsing_error | No type is provided. Specify type. |
| 400 | not_supported_deployment_type | Invalid type is provided. Specify type as online or batch. |
| 400 | duplicate_deployment_name | Specify a unique online deployment name. |
| 400 | duplicate_deployment | This version of the model has already been deployed in the scoring service. Specify another version of this model or another scoring service. |
| 400 | timeout_not_supported | The timeout value must be an integer greater than 0. Timeout is supported only for the online deployment type with the PMML engine. |
| 500 | not_found | artifactVersionHref is not valid. Specify the correct model version href. |
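Several of the 400 errors in the table can be caught on the client before the request is sent. The sketch below mirrors those conditions and is not the service's own validation logic. The function name is hypothetical, the timeout is passed as a separate argument because its field name is not shown in this section, and the engine type value `pmml` is an assumption.

```python
from typing import List, Optional


def precheck(body: dict, timeout: Optional[object] = None) -> List[str]:
    """Return the error identifiers that the request would likely trigger."""
    errors = []
    deploy_info = body.get("deploy_info") or {}
    if not deploy_info:
        errors.append("empty_deploy_info")

    deployment_type = body.get("type")
    if deployment_type is None:
        errors.append("parsing_error")
    elif deployment_type not in ("online", "batch"):
        errors.append("not_supported_deployment_type")

    if timeout is not None:
        # bool is a subclass of int in Python, so it is excluded explicitly.
        is_positive_int = (
            isinstance(timeout, int) and not isinstance(timeout, bool) and timeout > 0
        )
        # Assumption: the PMML engine is identified by the value "pmml".
        is_pmml = str(deploy_info.get("engineType", "")).lower() == "pmml"
        if not (is_positive_int and deployment_type == "online" and is_pmml):
            errors.append("timeout_not_supported")
    return errors
```

Duplicate names, duplicate deployments, and an invalid artifactVersionHref depend on server-side state, so they can only be detected from the service response.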