Run with Kubernetes and Knative Serving
This topic walks you through the steps to serve pretrained Watson NLP models using Knative Serving in a Red Hat OpenShift cluster.
You will create a Knative Service that runs the Watson NLP Runtime. Pods of this Knative Service specify Watson NLP pretrained model images as init containers. These init containers run to completion before the main application container starts in the pod, provisioning the models to the pod's emptyDir volume. When the Watson NLP Runtime container starts, it loads the models and begins serving them.
With this approach, models are kept in container images that are separate from the runtime container image. To change the set of served models, you need only update the Knative Service manifest.

Prerequisites
- Install Docker Desktop.
- Ensure that you have access to an OpenShift Container Platform account with cluster administrator access, then follow the instructions below to install Knative Serving in your cluster:
  - Install the OpenShift Serverless Operator.
  - Install Knative Serving.
- Install the Red Hat OpenShift CLI (oc) and log in to the OpenShift cluster.
- Create a Docker registry secret in the Kubernetes project that grants access to the Watson NLP Runtime and pretrained models (see the example command after this list).
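As a reference for the last prerequisite, here is a minimal sketch of creating the registry secret. It assumes you authenticate to the IBM Entitled Registry (cp.icr.io) with an entitlement key, and it uses the secret name watson-nlp to match the imagePullSecrets entry in the manifest in Step 2:
# Create a pull secret for the IBM Entitled Registry.
# Replace <your-entitlement-key> with your own key; for entitlement keys
# the username is the literal string "cp".
oc create secret docker-registry watson-nlp \
  --docker-server=cp.icr.io \
  --docker-username=cp \
  --docker-password=<your-entitlement-key>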
Step 1. Configure Knative
Configure Knative to enable init containers and empty directories.
Save the config-features configuration map in your current directory.
oc get configmap/config-features -n knative-serving -o yaml > config-feature.yaml
Modify the configuration with your favorite editor by adding the two kubernetes.podspec-* lines shown below under the data section; do not modify any other section or content.
apiVersion: v1
data:
  kubernetes.podspec-init-containers: enabled
  kubernetes.podspec-volumes-emptydir: enabled
Now, apply the configuration.
oc apply -f config-feature.yaml
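Alternatively, if you prefer a one-shot command over editing the file, the same two flags can be set with a merge patch. This is an equivalent approach, not part of the original steps:
oc patch configmap/config-features -n knative-serving --type merge \
  -p '{"data":{"kubernetes.podspec-init-containers":"enabled","kubernetes.podspec-volumes-emptydir":"enabled"}}'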
Step 2. Deploy the model service
Create a Knative Service to run the Watson NLP Runtime. When a Service is created, Knative does the following:
- Creates a new immutable revision for this version of the application.
- Creates a Route, Ingress, Service, and Load Balancer for your application.
- Automatically scales replicas based on request load, including scaling to zero active replicas.
To create the Knative Service, save the following example manifest to a file, for example knative-service.yaml:
apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  name: watson-nlp-kn
spec:
  template:
    metadata:
      annotations:
        # Size the Knative queue-proxy sidecar relative to the user container.
        queue.sidecar.serving.knative.dev/resourcePercentage: "10"
    spec:
      # The model image runs as an init container and provisions the model
      # into the shared emptyDir volume before the runtime starts.
      initContainers:
      - name: ensemble-workflow-lang-en-tone-stock
        image: cp.icr.io/cp/ai/watson-nlp_classification_ensemble-workflow_lang_en_tone-stock:1.4.1
        volumeMounts:
        - name: model-directory
          mountPath: "/app/models"
        env:
        - name: ACCEPT_LICENSE
          value: 'true'
        resources:
          requests:
            memory: "100Mi"
            cpu: "100m"
          limits:
            memory: "200Mi"
            cpu: "200m"
      containers:
      - name: watson-nlp-runtime
        image: cp.icr.io/cp/ai/watson-nlp-runtime:1.1.36
        env:
        - name: ACCEPT_LICENSE
          value: 'true'
        # Directory from which the runtime loads models at startup.
        - name: LOCAL_MODELS_DIR
          value: "/app/models"
        - name: LOG_LEVEL
          value: debug
        resources:
          requests:
            memory: "1Gi"
            cpu: "500m"
          limits:
            memory: "4Gi"
            cpu: "2"
        ports:
        - containerPort: 8080
        volumeMounts:
        - name: model-directory
          mountPath: "/app/models"
      imagePullSecrets:
      - name: watson-nlp
      volumes:
      - name: model-directory
        emptyDir: {}
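Apply the manifest to create the Knative Service (assuming you saved it as knative-service.yaml, as suggested above):
oc apply -f knative-service.yaml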
Verify that the service has been created:
oc get configuration
You should see output similar to the following:
NAME            LATESTCREATED         LATESTREADY           READY   REASON
watson-nlp-kn   watson-nlp-kn-00001   watson-nlp-kn-00001   True
To check the revisions of this service:
oc get revisions
Set the URL for the service in an environment variable.
export SERVICE_URL=$(oc get ksvc watson-nlp-kn -o jsonpath="{.status.url}")
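You can print the URL to confirm it was captured. The exact value depends on your cluster's ingress domain, so the hostname below is only illustrative:
echo ${SERVICE_URL}
# Example output (your hostname will differ):
# https://watson-nlp-kn-<namespace>.apps.<cluster-domain>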
Step 3. Test Knative autoscaling
With the parameters used when creating the Service, Knative autoscales pods based on request load, including scaling to zero when there are no requests.
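Knative's scaling behavior can be tuned with autoscaling annotations on the revision template. As a sketch (these annotations are not part of this tutorial's manifest), the following would keep one replica warm and cap scale-out at five replicas:
spec:
  template:
    metadata:
      annotations:
        # Keep at least one pod running, disabling scale-to-zero for this Service.
        autoscaling.knative.dev/min-scale: "1"
        # Never scale beyond five pods, regardless of request load.
        autoscaling.knative.dev/max-scale: "5"
Leave these annotations unset here so that you can observe the scale-to-zero behavior exercised in this step.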
Run the following command to list the pods in your OpenShift Project:
oc get pods
Pods belonging to the Knative service should have the prefix watson-nlp-kn. Initially, there should be none; if you do see any, then wait for a minute or two and they will be automatically terminated.
Run the following command to trigger the Knative service to start up pods:
curl ${SERVICE_URL}
Use ctrl-c to break out of the command.
You can watch the pods being created in response to the request, and then later being terminated, using the following command:
oc get pods -w
The output will be similar to the following:
NAME                                              READY   STATUS            RESTARTS   AGE
watson-nlp-kn-00001-deployment-6f8b5d7494-cdvqb   0/2     Init:0/1          0          15s
watson-nlp-kn-00001-deployment-6f8b5d7494-cdvqb   0/2     PodInitializing   0          75s
watson-nlp-kn-00001-deployment-6f8b5d7494-cdvqb   1/2     Running           0          76s
watson-nlp-kn-00001-deployment-6f8b5d7494-cdvqb   2/2     Running           0          2m
watson-nlp-kn-00001-deployment-6f8b5d7494-cdvqb   2/2     Terminating       0          3m
watson-nlp-kn-00001-deployment-6f8b5d7494-cdvqb   1/2     Terminating       0          3m20s
watson-nlp-kn-00001-deployment-6f8b5d7494-cdvqb   1/2     Terminating       0          3m30s
watson-nlp-kn-00001-deployment-6f8b5d7494-cdvqb   0/2     Terminating       0          3m32s
Use ctrl-c to break out of the command.
Step 4. Test the service
Make an inference request on the model using the REST interface by executing the following command.
curl -X POST "${SERVICE_URL}/v1/watson.runtime.nlp.v1/NlpService/ClassificationPredict" \
  -H "accept: application/json" \
  -H "grpc-metadata-mm-model-id: classification_ensemble-workflow_lang_en_tone-stock" \
  -H "content-type: application/json" \
  -d '{ "rawDocument": { "text": "Watson nlp is awesome! works in knative" }}' | jq
You will see output similar to the following:
{
  "classes": [
    {
      "className": "satisfied",
      "confidence": 0.6308287
    },
    {
      "className": "excited",
      "confidence": 0.5176963
    },
    {
      "className": "polite",
      "confidence": 0.3245624
    },
    {
      "className": "sympathetic",
      "confidence": 0.1331128
    },
    {
      "className": "sad",
      "confidence": 0.023583649
    },
    {
      "className": "frustrated",
      "confidence": 0.0158445
    },
    {
      "className": "impolite",
      "confidence": 0.0021891927
    }
  ],
  "producerId": {
    "name": "Voting based Ensemble",
    "version": "0.0.1"
  }
}
Other Resources
To see a tutorial that takes you through the steps to deploy a Watson NLP model to the Knative Serving sandbox environment on IBM Technology Zone (TechZone), check out Watson NLP - Serve Models with Kubernetes or OpenShift.