Run with Kubernetes
Running a simple deployment on Kubernetes offers an easy way to deploy a horizontally scalable, static set of NLP models alongside your existing Kubernetes workloads. The entire definition for the Watson NLP Runtime and a set of models to serve fits inside a single Kubernetes Deployment resource, and no other external dependencies are required. Through the use of initContainers, pretrained model images extract their model content into a volume shared with the runtime container, without requiring external storage.
Deploying a static set of models to Kubernetes
- Access the container images from your cluster:

  To allow your Kubernetes cluster to access the container images, use the methods from the Kubernetes documentation to store your credentials as a Kubernetes Secret. For example, use the following command to create a Secret named `ibm-entitlement-key`:

  ```
  kubectl create secret docker-registry ibm-entitlement-key --docker-server=cp.icr.io --docker-username=<your-name> --docker-password=<your-password> --docker-email=<your-email>
  ```
  where:

  - `cp.icr.io` is the registry server (the value of `--docker-server`)
  - `<your-name>` is your IBM Entitled Registry username
  - `<your-password>` is your IBM Entitled Registry password
  - `<your-email>` is your IBM Entitled Registry email address
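  Under the hood, `kubectl create secret docker-registry` stores these values as a Secret of type `kubernetes.io/dockerconfigjson`, whose data is a base64-encoded Docker config JSON. A minimal sketch of that encoding, using the same placeholder credentials as the command above (not real entitlement values):

  ```python
  import base64
  import json

  def docker_config_json(server: str, username: str, password: str, email: str) -> str:
      """Build the .dockerconfigjson payload that a docker-registry Secret stores."""
      # The "auth" field is base64 of "username:password".
      auth = base64.b64encode(f"{username}:{password}".encode()).decode()
      config = {
          "auths": {
              server: {
                  "username": username,
                  "password": password,
                  "email": email,
                  "auth": auth,
              }
          }
      }
      return json.dumps(config)

  # Placeholders standing in for your IBM Entitled Registry credentials.
  payload = docker_config_json("cp.icr.io", "<your-name>", "<your-password>", "<your-email>")
  print(payload)
  ```

  Kubernetes base64-encodes this JSON once more when it stores it in the Secret's `.dockerconfigjson` data field.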
- Deploy in Kubernetes:
To run the service in a Kubernetes cluster, ensure that you have the Kubernetes CLI (kubectl) installed on your local machine, and that you have logged into the cluster. Further, ensure that the Docker image you created above is in a container registry that is accessible from your Kubernetes cluster.
  Below is an example of a YAML file to use to deploy on your cluster:

  ```yaml
  apiVersion: apps/v1
  kind: Deployment
  metadata:
    name: watson-nlp-container
  spec:
    selector:
      matchLabels:
        app: watson-nlp-container
    replicas: 1
    template:
      metadata:
        labels:
          app: watson-nlp-container
      spec:
        initContainers:
        - name: english-syntax-model
          image: cp.icr.io/cp/ai/watson-nlp_syntax_izumo_lang_en_stock:1.0.9
          volumeMounts:
          - name: model-directory
            mountPath: "/app/models"
          env:
          - name: ACCEPT_LICENSE
            value: 'true'
        - name: english-tone-model
          image: cp.icr.io/cp/ai/watson-nlp_classification_ensemble-workflow_lang_en_tone-stock:1.0.9
          volumeMounts:
          - name: model-directory
            mountPath: "/app/models"
          env:
          - name: ACCEPT_LICENSE
            value: 'true'
        containers:
        - name: watson-nlp-container
          image: cp.icr.io/cp/ai/watson-nlp-runtime:1.1.0
          env:
          - name: ACCEPT_LICENSE
            value: "true"
          - name: LOCAL_MODELS_DIR
            value: "/app/models"
          resources:
            requests:
              memory: "4Gi"
              cpu: "1000m"
            limits:
              memory: "8Gi"
              cpu: "2000m"
          ports:
          - containerPort: 8085
          volumeMounts:
          - name: model-directory
            mountPath: "/app/models"
        imagePullSecrets:
        - name: ibm-entitlement-key
        volumes:
        - name: model-directory
          emptyDir: {}
  ---
  apiVersion: v1
  kind: Service
  metadata:
    name: watson-nlp-container
  spec:
    type: ClusterIP
    selector:
      app: watson-nlp-container
    ports:
    - port: 8085
      protocol: TCP
      targetPort: 8085
  ```
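  The key wiring in this manifest is that every init container and the runtime container mount the same `emptyDir` volume at `/app/models`, and `LOCAL_MODELS_DIR` points at that path: each model image unpacks its content into the volume before the runtime starts, and the runtime then loads whatever it finds there. A small sketch (plain Python dicts mirroring the manifest fields above) that makes this invariant explicit:

  ```python
  # Plain-dict mirror of the relevant fields from the manifest above.
  pod_spec = {
      "initContainers": [
          {"name": "english-syntax-model",
           "volumeMounts": [{"name": "model-directory", "mountPath": "/app/models"}]},
          {"name": "english-tone-model",
           "volumeMounts": [{"name": "model-directory", "mountPath": "/app/models"}]},
      ],
      "containers": [
          {"name": "watson-nlp-container",
           "env": [{"name": "LOCAL_MODELS_DIR", "value": "/app/models"}],
           "volumeMounts": [{"name": "model-directory", "mountPath": "/app/models"}]},
      ],
      "volumes": [{"name": "model-directory", "emptyDir": {}}],
  }

  def mounts_of(container):
      """Set of (volume name, mount path) pairs for one container."""
      return {(m["name"], m["mountPath"]) for m in container["volumeMounts"]}

  runtime = pod_spec["containers"][0]
  shared = mounts_of(runtime)

  # Every init container must share a volume and path with the runtime container...
  assert all(mounts_of(ic) & shared for ic in pod_spec["initContainers"])

  # ...and LOCAL_MODELS_DIR must point at that shared mount path.
  local_models_dir = next(e["value"] for e in runtime["env"] if e["name"] == "LOCAL_MODELS_DIR")
  assert any(path == local_models_dir for _, path in shared)
  print("model volume wiring OK")
  ```

  If you add further model images as init containers, they only need to follow the same pattern: mount `model-directory` at `/app/models` and set `ACCEPT_LICENSE`.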
- Run on Kubernetes:

  Run the following command to create the Deployment and Service:

  ```
  kubectl apply -f Runtime/deployment/deployment.yaml
  ```

  Check that the pod and service are running:

  ```
  kubectl get pods
  kubectl get svc
  ```
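  If you want to check pod status in a script rather than by eye, you can ask kubectl for JSON (`kubectl get pods -o json`) and inspect each pod's phase. A hedged sketch, run here against a hard-coded sample of that output shape (the pod name suffix is hypothetical) rather than a live cluster:

  ```python
  import json

  # Trimmed sample of the structure returned by `kubectl get pods -o json`.
  # In a real script you would capture this via
  # subprocess.run(["kubectl", "get", "pods", "-o", "json"], capture_output=True).
  sample = json.loads("""
  {
    "items": [
      {
        "metadata": {"name": "watson-nlp-container-6f7c9d-abcde"},
        "status": {"phase": "Running"}
      }
    ]
  }
  """)

  def running_pods(pod_list: dict, prefix: str) -> list:
      """Names of pods whose name starts with prefix and whose phase is Running."""
      return [
          item["metadata"]["name"]
          for item in pod_list["items"]
          if item["metadata"]["name"].startswith(prefix)
          and item["status"]["phase"] == "Running"
      ]

  print(running_pods(sample, "watson-nlp-container"))
  ```

  An empty result means the deployment's pods are not yet in the `Running` phase; `kubectl describe pod <pod-name>` shows why (for example, an image pull failure if the `ibm-entitlement-key` Secret is missing).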
To see a tutorial that takes you through the steps to build a standalone container image to serve Watson NLP models and run it on a Kubernetes or OpenShift cluster, check out Serve Models on Kubernetes or OpenShift using Standalone Containers on GitHub.
Once you have your runtime server working, see Accessing client libraries and tools to continue.