Run with Kubernetes

Running a simple deployment on Kubernetes offers an easy way to deploy a horizontally scalable, static set of NLP models alongside your existing Kubernetes workloads. The entire definition for the Watson NLP Runtime and the set of models to serve fits inside a single Kubernetes Deployment resource, and no other external dependencies are required. Through the use of initContainers, the pretrained model images extract their model content into a volume shared with the runtime container, so no external storage is required.
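The paragraph above describes a mechanical pattern: one initContainer per pretrained model image, each mounting the same shared volume that the runtime container reads models from. As a rough illustration of that pattern (this generator script is a sketch, not part of the product), the following Python snippet builds the initContainers section of a Deployment spec from a map of model images; the volume name, mount path, and image reference mirror the example manifest shown later in this page:

```python
import json

# These values mirror the example Deployment below.
MODEL_VOLUME = "model-directory"
MODELS_MOUNT_PATH = "/app/models"

def init_containers_for(model_images):
    """Build one initContainer spec per pretrained model image.

    Each initContainer mounts the shared model volume and accepts
    the license, matching the pattern used by the runtime.
    """
    containers = []
    for name, image in model_images.items():
        containers.append({
            "name": name,
            "image": image,
            "volumeMounts": [
                {"name": MODEL_VOLUME, "mountPath": MODELS_MOUNT_PATH}
            ],
            "env": [{"name": "ACCEPT_LICENSE", "value": "true"}],
        })
    return containers

models = {
    "english-syntax-model":
        "cp.icr.io/cp/ai/watson-nlp_syntax_izumo_lang_en_stock:1.0.9",
}
print(json.dumps(init_containers_for(models), indent=2))
```

Adding another model to serve is then just another entry in the map; the runtime discovers everything extracted into the shared volume at startup.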

(Figure: a simple Kubernetes deployment serving a static set of models)

Deploying a static set of models to Kubernetes

  1. Access the container images from your cluster:

    To allow your Kubernetes cluster to access the container images, use the methods from the Kubernetes documentation to store your credentials as a Kubernetes Secret. For example, use the following command to create a Secret named ibm-entitlement-key.

     kubectl create secret docker-registry ibm-entitlement-key --docker-server=cp.icr.io --docker-username=<your-name> --docker-password=<your-password> --docker-email=<your-email>
    

    where:

    • cp.icr.io is the IBM Entitled Registry server
    • your-name is your IBM Entitled Registry username
    • your-password is your IBM Entitled Registry password
    • your-email is your IBM Entitled Registry email address
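For context, `kubectl create secret docker-registry` produces a Secret of type `kubernetes.io/dockerconfigjson` whose payload embeds the credentials, with an `auth` field that is the base64-encoded `username:password` pair. The Python sketch below (with placeholder credentials, not real ones) mirrors that payload so you can see what the command stores:

```python
import base64
import json

# Placeholder credentials -- substitute your IBM Entitled Registry values.
server = "cp.icr.io"
username = "<your-name>"
password = "<your-password>"
email = "<your-email>"

# The auth field is base64("username:password"), as in a Docker config file.
auth = base64.b64encode(f"{username}:{password}".encode()).decode()

dockerconfigjson = json.dumps({
    "auths": {
        server: {
            "username": username,
            "password": password,
            "email": email,
            "auth": auth,
        }
    }
})
print(dockerconfigjson)
```

The `imagePullSecrets` entry in the Deployment later in this page refers to this Secret by name.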
  2. Deploy in Kubernetes:

    To run the service in a Kubernetes cluster, ensure that the Kubernetes CLI (kubectl) is installed on your local machine and that you are logged in to the cluster. Also ensure that the container images referenced in the deployment are accessible from your cluster; for the IBM Entitled Registry images used here, the imagePullSecrets entry in the manifest refers to the ibm-entitlement-key Secret created in step 1.

    Below is an example YAML file you can use to deploy on your cluster:

     apiVersion: apps/v1 
     kind: Deployment 
     metadata: 
       name: watson-nlp-container 
     spec: 
       selector: 
         matchLabels: 
           app: watson-nlp-container 
       replicas: 1 
       template: 
         metadata: 
           labels: 
             app: watson-nlp-container 
         spec: 
           initContainers:
           - name: english-syntax-model
             image: cp.icr.io/cp/ai/watson-nlp_syntax_izumo_lang_en_stock:1.0.9
             volumeMounts:
             - name: model-directory
               mountPath: "/app/models"
             env:
             - name: ACCEPT_LICENSE
               value: 'true'
           - name: english-tone-model
             image: cp.icr.io/cp/ai/watson-nlp_classification_ensemble-workflow_lang_en_tone-stock:1.0.9
             volumeMounts:
             - name: model-directory
               mountPath: "/app/models"
             env:
             - name: ACCEPT_LICENSE
               value: 'true'
           containers: 
           - name: watson-nlp-container 
             image: cp.icr.io/cp/ai/watson-nlp-runtime:1.1.0
             env:
             - name: ACCEPT_LICENSE
               value: "true"
             - name: LOCAL_MODELS_DIR
               value: "/app/models"
             resources: 
               requests: 
                 memory: "4Gi" 
                 cpu: "1000m" 
               limits: 
                 memory: "8Gi" 
                 cpu: "2000m"
             ports: 
             - containerPort: 8085 
             volumeMounts:
             - name: model-directory
               mountPath: "/app/models"
           imagePullSecrets:
           - name: ibm-entitlement-key
           volumes:
           - name: model-directory
             emptyDir: {}
     --- 
     apiVersion: v1 
     kind: Service 
     metadata: 
       name: watson-nlp-container 
     spec: 
       type: ClusterIP 
       selector: 
         app: watson-nlp-container 
       ports: 
       - port: 8085 
         protocol: TCP 
         targetPort: 8085
    
  3. Run on Kubernetes:

    Save the YAML above as Runtime/deployment/deployment.yaml (or another path of your choice), then apply it:

     kubectl apply -f Runtime/deployment/deployment.yaml
    

    Check that the pod and service are running.

     kubectl get pods
    
     kubectl get svc
    
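
    Beyond eyeballing the table from kubectl get pods, you can inspect the JSON form of the pod list (kubectl get pods -o json) programmatically. The Python sketch below runs against an abbreviated sample of that output for illustration; in practice you would feed it the real command's output. It checks that every pod carrying the app label from the example Deployment reports phase Running:

```python
import json

# Abbreviated sample shaped like `kubectl get pods -o json` output;
# the pod name suffix here is made up for illustration.
sample = json.loads("""
{
  "items": [
    {
      "metadata": {"name": "watson-nlp-container-5d7f9-abcde",
                   "labels": {"app": "watson-nlp-container"}},
      "status": {"phase": "Running"}
    }
  ]
}
""")

def all_running(pod_list, app_label):
    """True if at least one pod matches the label and all matches are Running."""
    pods = [p for p in pod_list["items"]
            if p["metadata"].get("labels", {}).get("app") == app_label]
    return bool(pods) and all(p["status"]["phase"] == "Running" for p in pods)

print(all_running(sample, "watson-nlp-container"))  # prints True
```

A check like this is handy in CI or smoke-test scripts, where a human is not watching the kubectl output.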

To see a tutorial that takes you through the steps to build a standalone container image to serve Watson NLP models and run it on a Kubernetes or OpenShift cluster, check out Serve Models on Kubernetes or OpenShift using Standalone Containers on GitHub.

Once you have your runtime server working, see Accessing client libraries and tools to continue.