Run with Docker

One of the most common uses for watson_runtime is to encode a static set of models into the server so that they are loaded at boot time and served from a standalone gRPC server. The models can be either pretrained models listed in the models catalog that are already available in Watson NLP, or custom models that have been trained for a specific use case.

The runtime exposes both a gRPC server and a REST server. The REST server is created using the gRPC Gateway wrapper, which adds a REST gateway layer on top of the gRPC server with minimal customization, along with Swagger definitions conforming to OpenAPI 2.0.

By default, the image runs with gRPC on port 8085 and the REST gateway on port 8080, but the ports must be explicitly published. To publish a port, add -p [port]:[port] to the docker run command, where the port is 8080 for REST and 8085 for gRPC.
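
For example, the port flags can be composed like this. A minimal sketch: the command is echoed for review rather than executed, and the registry path and image tag are the ones used in the script below.

```shell
#!/usr/bin/env bash
# Sketch: compose the port-publishing flags before running the image.
GRPC_PORT=8085   # gRPC server
REST_PORT=8080   # REST gateway
PORT_ARGS="-p ${GRPC_PORT}:8085 -p ${REST_PORT}:8080"

# Print the command for review instead of executing it
echo docker run --rm -it ${PORT_ARGS} -e ACCEPT_LICENSE=true \
  cp.icr.io/cp/ai/watson-nlp-runtime:1.1.36
```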

These steps highlight how to run the Watson NLP Runtime container locally with Docker, using a volume populated with pretrained models.

  1. Log in to the IBM Entitled Registry

    Container images for the Watson NLP Runtime and pretrained models are stored in the IBM Entitled Registry. Once you've obtained an entitlement key from the container software library, you can log in to the registry with the key and pull the images to your local machine.

    docker login cp.icr.io --username cp --password <entitlement_key>
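
    After logging in, you can pull the images ahead of time. A sketch, assuming the registry path and image tags used by the script in step 2; the loop skips gracefully on machines without Docker installed.

    ```shell
    #!/usr/bin/env bash
    # Sketch: pull the runtime and pretrained model images used in step 2.
    REGISTRY="cp.icr.io/cp/ai"
    IMAGES="watson-nlp-runtime:1.1.36 watson-nlp_syntax_izumo_lang_en_stock:1.4.1 watson-nlp_syntax_izumo_lang_fr_stock:1.4.1"
    for image in ${IMAGES}; do
      if command -v docker >/dev/null 2>&1; then
        # Requires a successful docker login to the Entitled Registry
        docker pull "${REGISTRY}/${image}" || true
      else
        echo "docker not found; would pull ${REGISTRY}/${image}"
      fi
    done
    ```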
    
  2. Create and run the startup script. Copy the shell script below into a file named start_runtime_server.sh, make it executable, and run it.

    Note: This script creates a Docker volume.

    #!/usr/bin/env bash
    IMAGE_REGISTRY=${IMAGE_REGISTRY:-"cp.icr.io/cp/ai"}
    RUNTIME_IMAGE=${RUNTIME_IMAGE:-"watson-nlp-runtime:1.1.36"}
    export MODELS="${MODELS:-"watson-nlp_syntax_izumo_lang_en_stock:1.4.1,watson-nlp_syntax_izumo_lang_fr_stock:1.4.1"}"
    IFS=',' read -ra models_arr <<< "${MODELS}"
    TLS_CERT=${TLS_CERT:-""}
    TLS_KEY=${TLS_KEY:-""}
    CA_CERT=${CA_CERT:-""}
    
    function real_path {
      echo "$(cd "$(dirname "${1}")" && pwd)/$(basename "${1}")"
    }
    
    # Clear out existing volume
    docker volume rm model_data 2>/dev/null || true
    
    # Create a shared volume and initialize with open permissions
    docker volume create model_data
    docker run --rm -it -v model_data:/model_data alpine chmod 777 /model_data
    
    # Put models into the shared volume
    for model in "${models_arr[@]}"
    do
      docker run --rm -it -v model_data:/app/models -e ACCEPT_LICENSE=true $IMAGE_REGISTRY/$model
    done
    
    # If TLS credentials are set up, run with TLS
    tls_args=""
    if [ "$TLS_CERT" != "" ] && [ "$TLS_KEY" != "" ]
    then
      echo "Running with TLS"
      tls_args="$tls_args -v $(real_path ${TLS_KEY}):/tls/server.key.pem"
      tls_args="$tls_args -e TLS_SERVER_KEY=/tls/server.key.pem"
      tls_args="$tls_args -e SERVE_KEY=/tls/server.key.pem"
      tls_args="$tls_args -v $(real_path ${TLS_CERT}):/tls/server.cert.pem"
      tls_args="$tls_args -e TLS_SERVER_CERT=/tls/server.cert.pem"
      tls_args="$tls_args -e SERVE_CERT=/tls/server.cert.pem"
      tls_args="$tls_args -e PROXY_CERT=/tls/server.cert.pem"
    
      if [ "$CA_CERT" != "" ]
      then
        echo "Enabling mTLS"
        tls_args="$tls_args -v $(real_path ${CA_CERT}):/tls/ca.cert.pem"
        tls_args="$tls_args -e TLS_CLIENT_CERT=/tls/ca.cert.pem"
        tls_args="$tls_args -e MTLS_CLIENT_CA=/tls/ca.cert.pem"
        tls_args="$tls_args -e PROXY_MTLS_KEY=/tls/server.key.pem"
        tls_args="$tls_args -e PROXY_MTLS_CERT=/tls/server.cert.pem"
      fi
    
      echo "TLS args: [$tls_args]"
    fi
    
    # Run the runtime with the models mounted
    docker run "${@}" \
      --rm -it \
      -v model_data:/app/model_data \
      -e ACCEPT_LICENSE=true \
      -e LOCAL_MODELS_DIR=/app/model_data \
      -p 8085:8085 \
      -p 8080:8080 \
      $tls_args $IMAGE_REGISTRY/$RUNTIME_IMAGE
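
    The script reads MODELS as a comma-separated list of model images and loads each one into the shared volume. A quick sketch of the splitting logic, using the script's own default model list:

    ```shell
    #!/usr/bin/env bash
    # Sketch: how the script splits the MODELS variable into individual images.
    MODELS="watson-nlp_syntax_izumo_lang_en_stock:1.4.1,watson-nlp_syntax_izumo_lang_fr_stock:1.4.1"
    IFS=',' read -ra models_arr <<< "${MODELS}"
    echo "model count: ${#models_arr[@]}"   # → model count: 2
    for model in "${models_arr[@]}"; do
      echo "would load: ${model}"
    done
    ```

    To serve a different set of models, or to enable TLS, override the environment variables when invoking the script, for example: MODELS="watson-nlp_syntax_izumo_lang_en_stock:1.4.1" TLS_CERT=server.cert.pem TLS_KEY=server.key.pem ./start_runtime_server.sh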
    
  3. Make a request to the running container:

    curl -s \
      "http://localhost:8080/v1/watson.runtime.nlp.v1/NlpService/SyntaxPredict" \
      -H "accept: application/json" \
      -H "content-type: application/json" \
      -H "grpc-metadata-mm-model-id: syntax_izumo_lang_en_stock" \
      -d '{ "raw_document": { "text": "This is a test sentence" }, "parsers": ["token"] }'
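
    The default MODELS list also loads a French syntax model, so the same endpoint can be queried with a different model ID. A sketch, assuming the server from step 2 is listening on localhost:8080:

    ```shell
    #!/usr/bin/env bash
    # Sketch: query the French syntax model loaded by the default MODELS list.
    MODEL_ID="syntax_izumo_lang_fr_stock"
    BODY='{ "raw_document": { "text": "Ceci est une phrase de test" }, "parsers": ["token"] }'
    curl -s \
      "http://localhost:8080/v1/watson.runtime.nlp.v1/NlpService/SyntaxPredict" \
      -H "accept: application/json" \
      -H "content-type: application/json" \
      -H "grpc-metadata-mm-model-id: ${MODEL_ID}" \
      -d "${BODY}" \
      || echo "server not reachable; start it with the script in step 2"
    ```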
    

Next steps

Once you have your runtime server working, see Accessing client libraries and tools to continue.