Managing resources for document text processing models

To use the text processing APIs to classify and extract textual information from your documents in IBM watsonx.ai, you must deploy a set of document understanding models in your cluster. You can then customize the resources that the text processing pipeline uses to classify and extract text from your documents.

Before you begin

  • You must be an instance administrator.
  • The document text processing models must be installed in your cluster. For details, see Installing models with the default configuration.
    Restriction: You cannot add text processing foundation models to a watsonx.ai™ lightweight engine installation.

Procedure

You can change the default cluster configuration for text processing pods to tune settings such as memory usage and garbage collection frequency, and to scale the capacity of the text processing pipeline.

  1. Use one of the following methods to customize the resources used by the document text processing pods in your deployment:
    Configuring resources for text processing pods for each deployment and container
    Optimize the memory used by the document understanding library by setting a custom environment variable in the custom resource for each deployment and for the containers within that deployment.

    The following example sets the predefined MEMORY_MINIMAL environment variable to true for containers (such as wdu_runtime) in multiple deployments (such as wdu_api_deploy_distributed):

    oc patch watsonxaiifm watsonxaiifm-cr \
    --namespace=${PROJECT_CPD_INST_OPERANDS} \
    --type=merge \
    -p '{"spec": {"model_install_parameters": {"wdu": {"wdu_api_deploy_distributed": {"wdu_runtime": {"env": [{"name": "MEMORY_MINIMAL", "value": "true" }]}}, "wdu_page_deploy_distributed": {"wdu_runtime": {"env": [{"name": "MEMORY_MINIMAL", "value": "true" }]}, "wdu_model_copy": {"env": [{"name": "MEMORY_MINIMAL", "value": "true" }]}}, "wdu_result_deploy_distributed": {"wdu_runtime": {"env": [{"name": "MEMORY_MINIMAL", "value": "true" }]}}, "wdu_watch_deploy_distributed": {"wdu_runtime": {"env": [{"name": "MEMORY_MINIMAL", "value": "true" }]}}}}}}'

    Configuring resources for all text processing pods globally
    Use the container_defaults parameter to apply the same resource settings to all text processing containers as follows:

    oc patch watsonxaiifm watsonxaiifm-cr \
    --namespace=${PROJECT_CPD_INST_OPERANDS} \
    --type=merge \
    -p '{"spec": {"model_install_parameters": {"wdu": {"container_defaults": {"env": [{"name": "MEMORY_MINIMAL", "value": "true"}, {"name": "MEMORY_GC_FREQUENCY", "value": "5"}]}}}}}'

    Configuring resources for text processing pods with global default values and custom overrides for specific deployments
    Combine global settings in the container_defaults parameter, which apply to all containers, with custom settings for a specific deployment (such as wdu_api_deploy_distributed) that override the global defaults, as follows:

    oc patch watsonxaiifm watsonxaiifm-cr \
    --namespace=${PROJECT_CPD_INST_OPERANDS} \
    --type=merge \
    -p '{"spec": {"model_install_parameters": {"wdu": {"container_defaults": {"env": [{"name": "MEMORY_MINIMAL", "value": "true"}, {"name": "LOG_LEVEL", "value": "INFO"}]}, "wdu_api_deploy_distributed": {"wdu_runtime": {"env": [{"name": "MEMORY_MINIMAL", "value": "false"}]}}}}}}'
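
The one-line JSON payloads in the preceding commands are hard to read and easy to mistype. As a minimal sketch, the payload from the combined defaults-and-override example can be kept in a multi-line shell variable and checked for well-formedness before it is applied. The PATCH variable name and the use of python3 for validation are assumptions of this sketch and are not part of the product.

```shell
# PATCH is a hypothetical variable name; it holds the same JSON as the
# one-line defaults-and-override example.
PATCH=$(cat <<'EOF'
{
  "spec": {
    "model_install_parameters": {
      "wdu": {
        "container_defaults": {
          "env": [
            {"name": "MEMORY_MINIMAL", "value": "true"},
            {"name": "LOG_LEVEL", "value": "INFO"}
          ]
        },
        "wdu_api_deploy_distributed": {
          "wdu_runtime": {
            "env": [
              {"name": "MEMORY_MINIMAL", "value": "false"}
            ]
          }
        }
      }
    }
  }
}
EOF
)

# Validate the payload locally before it is sent to the cluster
# (assumes python3 is installed on the workstation).
if printf '%s' "$PATCH" | python3 -m json.tool > /dev/null; then
  echo "patch is well-formed JSON"
else
  echo "patch is not valid JSON" >&2
fi
```

The validated payload can then be passed to the same oc patch command by using -p "$PATCH" in place of the inline string.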
  2. Confirm that the operator reconciles successfully and does not report errors. You can then check the configuration of a text processing pod as follows:
    oc describe pod <wdu-api-deploy-distributed-pod-name> -n ${PROJECT_CPD_INST_OPERANDS} | grep -A 50 "wdu-runtime:"
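
As an alternative to scanning the oc describe output, the environment variables of a single container can be printed directly with a JSONPath query. The following is a sketch only: show_container_env is a hypothetical helper name, and the default container name wdu-runtime is taken from the grep pattern in the preceding command. Variables that are populated from a secret or config map reference are listed with an empty value.

```shell
# show_container_env is a hypothetical helper name used only in this sketch.
# Usage: show_container_env <pod-name> [container-name]
show_container_env() {
  pod="$1"
  container="${2:-wdu-runtime}"
  # Print one NAME=value line for each environment variable that is
  # defined on the selected container of the pod.
  oc get pod "$pod" \
    --namespace="${PROJECT_CPD_INST_OPERANDS}" \
    -o jsonpath="{range .spec.containers[?(@.name==\"${container}\")].env[*]}{.name}={.value}{\"\\n\"}{end}"
}
```

For example, calling show_container_env with the name of a wdu-api-deploy-distributed pod lists the variables on its wdu-runtime container, where the values set in the custom resource are expected to appear after the operator reconciles.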
  3. Optional: If you enable debug logging, you can confirm that the container_defaults settings were applied by searching the operator logs:
    oc logs -n ${PROJECT_CPD_INST_OPERANDS} -l app.kubernetes.io/name=ibm-cpd-watsonx-ai-ifm-operator --tail=200 | grep -i "container_defaults\|env"
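
As a complementary check, the settings that are stored in the custom resource can be read back and compared with the payload that was patched. This sketch assumes that the values remain under the same spec.model_install_parameters.wdu path that the patch commands write to. The show_wdu_settings name is a hypothetical helper for illustration.

```shell
# show_wdu_settings is a hypothetical helper name used only in this sketch.
# It prints the wdu section of the custom resource that the patch commands modify.
show_wdu_settings() {
  oc get watsonxaiifm watsonxaiifm-cr \
    --namespace="${PROJECT_CPD_INST_OPERANDS}" \
    -o jsonpath='{.spec.model_install_parameters.wdu}'
  # JSONPath output has no trailing newline, so add one.
  echo
}
```

If the output does not contain the expected container_defaults or per-deployment entries, the patch was likely not applied to the custom resource, which is worth ruling out before you inspect the pods.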