IBM Automation Document Processing parameters

You use configuration parameters to install and configure the document processing containers. Complete the custom resource YAML file for your deployment by supplying the appropriate values for your supporting environment and your configuration. The operator uses the YAML file to deploy and manage your containers.

For values that differ depending on your deployment profile, see the system requirements.

Table 1. Data source Configuration parameters: spec.datasource_configuration
Parameter name Description Default/Example Values Required
dc_ca_datasource.dc_database_type The database type for the Document Processing engine. The possible values are db2, db2HADR, and postgresql. Yes
dc_ca_datasource.database_servername The name of the Db2® server that hosts databases for Document Processing.

If the Db2 name is not resolvable by DNS, provide the IP address for the database_ip parameter when the dc_database_type parameter is set to db2HADR.

MyDBServName Yes
dc_ca_datasource.database_name The BASE database name. For example, BASECA. Yes
dc_ca_datasource.tenant_databases List of one or more names of project (tenant) databases. For example:
- t01db
Yes
dc_ca_datasource.database_port The port to the database server. For Db2, the default is 50000. Yes
dc_ca_datasource.dc_database_ssl_enabled Enable SSL or TLS for database communication. The default is true. No
dc_ca_datasource.dc_database_ssl_mode Optional. Corresponds to the sslmode argument for Postgres client.

Only applicable if dc_database_type is set to postgresql and if dc_database_ssl_enabled is set to true.

The possible values are:

require

verify-ca

verify-full

No
dc_ca_datasource.dc_hadr_standby_servername If you are using Db2 HADR, the Db2 standby server hostname.

If your standby database server name cannot be resolved by DNS, provide the corresponding IP address for the dc_hadr_standby_ip parameter.

MyStandbyServerName Yes if the dc_ca_datasource.dc_database_type parameter is set to db2HADR.
dc_ca_datasource.dc_hadr_standby_port If you are using Db2 HADR, the port to the Db2 standby server. For Db2, the default is 50000.

For Db2 HADR, an example is MyHADRStandbyPort.

Yes if the dc_ca_datasource.dc_database_type parameter is set to db2HADR.
dc_ca_datasource.dc_hadr_validation_timeout For Db2 HADR. The validation timeout in seconds. 15 No
dc_ca_datasource.dc_hadr_retry_interval_for_client_reroute For Db2 HADR. The retry interval in seconds for the client reroute. 2 No
dc_ca_datasource.dc_hadr_max_retries_for_client_reroute For Db2 HADR. The maximum number of retries for the client reroute. 30 No
dc_ca_datasource.database_ip For Db2 HADR. The IP address of the primary Db2 server if the database_servername value for the server hostname cannot be resolved through DNS. MyDbIPaddr No
dc_ca_datasource.dc_hadr_standby_ip For Db2 HADR. The IP address of the standby Db2 server if the dc_hadr_standby_servername value for the standby server hostname cannot be resolved through DNS. MyStandbyDbIPaddr No
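The parameters in Table 1 combine into the spec.datasource_configuration section of the custom resource. The following is a minimal sketch for a single Db2 server; the server name, database names, and port are placeholder assumptions, and the nesting follows the parameter paths in the table:

```yaml
spec:
  datasource_configuration:
    dc_ca_datasource:
      dc_database_type: "db2"              # db2, db2HADR, or postgresql
      database_servername: "MyDBServName"  # placeholder hostname
      database_name: "BASECA"              # the BASE database
      tenant_databases:                    # one or more project (tenant) databases
        - "t01db"
      database_port: "50000"               # Db2 default port
      dc_database_ssl_enabled: true        # default is true
```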
Table 2. Document Processing engine parameters: spec.ca_configuration
Parameter Description Default/Example Values Required
global.custom_labels Enable custom labeling for Document Processing engine objects. For example:
ca_configuration:
 global:
  custom_labels:
   myLabel: "My test label"
No
global.custom_annotations Enable custom annotations for Document Processing engine objects. For example:
ca_configuration:
 global:
  custom_annotations:
   myAnnotation: "My test annotation"
No
global.node_affinity Enable custom node affinity for Document Processing engine containers. These custom expressions are part of the requiredDuringSchedulingIgnoredDuringExecution. For more information, see Node affinity Kubernetes documentation. For example:
ca_configuration:
 global:
  node_affinity:
   custom_node_selector_match_expression:
    - key: kubernetes.io/os
       operator: In
       values:
       - linux
No
global.runtime_feedback.enabled Enable the feedback feature to fine-tune classification and extraction models by using runtime data with minimal Designer interaction. This feature is supported only with projects deployed in the authoring environment (for example, sandbox). For more information about deploying projects in an authoring environment, see Deploying your Document Processing project in an authoring environment. true No
global.runtime_feedback.runtime_type The runtime environment type that the feedback is initiated from. The only supported value is sandbox. sandbox No
global.deployment_profile_size Optional setting for the deployment profile size for Document Processing engine.
  • If this parameter is set, its value overrides the equivalent setting of the shared_configuration.sc_deployment_profile_size.
  • If this parameter is not set, the value of the shared_configuration.sc_deployment_profile_size is used. If this value is not set, the default value is small.
Note: You can try out the Document Processing features without using too many resources with the entry profile. This profile is applicable only for the document_processing pattern. Set it only under ca_configuration, not under shared_configuration.sc_deployment_profile_size.

If set, the number of replicas is 2 for RabbitMQ and OCR Extraction, and 1 for all other containers.

Allowed values: entry, small, medium, large

Default value: small

No
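The profile override described above can be sketched as follows; as the note states, entry is valid only under ca_configuration (both values shown are illustrative):

```yaml
spec:
  shared_configuration:
    sc_deployment_profile_size: "small"  # do not set "entry" here
  ca_configuration:
    global:
      deployment_profile_size: "entry"   # overrides the shared setting for Document Processing only
```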
global.route_ingress_annotations Additional route and Ingress annotations to apply to Document Processing routes and Ingress resources. For more information about the annotations, see the Route annotations table. By default, Document Processing sets haproxy.router.openshift.io/timeout to 120s and haproxy.router.openshift.io/disable_cookies to true. You can override the timeout by setting a new value in seconds. 120s No
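A sketch of overriding the default route timeout; the exact nesting of route_ingress_annotations can vary by operator version, so treat this shape as an assumption to verify against your operator's sample custom resources:

```yaml
ca_configuration:
  global:
    route_ingress_annotations:
      haproxy.router.openshift.io/timeout: "300s"  # overrides the 120s default
```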
global.metrics Enable metrics for the container. true No
global.arch The architecture that the product is built and compiled for. amd64 No
global.db_secret The Kubernetes secret for the Db2 credentials. aca-basedb No
global.seccomp_profile.type Optional setting for seccomp (Secure Computing Mode) profile for Document Processing engine containers. You can also define the seccomp profile globally at shared_configuration.sc_seccomp_profile. The default seccomp profile is RuntimeDefault.

For more information, see the Kubernetes and Red Hat OpenShift documentation about seccomp.

The default is RuntimeDefault if not defined. No
global.seccomp_profile.localhostProfile The name of the custom seccomp profile when the type Localhost is used.
Attention: If you define a custom, localhost seccomp profile that is stricter than the default RuntimeDefault profile, some of the Document Processing engine containers might fail to start.
N/A No
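A sketch of selecting a custom localhost seccomp profile; the profile path is a placeholder, and as the attention note warns, a profile stricter than RuntimeDefault can prevent some containers from starting:

```yaml
ca_configuration:
  global:
    seccomp_profile:
      type: "Localhost"
      localhostProfile: "profiles/my-aca-seccomp.json"  # placeholder path on the node
```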
global.service_account The custom service account name for the Document Processing engine. If global.service_account is configured, then you must create the appropriate Role and RoleBinding objects as well.

For more information, see the Kubernetes documentation to configure service accounts for pods and using RBAC authorization.

The default is <meta.name>-aca-service-account if the parameter is not set.
If the parameter is set, the following example rules represent a set of permissions on the Role. Use these rules in your Role YAML file to create the Role.
rules:
- apiGroups: [""]
  resources: ["pods"]
  verbs: ["get", "watch", "list"]
- apiGroups: ["batch"]
  resources: ["jobs"]
  verbs: ["get", "watch", "list"]
- apiGroups: [""]
  resources: ["secrets", "endpoints"]
  verbs: ["get", "update", "create", "patch", "delete"]
- apiGroups: [""]
  resources: ["configmaps"]
  verbs: ["get", "watch", "list", "create", "update", "patch", "delete"]
No
global.auto_scaling.enabled Enable Horizontal Pod Autoscaler (HPA) for Document Processing engine at the global level. HPA applies to all deployments except RabbitMQ. These parameters are applied if the auto_scaling section under individual components is not defined. false No
global.auto_scaling.target_cpu_utilization_percentage Define the target CPU usage for all Document Processing engine components to start scaling out new Document Processing engine pods. To reduce flapping (frequent scaling up and down), the target CPU usage must hold for 5 minutes before the new pod scales out. The scale-down policy uses the same CPU metric that is defined with this parameter. 90 No
global.auto_scaling.min_replicas Define the minimum number of replicas to use with HPA. Requires a number if the auto_scaling.enabled parameter is set to true. For more information, see Tuning the Horizontal Pod Autoscaler (HPA). Yes if the auto_scaling.enabled parameter is set to true.
global.auto_scaling.max_replicas Define the maximum number of replicas to use with HPA. Requires a number if the auto_scaling.enabled parameter is set to true. For more information, see Tuning the Horizontal Pod Autoscaler (HPA). Yes if the auto_scaling.enabled parameter is set to true.
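Taken together, a global HPA configuration might look like the following sketch; min_replicas and max_replicas become required once enabled is true, and the replica counts shown are illustrative:

```yaml
ca_configuration:
  global:
    auto_scaling:
      enabled: true
      target_cpu_utilization_percentage: 90  # default target; must hold 5 minutes before scale-out
      min_replicas: 2                        # required when enabled is true
      max_replicas: 5                        # required when enabled is true
```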
global.image.repository The repository name for Docker images. cp.icr.io/cp/cp4a/iadp No
global.image.tag The container release build. An example is 21.0.2. No
global.image.pull_policy The Docker image pull policy. Better results are observed if you leave the default value as IfNotPresent. No
global.retries The maximum number of retries for the deployment of the container until all pods are in Ready status. The delay between two retries is 20 seconds. 90 No
global.logs.log_rotate_count The number of log files to keep before they are rolled over. The default is 20. No
global.logs.log_file_max_size The maximum size of a log file. Use k for KB, m for MB, and g for GB. 50m No
global.logs.claimname The PVC name for storing log files. If you set this parameter to none or NONE, the operator does not create any PVC to store logs. Example values: none, sp-log-pvc

The default is none.

No
global.logs.log_level The log level for Document Processing engine components. It is better to set it to error in a production environment to increase the processing throughput. info

Other valid values are error and debug.

No
global.logs.size The size of the log persistent volume claim (PVC). This parameter applies only if a value is specified for the global.logs.claimname parameter. 5Gi No
global.data.claimname The PVC name for storing data files. If you set this parameter to none or NONE, the operator does not create any PVC to store data. Example values: none, sp-data-pvc

The default is none.

No
global.data.size The size of the data persistent volume claim (PVC). This parameter applies only if a value is specified for the global.data.claimname parameter. 5Gi No
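A sketch combining the log and data storage parameters above; the PVC names are placeholders, and setting claimname to none (the default) skips PVC creation:

```yaml
ca_configuration:
  global:
    logs:
      claimname: "sp-log-pvc"  # "none" skips PVC creation
      size: "5Gi"              # applies only when claimname is set
      log_level: "error"       # recommended for production throughput
      log_rotate_count: 20
      log_file_max_size: "50m"
    data:
      claimname: "sp-data-pvc"
      size: "5Gi"
```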
global.rabbitmq.resources.limits.memory Specify a RabbitMQ memory limit for the container. 1024Mi No
global.rabbitmq.resources.limits.cpu Specify a CPU limit for the container. 1 No
global.rabbitmq.replica_count How many RabbitMQ replicas to deploy initially. 3 No
jobs.hourly_cleanup.schedule Set the schedule for the job. It must be in a Unix cron format, for example:
0/15 * * * *
The default is "0 * * * *", which means the job runs every hour. No
jobs.daily_cleanup.runtime_doc_retention Indicate how long the document is kept before it is deleted. The value can be expressed in the following ways:
  • Number in seconds. For example: 3600
  • Duration-like strings in minutes, hours, and days. For example: 30m, 1hr, 2days, 1d
The default is 1d (one day). No
jobs.daily_cleanup.schedule Set the schedule for the job. It must be in a Unix cron format, for example:
0/15 * * * *
The default is "30 1 * * *", which means the job runs daily at 1:30 AM. No
spbackend.port If you do not set this parameter, a random port is generated for the container's backend service. Pick a unique port when you set this value. Valid values range from 30000 through 32767. No
spbackend.replica_count How many spbackend replicas to deploy initially. 2 No
spbackend.auto_scaling.enabled Enable HPA for the spbackend component. If you define autoscaling parameters for this component, they take precedence over the global.auto_scaling parameter values. false No
spbackend.auto_scaling.target_cpu_utilization_percentage The target CPU usage for spbackend to start scaling out new pods. To reduce flapping (frequent scaling up and down), the target CPU usage must hold for 5 minutes before the new pod scales out. The scale-down policy uses the same CPU metric that is defined with this parameter. 90 No
spbackend.auto_scaling.min_replicas The minimum number of replicas for spbackend to use with HPA. Requires a number if the auto_scaling.enabled parameter is set to true. For more information, see Tuning the Horizontal Pod Autoscaler (HPA). Yes if the auto_scaling.enabled parameter is set to true.
spbackend.auto_scaling.max_replicas The maximum number of replicas for spbackend to use with HPA. Requires a number if the auto_scaling.enabled parameter is set to true. For more information, see Tuning the Horizontal Pod Autoscaler (HPA). Yes if the auto_scaling.enabled parameter is set to true.
spbackend.resources.limits.memory Specify a memory limit for the container. 1Gi No
spbackend.resources.limits.cpu Specify a CPU limit for the container. 0.6 No
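Component-level settings follow the same shape for every component. This sketch for spbackend shows an auto_scaling block that, when present, takes precedence over global.auto_scaling; the CPU target and replica bounds are illustrative:

```yaml
ca_configuration:
  spbackend:
    replica_count: 2
    auto_scaling:      # overrides global.auto_scaling for spbackend only
      enabled: true
      target_cpu_utilization_percentage: 85  # illustrative value
      min_replicas: 2
      max_replicas: 6
    resources:
      limits:
        cpu: 0.6
        memory: "1Gi"
```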
postprocessing.process_timeout Specify a process timeout in seconds for the container. 1500 No
postprocessing.replica_count How many postprocessing replicas to deploy initially. 2 No
postprocessing.auto_scaling.enabled Enable HPA for the postprocessing component. If you define autoscaling parameters for this component, they take precedence over the global.auto_scaling parameter values. false No
postprocessing.auto_scaling.target_cpu_utilization_percentage The target CPU usage for postprocessing to start scaling out new pods. To reduce flapping (frequent scaling up and down), the target CPU usage must hold for 5 minutes before the new pod scales out. The scale-down policy uses the same CPU metric that is defined with this parameter. 90 No
postprocessing.auto_scaling.min_replicas The minimum number of replicas for postprocessing to use with HPA. Requires a number if the auto_scaling.enabled parameter is set to true. For more information, see Tuning the Horizontal Pod Autoscaler (HPA). Yes if the auto_scaling.enabled parameter is set to true.
postprocessing.auto_scaling.max_replicas The maximum number of replicas for postprocessing to use with HPA. Requires a number if the auto_scaling.enabled parameter is set to true. For more information, see Tuning the Horizontal Pod Autoscaler (HPA). Yes if the auto_scaling.enabled parameter is set to true.
postprocessing.max_unavailable_count Maximum unavailable count for postprocessing. 1 No
postprocessing.resources.limits.memory Specify a postprocessing memory limit for the container. 480Mi No
postprocessing.resources.limits.cpu Specify a CPU limit for the container. 1 No
setup.process_timeout Specify a process timeout in seconds for the container. 600 No
setup.replica_count How many setup replicas to deploy initially. 2 No
setup.auto_scaling.enabled Enable HPA for the setup component. If you define autoscaling parameters for this component, they take precedence over the global.auto_scaling parameter values. false No
setup.auto_scaling.target_cpu_utilization_percentage The target CPU usage for setup to start scaling out new pods. To reduce flapping (frequent scaling up and down), the target CPU usage must hold for 5 minutes before the new pod scales out. The scale-down policy uses the same CPU metric that is defined with this parameter. 90 No
setup.auto_scaling.min_replicas The minimum number of replicas for setup to use with HPA. Requires a number if the auto_scaling.enabled parameter is set to true. For more information, see Tuning the Horizontal Pod Autoscaler (HPA). Yes if the auto_scaling.enabled parameter is set to true.
setup.auto_scaling.max_replicas The maximum number of replicas for setup to use with HPA. Requires a number if the auto_scaling.enabled parameter is set to true. For more information, see Tuning the Horizontal Pod Autoscaler (HPA). Yes if the auto_scaling.enabled parameter is set to true.
setup.max_unavailable_count Specify a maximum unavailable count for the container. 1 No
setup.resources.limits.memory Specify a setup memory limit for the container. 1Gi No
setup.resources.limits.cpu Specify a CPU limit for the container. 0.6 No
ocrextraction.deep_learning_object_detection.enabled Enhance object detection by using deep learning. When this parameter is set to true, deep-learning pods are deployed. The deep learning container is used only in an authoring environment.

The ca_configuration.deeplearning parameters are used only if this deep_learning_object_detection parameter is set to true.

true for a production environment.

false for a starter deployment.

No
ocrextraction.use_iocr This feature is a technology preview in 23.0.1. Starts the Watson Document Understanding (WDU) associated pods to process documents. Possible values are:
  • auto: The algorithm determines automatically if the documents meet the low-quality criteria and sends these documents to the WDU Runtime service for further processing.
  • all: All documents are processed with the WDU Runtime service.
  • none: The WDU Runtime service is not used to process documents.
none No
ocrextraction.process_timeout Specify a process timeout in seconds for the container. 600 No
ocrextraction.replica_count How many OCR extraction replicas to deploy initially. 4 No
ocrextraction.auto_scaling.enabled Enable HPA for the ocrextraction component. If you define autoscaling parameters for this component, they take precedence over the global.auto_scaling parameter values. false No
ocrextraction.auto_scaling.target_cpu_utilization_percentage The target CPU usage for ocrextraction to start scaling out new pods. To reduce flapping (frequent scaling up and down), the target CPU usage must hold for 5 minutes before the new pod scales out. The scale-down policy uses the same CPU metric that is defined with this parameter. 90 No
ocrextraction.auto_scaling.min_replicas The minimum number of replicas for ocrextraction to use with HPA. Requires a number if the auto_scaling.enabled parameter is set to true. For more information, see Tuning the Horizontal Pod Autoscaler (HPA). Yes if the auto_scaling.enabled parameter is set to true.
ocrextraction.auto_scaling.max_replicas The maximum number of replicas for ocrextraction to use with HPA. Requires a number if the auto_scaling.enabled parameter is set to true. For more information, see Tuning the Horizontal Pod Autoscaler (HPA). Yes if the auto_scaling.enabled parameter is set to true.
ocrextraction.max_unavailable_count Specify a maximum unavailable count for the container. 1 No
ocrextraction.resources.limits.memory Specify an OCR extraction memory limit for the container. 4096Mi No
ocrextraction.resources.limits.cpu Specify a CPU limit for the container. 1 No
classifyprocess.process_timeout Specify a process timeout in seconds for the container. 600 No
classifyprocess.replica_count How many classifyprocess replicas to deploy initially. 2 No
classifyprocess.auto_scaling.enabled Enable HPA for the classifyprocess component. If you define autoscaling parameters for this component, they take precedence over the global.auto_scaling parameter values. false No
classifyprocess.auto_scaling.target_cpu_utilization_percentage The target CPU usage for classifyprocess to start scaling out new pods. To reduce flapping (frequent scaling up and down), the target CPU usage must hold for 5 minutes before the new pod scales out. The scale-down policy uses the same CPU metric that is defined with this parameter. 90 No
classifyprocess.auto_scaling.min_replicas The minimum number of replicas for classifyprocess to use with HPA. Requires a number if the auto_scaling.enabled parameter is set to true. For more information, see Tuning the Horizontal Pod Autoscaler (HPA). Yes if the auto_scaling.enabled parameter is set to true.
classifyprocess.auto_scaling.max_replicas The maximum number of replicas for classifyprocess to use with HPA. Requires a number if the auto_scaling.enabled parameter is set to true. For more information, see Tuning the Horizontal Pod Autoscaler (HPA). Yes if the auto_scaling.enabled parameter is set to true.
classifyprocess.max_unavailable_count Specify a maximum unavailable count for the container. 1 No
classifyprocess.resources.limits.memory Specify a classifyprocess memory limit for the container. 960Mi No
classifyprocess.resources.limits.cpu Specify a CPU limit for the container. 1 No
processingextraction.process_timeout Specify a process timeout in seconds for the container. 600 No
processingextraction.replica_count How many processing extraction replicas to deploy initially. 2 No
processingextraction.auto_scaling.enabled Enable HPA for the processingextraction component. If you define autoscaling parameters for this component, they take precedence over the global.auto_scaling parameter values. false No
processingextraction.auto_scaling.target_cpu_utilization_percentage The target CPU usage for processingextraction to start scaling out new pods. To reduce flapping (frequent scaling up and down), the target CPU usage must hold for 5 minutes before the new pod scales out. The scale-down policy uses the same CPU metric that is defined with this parameter. 90 No
processingextraction.auto_scaling.min_replicas The minimum number of replicas for processingextraction to use with HPA. Requires a number if the auto_scaling.enabled parameter is set to true. For more information, see Tuning the Horizontal Pod Autoscaler (HPA). Yes if the auto_scaling.enabled parameter is set to true.
processingextraction.auto_scaling.max_replicas The maximum number of replicas for processingextraction to use with HPA. Requires a number if the auto_scaling.enabled parameter is set to true. For more information, see Tuning the Horizontal Pod Autoscaler (HPA). Yes if the auto_scaling.enabled parameter is set to true.
processingextraction.max_unavailable_count Specify a maximum unavailable count for the container. 1 No
processingextraction.resources.limits.memory Specify a processing extraction memory limit for the container. 2048Mi No
processingextraction.resources.limits.cpu Specify a CPU limit for the container. 1 No
naturallanguageextractor.process_timeout Specify a process timeout in seconds for the container. 300 No
naturallanguageextractor.replica_count How many naturallanguageextractor replicas to deploy initially. 2 No
naturallanguageextractor.auto_scaling.enabled Enable HPA for the naturallanguageextractor component. If you define autoscaling parameters for this component, they take precedence over the global.auto_scaling parameter values. false No
naturallanguageextractor.auto_scaling.target_cpu_utilization_percentage The target CPU usage for naturallanguageextractor to start scaling out new pods. To reduce flapping (frequent scaling up and down), the target CPU usage must hold for 5 minutes before the new pod scales out. The scale-down policy uses the same CPU metric that is defined with this parameter. 90 No
naturallanguageextractor.auto_scaling.min_replicas The minimum number of replicas for naturallanguageextractor to use with HPA. Requires a number if the auto_scaling.enabled parameter is set to true. For more information, see Tuning the Horizontal Pod Autoscaler (HPA). Yes if the auto_scaling.enabled parameter is set to true.
naturallanguageextractor.auto_scaling.max_replicas The maximum number of replicas for naturallanguageextractor to use with HPA. Requires a number if the auto_scaling.enabled parameter is set to true. For more information, see Tuning the Horizontal Pod Autoscaler (HPA). Yes if the auto_scaling.enabled parameter is set to true.
naturallanguageextractor.max_unavailable_count Specify a maximum unavailable count for the container. 1 No
naturallanguageextractor.resources.limits.memory Specify a naturallanguageextractor memory limit for the container. 1440Mi No
naturallanguageextractor.resources.limits.cpu Specify a CPU limit for the container. 1 No
naturallanguageextractor.model_cache_size The number of Natural Language Extractor (NLE) models that are loaded into the cache. Loading models into the cache improves NLE document processing performance. The default is 20.

If you increase this number, the naturallanguageextractor pod might run out of memory. Monitor the NLE pod and increase the RAM limit appropriately.

No
deeplearning.process_timeout Specify a process timeout in seconds for the container. 604800 No
deeplearning.gpu_enabled Set to true if you have GPU-enabled worker nodes. true or false No
deeplearning.nodelabel_key The node label key on the GPU node. Set this parameter when the gpu_enabled parameter is set to true. For example, for the node label ibm-cloud.kubernetes.io/gpu-enabled:true, the key is ibm-cloud.kubernetes.io/gpu-enabled. No
deeplearning.nodelabel_value The node label value on the GPU node. Set this parameter when the gpu_enabled parameter is set to true. For example, true. No
deeplearning.replica_count If the gpu_enabled parameter is set to true, you must have at least 2 GPUs to achieve an HA configuration with 2 replicas. 2 No
deeplearning.auto_scaling.enabled Enable HPA for the deeplearning component. If you define autoscaling parameters for this component, they take precedence over the global.auto_scaling parameter values. false No
deeplearning.auto_scaling.target_cpu_utilization_percentage The target CPU usage for deeplearning to start scaling out new pods. To reduce flapping (frequent scaling up and down), the target CPU usage must hold for 5 minutes before the new pod scales out. The scale-down policy uses the same CPU metric that is defined with this parameter. 90 No
deeplearning.auto_scaling.min_replicas The minimum number of replicas for deeplearning to use with HPA. Requires a number if the auto_scaling.enabled parameter is set to true. For more information, see Tuning the Horizontal Pod Autoscaler (HPA). Yes if the auto_scaling.enabled parameter is set to true.
deeplearning.auto_scaling.max_replicas The maximum number of replicas for deeplearning to use with HPA. Requires a number if the auto_scaling.enabled parameter is set to true. For more information, see Tuning the Horizontal Pod Autoscaler (HPA). Yes if the auto_scaling.enabled parameter is set to true.
deeplearning.max_unavailable_count Specify a maximum unavailable count for the container. 1 No
deeplearning.resources.limits.memory Specify a deeplearning memory limit for the container. Increase the RAM limits if you train a large data set. 6144Mi No
deeplearning.resources.limits.cpu Specify a CPU limit for the container. 2 No
deeplearning.resources.limits.gpu Set this parameter to a positive number if you have an NVIDIA GPU-enabled node and the gpu_enabled parameter is set to true. 1 No
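A sketch of a GPU-enabled deep learning configuration; the node label is the example from the table and assumes your GPU worker nodes carry that label:

```yaml
ca_configuration:
  deeplearning:
    gpu_enabled: true
    nodelabel_key: "ibm-cloud.kubernetes.io/gpu-enabled"  # label key on your GPU nodes
    nodelabel_value: "true"
    replica_count: 2   # requires at least 2 GPUs for an HA configuration
    resources:
      limits:
        gpu: 1         # positive number on NVIDIA GPU-enabled nodes
```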
webhook.process_timeout Specify a process timeout in seconds for the container. 604800 No
webhook.replica_count The number of webhook replicas to initially deploy. 2 No
webhook.auto_scaling.enabled Enable HPA for the webhook component. If you define autoscaling parameters for this component, the parameters take precedence over the global.auto_scaling parameter values. false No
webhook.auto_scaling.target_cpu_utilization_percentage The target CPU usage for webhook to start scaling out new pods. To reduce flapping (frequent scaling up and down), the target CPU usage must hold for 5 minutes before the new pod scales out. The scale-down policy uses the same CPU metric that is defined with this parameter. 90 No
webhook.auto_scaling.min_replicas The minimum number of replicas for webhook to use with HPA. A number is required if the auto_scaling.enabled parameter is set to true. For more information, see Tuning the Horizontal Pod Autoscaler (HPA). Yes if the auto_scaling.enabled parameter is set to true.
webhook.auto_scaling.max_replicas The maximum number of replicas for webhook to use with HPA. Requires a number if the auto_scaling.enabled parameter is set to true. For more information, see Tuning the Horizontal Pod Autoscaler (HPA). Yes if the auto_scaling.enabled parameter is set to true.
webhook.max_unavailable_count Specify a maximum unavailable count for the container. 1 No
webhook.resources.limits.memory Specify a webhook memory limit for the container. 500Mi No
webhook.resources.limits.cpu Specify a CPU limit for the container. 0.3 No
Note: Document processing is designed to be flexible so that you can increase performance by adjusting the following parameters.
  1. With ca_configuration.<component name>.replica_count, you can increase the number of replicas of Document Processing engine components to increase throughput if your environment has enough resources. Better results are usually observed with one component per node. Increasing the number of replicas might not improve the response time (for example, the time it takes to process a page end-to-end).
  2. With ca_configuration.<component name>.resources.limits.cpu, you can increase the CPU limit for the components' containers to improve the response time.
Table 3. Mongo database parameters: spec.ecm_configuration.document_processing
Parameters Description Example Values Required
deploy_mongo Deploy Mongo database. Set this parameter to false if you do not want the operator to deploy the Mongo database and you want to use your own existing Mongo database. true No
mongo.arch.amd64 The architecture of the cluster. This value is the default for Linux® on x86. Do not change it. 3 - Most preferred No
mongo.replica_count The number of replicas that are deployed.
  • For MongoDB as deployed by the operator, the only supported value is 1 because of the version of MongoDB that is being deployed (non-HA).
  • If you deploy your own MongoDB, comment out the entire Mongo section under document_processing in the YAML file.
1 No
mongo.image.repository The image repository that corresponds to the image registry, which is where the image is pulled. cp.icr.io/cp/cp4a/iadp/mongo No
mongo.image.tag The image tag that corresponds to the image registry. 4.2.12 No
mongo.image.pull_policy This pull policy overrides the image pull policy that is set in the shared_configuration section. IfNotPresent No
mongo.resources.requests.cpu The initial CPU request. Adjust it to meet your requirements. 500m No
mongo.resources.requests.memory The initial memory request. Adjust it to meet your requirements. 256Mi No
mongo.resources.limits.cpu The initial CPU limits. Adjust it to meet your requirements. 1 No
mongo.resources.limits.memory The initial memory limits. Adjust it to meet your requirements. 1024Mi No
mongo.datavolume.existing_pvc_for_mongo_datastore Persistent Volume Claim (PVC) for Mongo. If the storage_configuration parameter is set in the shared_configuration section, the operator creates the data storage PVC by using the name that is provided. mongo-datastore No
mongo.probe.readiness.period_seconds Modify the period to meet your requirements. 10 No
mongo.probe.readiness.timeout_seconds Modify the timeout to meet your requirements. 10 No
mongo.probe.readiness.failure_threshold Modify the failure threshold to meet your requirements. 6 No
mongo.probe.liveness.period_seconds Modify the period to meet your requirements. 10 No
mongo.probe.liveness.timeout_seconds Modify the timeout to meet your requirements. 5 No
mongo.probe.liveness.failure_threshold Modify the failure threshold to meet your requirements. 6 No
mongo.probe.startup.initial_delay_seconds The behavior of the startup probe to know when the container is started. 120 No
mongo.probe.startup.period_seconds The period in seconds. 10 No
mongo.probe.startup.timeout_seconds The timeout setting in seconds. 10 No
mongo.probe.startup.failure_threshold The threshold number for failures. 6 No
mongo.resources.requests.ephemeral_storage Specify an ephemeral storage request for the container. 750Mi No
mongo.resources.limits.ephemeral_storage Specify an ephemeral storage limit for the container. 750Mi No
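The Mongo parameters above map onto the custom resource YAML roughly as follows. This is an illustrative sketch only, using the documented defaults and assuming the mongo section nests under document_processing as noted at the start of this section; it is not a complete configuration:

```yaml
spec:
  ecm_configuration:
    document_processing:
      mongo:
        image:
          repository: cp.icr.io/cp/cp4a/iadp/mongo
          tag: "4.2.12"
          pull_policy: IfNotPresent
        resources:
          requests:
            cpu: 500m
            memory: 256Mi
            ephemeral_storage: 750Mi
          limits:
            cpu: 1
            memory: 1024Mi
            ephemeral_storage: 750Mi
        datavolume:
          # If storage_configuration is set in shared_configuration,
          # the operator creates a PVC with this name.
          existing_pvc_for_mongo_datastore: mongo-datastore
```

If you deploy your own MongoDB, comment out this entire section instead of supplying values.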
Table 4. Git Gateway parameters: spec.ecm_configuration.document_processing
Parameters Description Default/Example Values Required
gitgateway.arch.amd64 The architecture of the cluster. This value is the default for Linux on x86. Do not change it. 3 - Most preferred No
gitgateway.replica_count The number of replicas or pods to be deployed. For high availability in a production environment, 2 replicas or more is the recommended value. 2 No
gitgateway.image.repository The image repository that corresponds to the image registry, which is where the image is pulled. The default repository is the IBM Entitled Registry, cp.icr.io/cp/cp4a/iadp/gitgateway No
gitgateway.image.tag The image tag that corresponds to the image registry. 21.0.2 No
gitgateway.image.pull_policy This pull policy overrides the image pull policy in the shared_configuration section. IfNotPresent No
gitgateway.resources.requests.cpu The initial CPU request. Adjust it to meet your requirements. 500m No
gitgateway.resources.requests.memory The initial memory request. Adjust it to meet your requirements. 512Mi No
gitgateway.resources.limits.cpu The initial CPU limits. Adjust it to meet your requirements. 1 No
gitgateway.resources.limits.memory The initial memory limits. Adjust it to meet your requirements. 1024Mi No
gitgateway.auto_scaling.enabled By default, autoscaling is enabled. Adjust it to meet your requirements. true No
gitgateway.auto_scaling.max_replicas The maximum number of replicas that is allowed. Requires a number if the auto_scaling.enabled parameter is set to true. For more information, see Tuning the Horizontal Pod Autoscaler (HPA). 3 Yes if the auto_scaling.enabled parameter is set to true.
gitgateway.auto_scaling.min_replicas The minimum number of replicas that is allowed. Requires a number if the auto_scaling.enabled parameter is set to true. For more information, see Tuning the Horizontal Pod Autoscaler (HPA). 1 Yes if the auto_scaling.enabled parameter is set to true.
gitgateway.auto_scaling.target_cpu_utilization_percentage The CPU percentage before autoscaling occurs. 80 No
gitgateway.production_setting.log_trace_specification The Git Gateway production log trace. *=info No
gitgateway.production_setting.git_gateway_timeout The Git Gateway timeout setting. 60000 No
gitgateway.datavolume.existing_pvc_for_gitgateway_datastore The Persistent Volume Claim (PVC) for Git Gateway. If the storage_configuration parameter is set in the shared_configuration section, the operator creates the data storage PVC by using the name that is provided. gitgateway-datastore No
gitgateway.probe.readiness.period_seconds Modify the period to meet your requirements. 10 No
gitgateway.probe.readiness.timeout_seconds Modify the timeout to meet your requirements. 10 No
gitgateway.probe.readiness.failure_threshold Modify the failure threshold to meet your requirements. 6 No
gitgateway.probe.liveness.period_seconds Modify the period to meet your requirements. 10 No
gitgateway.probe.liveness.timeout_seconds Modify the timeout to meet your requirements. 5 No
gitgateway.probe.liveness.failure_threshold Modify the failure threshold to meet your requirements. 6 No
gitgateway.probe.startup.initial_delay_seconds The initial delay before the startup probe checks whether the container is started. 120 No
gitgateway.probe.startup.period_seconds The period in seconds. 10 No
gitgateway.probe.startup.timeout_seconds The timeout setting in seconds. 10 No
gitgateway.probe.startup.failure_threshold The threshold number for failures. 6 No
gitgateway.resources.requests.ephemeral_storage Specify an ephemeral storage request for the container. 750Mi No
gitgateway.resources.limits.ephemeral_storage Specify an ephemeral storage limit for the container. 750Mi No
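A minimal sketch of the Git Gateway autoscaling and production settings in the custom resource YAML, using the documented defaults. The values shown are illustrative, not a complete or authoritative configuration:

```yaml
spec:
  ecm_configuration:
    document_processing:
      gitgateway:
        replica_count: 2          # 2 or more recommended for production HA
        auto_scaling:
          enabled: true
          min_replicas: 1         # required when enabled is true
          max_replicas: 3         # required when enabled is true
          target_cpu_utilization_percentage: 80
        production_setting:
          log_trace_specification: "*=info"
          git_gateway_timeout: 60000
```

Note that when auto_scaling.enabled is true, the autoscaler, not replica_count, governs the number of running pods between min_replicas and max_replicas.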
Table 5. Content Designer Service parameters: spec.ecm_configuration.document_processing.cds
Parameters Description Default/Example Values Required
arch.amd64 The architecture of the cluster. This value is the default for Linux on x86. Do not change it. 3 - Most preferred No
replica_count The number of replicas or pods to be deployed. For high availability in a production environment, 2 replicas or more is the recommended value. 2 No
image.repository The image repository that corresponds to the image registry, which is where the image is pulled. cp.icr.io/cp/cp4a/iadp/cds No
image.tag The image tag that corresponds to the image registry. 21.0.2 No
image.pull_policy This pull policy overrides the image pull policy in the shared_configuration section of the custom resource. IfNotPresent No
resources.requests.cpu The initial CPU request. Adjust it to meet your requirements. 500m No
resources.requests.memory The initial memory request. Adjust it to meet your requirements. 512Mi No
resources.limits.cpu The initial CPU limits. Adjust it to meet your requirements. 1 No
resources.limits.memory The initial memory limits. Adjust it to meet your requirements. 1024Mi No
resources.requests.ephemeral_storage Specify an ephemeral storage request for the container. 1Gi No
resources.limits.ephemeral_storage Specify an ephemeral storage limit for the container. 2Gi No
auto_scaling.enabled By default, autoscaling is enabled. Adjust it to meet your requirements. true No
auto_scaling.max_replicas The maximum number of replicas. Requires a number if the auto_scaling.enabled parameter is set to true. For more information, see Tuning the Horizontal Pod Autoscaler (HPA). Yes if the auto_scaling.enabled parameter is set to true.
auto_scaling.min_replicas The minimum number of replicas. Requires a number if the auto_scaling.enabled parameter is set to true. For more information, see Tuning the Horizontal Pod Autoscaler (HPA). Yes if the auto_scaling.enabled parameter is set to true.
auto_scaling.target_cpu_utilization_percentage The CPU percentage before autoscaling occurs. 80 No
production_setting.jvm_customize_options Specify JVM arguments as a comma-separated list. For example, to set the following JVM arguments:
  • -Dmy.test.jvm.arg1=123
  • -Dmy.test.jvm.arg2=abc
  • -XX:+SomeJVMSettings
  • -XshowSettings:vm
set the parameter as follows:
jvm_customize_options="-Dmy.test.jvm.arg1=123,
-Dmy.test.jvm.arg2=abc,
-XX:+SomeJVMSettings,
-XshowSettings:vm"
No
production_setting.license Set the license to use Content Designer Service in a production environment. accept No
monitor_enabled Enable or disable monitoring where metrics can be sent to Graphite or scraped by Prometheus. false No
logging_enabled Enable or disable logging where logs can be sent to Elasticsearch. false No
datavolume.existing_pvc_for_cds_logstore Persistent Volume Claim for Content Designer Service. If the storage_configuration parameter is set in the shared_configuration section of the custom resource, the operator creates the logging storage PVC by using the name that is provided. cds-logstore No
probe.readiness.period_seconds Modify the period to meet your requirements. 10 No
probe.readiness.timeout_seconds Modify the timeout to meet your requirements. 10 No
probe.readiness.failure_threshold Modify the failure threshold to meet your requirements. 6 No
probe.liveness.period_seconds Modify the period to meet your requirements. 10 No
probe.liveness.timeout_seconds Modify the timeout to meet your requirements. 5 No
probe.liveness.failure_threshold Modify the failure threshold to meet your requirements. 6 No
probe.startup.initial_delay_seconds The initial delay before the startup probe checks whether the container is started. 120 No
probe.startup.period_seconds The period in seconds. 10 No
probe.startup.timeout_seconds The timeout setting in seconds. 10 No
probe.startup.failure_threshold The threshold number for failures. 6 No
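The Content Designer Service parameters above, including the JVM customization example, might appear in the custom resource YAML as follows. This is an illustrative sketch with the documented defaults; the JVM argument names are the placeholder examples from the table, not real options you must set:

```yaml
spec:
  ecm_configuration:
    document_processing:
      cds:
        replica_count: 2
        production_setting:
          license: accept
          # Placeholder JVM arguments from the example above, comma separated
          jvm_customize_options: "-Dmy.test.jvm.arg1=123,-Dmy.test.jvm.arg2=abc,-XX:+SomeJVMSettings,-XshowSettings:vm"
        probe:
          readiness:
            period_seconds: 10
            timeout_seconds: 10
            failure_threshold: 6
        datavolume:
          existing_pvc_for_cds_logstore: cds-logstore
```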
Table 6. Content Designer Repository API parameters: spec.ecm_configuration.document_processing.cdra
Parameters Description Default/Example Values Required
arch.amd64 The architecture of the cluster. This value is the default for Linux on x86. Do not change it. 3 - Most preferred No
replica_count The number of replicas or pods to be deployed. The default is one replica. For high availability in a production environment, 2 replicas or more is the recommended value. 1 No
image.repository The image repository that corresponds to the image registry, which is where the image is pulled. The default repository is the IBM Entitled Registry, cp.icr.io/cp/cp4a/iadp/cdra No
image.tag The image tag that corresponds to the image registry. 21.0.2 No
image.pull_policy This pull policy overrides the image pull policy that is specified in the shared_configuration section of the custom resource. IfNotPresent No
resources.requests.cpu The initial CPU request. Adjust it to meet your requirements. 500m No
resources.requests.memory The initial memory request. Adjust it to meet your requirements. 1024Mi No
resources.limits.cpu The initial CPU limits. Adjust it to meet your requirements. 1 No
resources.limits.memory The initial memory limits. Adjust it to meet your requirements. 3072Mi No
resources.requests.ephemeral_storage Specify an ephemeral storage request for the container. 1Gi No
resources.limits.ephemeral_storage Specify an ephemeral storage limit for the container. 1Gi No
auto_scaling.enabled By default, autoscaling is enabled. Adjust it to meet your requirements. true No
auto_scaling.max_replicas The maximum number of replicas. Requires a number if the auto_scaling.enabled parameter is set to true. For more information, see Tuning the Horizontal Pod Autoscaler (HPA). Yes if the auto_scaling.enabled parameter is set to true.
auto_scaling.min_replicas The minimum number of replicas. Requires a number if the auto_scaling.enabled parameter is set to true. For more information, see Tuning the Horizontal Pod Autoscaler (HPA). Yes if the auto_scaling.enabled parameter is set to true.
auto_scaling.target_cpu_utilization_percentage The CPU percentage before autoscaling occurs. 80 No
production_setting.jvm_initial_heap_percentage The initial use of available memory. 66 No
production_setting.jvm_max_heap_percentage The maximum percentage of available memory to use. 66 No
production_setting.jvm_customize_options Specify JVM arguments as a comma-separated list. For example, to set the following JVM arguments:
  • -Dmy.test.jvm.arg1=123
  • -Dmy.test.jvm.arg2=abc
  • -XX:+SomeJVMSettings
  • -XshowSettings:vm
set the parameter as follows:
jvm_customize_options="-Dmy.test.jvm.arg1=123,
-Dmy.test.jvm.arg2=abc,
-XX:+SomeJVMSettings,
-XshowSettings:vm"
No
production_setting.license Set the license to use CDRA in a production environment. accept No
monitor_enabled Enable or disable monitoring where metrics can be sent to Graphite or scraped by Prometheus. false No
logging_enabled Enable or disable logging where logs can be sent to Elasticsearch. false No
collectd_enable_plugin_write_graphite The plug-in for Graphite is enabled or disabled to emit container metrics. false No
datavolume.existing_pvc_for_cdra_cfgstore Persistent Volume Claims for CDRA. If the storage_configuration parameter in the shared_configuration section is configured, the operator creates the configuration storage PVC by using the name that is provided with this parameter. cdra-cfgstore No
datavolume.existing_pvc_for_cdra_logstore Persistent Volume Claim for CDRA. If the storage_configuration parameter is set in the shared_configuration section, the operator creates the logging storage PVC by using the name that is provided with this parameter. cdra-logstore No
probe.readiness.period_seconds Modify the period to meet your requirements. 10 No
probe.readiness.timeout_seconds Modify the timeout to meet your requirements. 10 No
probe.readiness.failure_threshold Modify the failure threshold to meet your requirements. 6 No
probe.liveness.period_seconds Modify the period to meet your requirements. 10 No
probe.liveness.timeout_seconds Modify the timeout to meet your requirements. 5 No
probe.liveness.failure_threshold Modify the failure threshold to meet your requirements. 6 No
probe.startup.initial_delay_seconds The initial delay before the startup probe checks whether the container is started. 120 No
probe.startup.period_seconds The period in seconds. 10 No
probe.startup.timeout_seconds The timeout setting in seconds. 10 No
probe.startup.failure_threshold The threshold number for failures. 6 No
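A sketch of how the CDRA heap and storage parameters above fit into the custom resource YAML, using the documented defaults. Values are illustrative only:

```yaml
spec:
  ecm_configuration:
    document_processing:
      cdra:
        replica_count: 1          # 2 or more recommended for production HA
        production_setting:
          license: accept
          jvm_initial_heap_percentage: 66   # initial share of available memory
          jvm_max_heap_percentage: 66       # maximum share of available memory
        datavolume:
          # If storage_configuration is set in shared_configuration,
          # the operator creates PVCs with these names.
          existing_pvc_for_cdra_cfgstore: cdra-cfgstore
          existing_pvc_for_cdra_logstore: cdra-logstore
```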
Table 7. Content Project Designer Service parameters: spec.ecm_configuration.document_processing.cpds
Parameters Description Default/Example Values Required
arch.amd64 The architecture of the cluster. This value is the default for Linux on x86. Do not change it. 3 - Most preferred No
replica_count The number of replicas or pods to be deployed. The default is one replica. For high availability in a production environment, 2 replicas or more is the recommended value. 1 No
image.repository The image repository that corresponds to the image registry, which is where the image is pulled. cp.icr.io/cp/cp4a/iadp/cpds No
image.tag The image tag that corresponds to the image registry. 21.0.2 No
image.pull_policy This pull policy overrides the image pull policy in the shared_configuration section of the custom resource. IfNotPresent No
resources.requests.cpu The initial CPU request. Adjust it to meet your requirements. 500m No
resources.requests.memory The initial memory request. Adjust it to meet your requirements. 512Mi No
resources.limits.cpu The initial CPU limits. Adjust it to meet your requirements. 1 No
resources.limits.memory The initial memory limits. Adjust it to meet your requirements. 1024Mi No
resources.requests.ephemeral_storage Specify an ephemeral storage request for the container. 1Gi No
resources.limits.ephemeral_storage Specify an ephemeral storage limit for the container. 1Gi No
auto_scaling.enabled By default, autoscaling is enabled. Adjust it to meet your requirements. true No
auto_scaling.max_replicas The maximum number of replicas. Requires a number if the auto_scaling.enabled parameter is set to true. For more information, see Tuning the Horizontal Pod Autoscaler (HPA). Yes if the auto_scaling.enabled parameter is set to true.
auto_scaling.min_replicas The minimum number of replicas. Requires a number if the auto_scaling.enabled parameter is set to true. For more information, see Tuning the Horizontal Pod Autoscaler (HPA). Yes if the auto_scaling.enabled parameter is set to true.
auto_scaling.target_cpu_utilization_percentage The CPU percentage before autoscaling occurs. 80 No
production_setting.jvm_initial_heap_percentage The initial use of available memory. 18 No
production_setting.jvm_max_heap_percentage The maximum percentage of available memory to use. 33 No
production_setting.jvm_customize_options Specify JVM arguments as a comma-separated list. For example, to set the following JVM arguments:
  • -Dmy.test.jvm.arg1=123
  • -Dmy.test.jvm.arg2=abc
  • -XX:+SomeJVMSettings
  • -XshowSettings:vm
set the parameter as follows:
jvm_customize_options="-Dmy.test.jvm.arg1=123,
-Dmy.test.jvm.arg2=abc,
-XX:+SomeJVMSettings,
-XshowSettings:vm"
No
production_setting.license Set the license to use CPDS in a production environment. accept No
repo_service_url The repository service URL.
  • For a development environment, the default is "https://{{ meta.name }}-cdra-svc:9443/capital"
  • For a runtime environment, the default is "https://{{ meta.name }}-cdra-svc:9443/cdapi"
No
monitor_enabled Enable or disable monitoring for CPDS. false No
logging_enabled Enable or disable logging for CPDS. false No
datavolume.existing_pvc_for_cpds_cfgstore Persistent Volume Claims for CPDS. If the storage_configuration parameter is set in the shared_configuration section, the operator creates the configuration storage PVC by using the name that is provided with this parameter. cpds-cfgstore No
datavolume.existing_pvc_for_cpds_logstore Persistent Volume Claim for CPDS. If the storage_configuration parameter is set in the shared_configuration section, the operator creates the logging storage PVC by using the name that is provided with this parameter. cpds-logstore No
probe.readiness.period_seconds Modify the period to meet your requirements. 10 No
probe.readiness.timeout_seconds Modify the timeout to meet your requirements. 10 No
probe.readiness.failure_threshold Modify the failure threshold to meet your requirements. 6 No
probe.liveness.period_seconds Modify the period to meet your requirements. 10 No
probe.liveness.timeout_seconds Modify the timeout to meet your requirements. 5 No
probe.liveness.failure_threshold Modify the failure threshold to meet your requirements. 6 No
probe.startup.initial_delay_seconds The initial delay before the startup probe checks whether the container is started. 120 No
probe.startup.period_seconds The period in seconds. 10 No
probe.startup.timeout_seconds The timeout setting in seconds. 10 No
probe.startup.failure_threshold The threshold number for failures. 6 No
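The repo_service_url defaults above differ between development and runtime environments. A minimal illustrative sketch of the CPDS section, using the documented runtime default (switch the URL path to /capital for a development environment):

```yaml
spec:
  ecm_configuration:
    document_processing:
      cpds:
        production_setting:
          license: accept
          jvm_initial_heap_percentage: 18
          jvm_max_heap_percentage: 33
        # Runtime default; for development the default path is /capital
        repo_service_url: "https://{{ meta.name }}-cdra-svc:9443/cdapi"
        datavolume:
          existing_pvc_for_cpds_cfgstore: cpds-cfgstore
          existing_pvc_for_cpds_logstore: cpds-logstore
```

The `{{ meta.name }}` placeholder is resolved by the operator to the name of your custom resource instance.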
Table 8. Viewer Service parameters: spec.ecm_configuration.document_processing.viewone
Parameters Description Default/Example Values Required
arch.amd64 The architecture of the cluster. This value is the default for Linux on x86. Do not change it. 3 - Most preferred No
replica_count The number of replicas or pods to be deployed. For high availability in a production environment, 2 replicas or more is the recommended value. 2 No
image.repository The image repository that corresponds to the image registry, which is where the image is pulled. cp.icr.io/cp/cp4a/iadp/viewone No
image.tag The image tag that corresponds to the image registry. 21.0.2 No
image.pull_policy This pull policy overrides the image pull policy in the shared_configuration section of the custom resource. IfNotPresent No
resources.requests.cpu The initial CPU request. Adjust it to meet your requirements. 500m No
resources.requests.memory The initial memory request. Adjust it to meet your requirements. 1024Mi No
resources.limits.cpu The initial CPU limits. Adjust it to meet your requirements. 1 No
resources.limits.memory The initial memory limits. Adjust it to meet your requirements. 4096Mi No
resources.requests.ephemeral_storage Specify an ephemeral storage request for the container. 1Gi No
resources.limits.ephemeral_storage Specify an ephemeral storage limit for the container. 1Gi No
auto_scaling.enabled By default, autoscaling is enabled. Adjust it to meet your requirements. true No
auto_scaling.max_replicas The maximum number of replicas. Requires a number if the auto_scaling.enabled parameter is set to true. For more information, see Tuning the Horizontal Pod Autoscaler (HPA). Yes if the auto_scaling.enabled parameter is set to true.
auto_scaling.min_replicas The minimum number of replicas is 1 but 2 is recommended. Requires a number if the auto_scaling.enabled parameter is set to true. For more information, see Tuning the Horizontal Pod Autoscaler (HPA). Yes if the auto_scaling.enabled parameter is set to true.
auto_scaling.target_cpu_utilization_percentage The CPU percentage before autoscaling occurs. 80 No
production_setting.font_sync_interval_min Font synchronization interval in minutes. 0 No
production_setting.add_env_variables A comma-separated list of key-value pairs to set as environment variables before the container starts. For example: ADD_ENV_VARIABLES="ENV_VAR1=VALUE1,ENV_VAR2=VALUE2". No
production_setting.jvm_initial_heap_percentage The initial use of available memory. 40 No
production_setting.jvm_max_heap_percentage The maximum percentage of available memory to use. 66 No
production_setting.jvm_customize_options Specify JVM arguments as a comma-separated list. For example, to set the following JVM arguments:
  • -Dmy.test.jvm.arg1=123
  • -Dmy.test.jvm.arg2=abc
  • -XX:+SomeJVMSettings
  • -XshowSettings:vm
set the parameter as follows:
jvm_customize_options="-Dmy.test.jvm.arg1=123,
-Dmy.test.jvm.arg2=abc,
-XX:+SomeJVMSettings,
-XshowSettings:vm"
No
monitor_enabled Enable or disable monitoring where metrics can be sent to Graphite or scraped by Prometheus. false No
logging_enabled Enable or disable logging where logs can be sent to Elasticsearch. false No
datavolume.existing_pvc_for_viewone_cacherootstore Persistent volume claim for the cache root store. If the storage_configuration parameter in the shared_configuration section is configured, the operator creates the configuration storage PVC by using the name that is provided with this parameter. viewone-cacherootstore No
datavolume.existing_pvc_for_viewone_docrepositoryrootstore Persistent volume claim for document repository rootstore. viewone-docrepositoryrootstore No
datavolume.existing_pvc_for_viewone_workingpathstore Persistent volume claim for working pathstore. viewone-workingpathstore No
datavolume.existing_pvc_for_viewone_externalresourcepathstore Persistent volume claim for external resource pathstore. viewone-externalresourcepathstore No
datavolume.existing_pvc_for_viewone_logsstore Persistent volume claim for logstore. viewone-logsstore No
datavolume.existing_pvc_for_viewone_customerfontsstore Persistent volume claim for customer fonts store. viewone-customerfontsstore No
datavolume.existing_pvc_for_viewone_configstore Persistent Volume Claims for configstore. viewone-configstore No
probe.readiness.period_seconds Modify the period to meet your requirements. 10 No
probe.readiness.timeout_seconds Modify the timeout to meet your requirements. 10 No
probe.readiness.failure_threshold Modify the failure threshold to meet your requirements. 6 No
probe.liveness.period_seconds Modify the period to meet your requirements. 10 No
probe.liveness.timeout_seconds Modify the timeout to meet your requirements. 5 No
probe.liveness.failure_threshold Modify the failure threshold to meet your requirements. 6 No
probe.startup.initial_delay_seconds The initial delay before the startup probe checks whether the container is started. 120 No
probe.startup.period_seconds The period in seconds. 10 No
probe.startup.timeout_seconds The timeout setting in seconds. 10 No
probe.startup.failure_threshold The threshold number for failures. 6 No
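A sketch of the Viewer Service section in the custom resource YAML, showing the add_env_variables format and the per-store PVC names from the table above. Values are the documented defaults and the environment variable names are placeholders, not real required variables:

```yaml
spec:
  ecm_configuration:
    document_processing:
      viewone:
        replica_count: 2
        production_setting:
          font_sync_interval_min: 0
          # Placeholder key-value pairs; set before the container starts
          add_env_variables: "ENV_VAR1=VALUE1,ENV_VAR2=VALUE2"
          jvm_initial_heap_percentage: 40
          jvm_max_heap_percentage: 66
        datavolume:
          existing_pvc_for_viewone_cacherootstore: viewone-cacherootstore
          existing_pvc_for_viewone_docrepositoryrootstore: viewone-docrepositoryrootstore
          existing_pvc_for_viewone_workingpathstore: viewone-workingpathstore
          existing_pvc_for_viewone_logsstore: viewone-logsstore
          existing_pvc_for_viewone_configstore: viewone-configstore
```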