Configuring Document Processing

Before you install, configure the custom resource YAML file for your Document Processing deployment.

Before you begin

Make sure you follow the instructions in Preparing to install Document Processing, where you configure important settings such as secrets, databases, and user access.

About this task

The containers that you deploy for Automation Document Processing depend on the target environment. If you use the pattern deployment script to prepare your CR file, you specify one of the following options:
  • Authoring or Development Environment - The authoring environment that includes all of the containers for creating, testing, and deploying your document processing project and application.
  • Runtime Environment - The test, staging, and production environment that includes a subset of containers for deploying and hosting your document processing application and data.
Parameters that need custom values have a value of <Required>, which you must update with a real value before you apply the CR. You provide values for the environment that you prepared in advance:
  • Databases
  • LDAP settings
  • Names of the secrets that you prepared in advance to protect sensitive data
  • Persistent volumes and claims, or storage classes for dynamic storage creation
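One way to confirm that no placeholder is left behind is to search the CR file for the literal string <Required> before you apply it. The following shell sketch assumes that the CR is saved to a local file; the helper name find_required and the file name my_cr.yaml are hypothetical and are not part of the product.

```shell
#!/bin/sh
# List every line of a CR file that still holds the <Required> placeholder.
# find_required is a hypothetical helper name used for illustration only.
find_required() {
  cr_file="$1"
  # -n prints line numbers; -F treats the pattern as a fixed string.
  if grep -nF '<Required>' "$cr_file"; then
    echo "Placeholders remain in $cr_file" >&2
    return 1
  fi
  echo "No <Required> placeholders found in $cr_file"
}
```

For example, run find_required my_cr.yaml after you finish editing. A nonzero exit status means that at least one parameter still needs a real value.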
Note:

If you chose EDB Postgres as your deployment database and you want to use it for Document Processing, check that the datasource_configuration.dc_gcd_datasource.dc_use_postgres, datasource_configuration.dc_os_datasource.dc_use_postgres, datasource_configuration.dc_icn_datasource.dc_use_postgres, and datasource_configuration.dc_ca_datasource.dc_use_postgres parameters are set to true.

If you do not want to use EDB Postgres for Document Processing, set the values to false. Then check the values of the external database parameters that you generated when you ran the cp4a-prerequisites.sh script.
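Because the dc_use_postgres parameter appears under several data sources, it can help to list the distinct values that the CR file contains. The following shell sketch is one possible check, assuming that each setting sits on its own line in the form dc_use_postgres: value; the helper name postgres_flag_values is hypothetical.

```shell
#!/bin/sh
# Print the distinct values assigned to dc_use_postgres in a CR file.
# postgres_flag_values is a hypothetical helper name used for illustration.
postgres_flag_values() {
  # Keep only active (uncommented) dc_use_postgres lines, strip the key,
  # any trailing comment, and quotes, then reduce to the unique values.
  grep -E '^[[:space:]]*(-[[:space:]]*)?dc_use_postgres:' "$1" |
    sed -E 's/^.*dc_use_postgres:[[:space:]]*//; s/[[:space:]]*(#.*)?$//' |
    tr -d "\"'" |
    sort -u
}
```

If the output is the single line true, every data source is set to use EDB Postgres. More than one line of output means that the settings are mixed and need a second look.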

Remember: The Document Processing environment includes other capabilities. When you configure your CR YAML updates, use the information that is provided for the other capabilities to ensure that all of your values are complete.

Procedure

  1. Open the YAML file that you created in Option 1a: (Recommended) Generating the custom resource with the deployment script.
  2. Confirm that your pattern value is set for Document Processing:
    sc_deployment_patterns: document_processing
  3. Confirm or add the appropriate settings for the environment you want to deploy:
    For the development or authoring environment:
    sc_optional_components: ae_data_persistence, document_processing_designer
    For the Runtime environment:
    sc_optional_components: ae_data_persistence, document_processing_runtime

    If you are deploying a Runtime environment, import the authoring environment's Zen route certificate to the runtime environment. For more information, see Importing the Zen route certificate to the runtime environment.

    You can also use the sc_optional_components parameter to include Content Search Services (cs) and Content Management Interoperability Services (cmis) in your deployment.
  4. Configure your GPU settings for the Deep Learning container:
    If your worker nodes have NVIDIA GPUs to run the Deep Learning container, confirm or set the following Deep Learning parameters:
    ca_configuration:
        # Deep Learning configuration
        deeplearning:
          gpu_enabled: true # true or false. Set it to true if you have GPU-enabled worker nodes.
          nodelabel_key: nvidia.com/gpu.present # The unique node label key on the GPU node. For example: ibm-cloud.kubernetes.io/gpu-enabled. Set this value when `gpu_enabled` is set to true.
          nodelabel_value: true # The node label value on the GPU node. For example: "true". Set this value when `gpu_enabled` is set to true.
          replica_count: 1 # If gpu_enabled is set to true, you need at least 2 GPUs to achieve an HA configuration with 2 replicas.
    You can use the following command to see the labels that are available on your worker nodes:
    oc get nodes --show-labels

    You can also view node information in the OpenShift console.

  5. Set the add-ons in the Initialization section.
    If you plan to use the Content Event Webhook feature, add the Event-Driven External Service Invocation Extensions in the add-ons list along with the other add-ons that are required for Document Processing:
    ic_obj_store_creation.object_stores.oc_cpe_obj_store_addons_list: "{CE554ADD-0000-0000-0000-000000000021}"
    - "{CE460ADD-0000-0000-0000-000000000004}"
    - "{CE460ADD-0000-0000-0000-000000000001}"
    - "{CE460ADD-0000-0000-0000-000000000003}"
    - "{CE511ADD-0000-0000-0000-000000000006}"
    - "{CE460ADD-0000-0000-0000-000000000008}"
    - "{CE460ADD-0000-0000-0000-000000000007}"
    - "{CE460ADD-0000-0000-0000-000000000009}"
    - "{CE460ADD-0000-0000-0000-00000000000B}"
    - "{CE460ADD-0000-0000-0000-00000000000D}"
  6. Set all other required configuration parameters. For a complete list, see Document Processing configuration parameters.
  7. Complete the sections for the other included capabilities, as applicable:
    Important: For Automation Document Processing, some of the database and data source settings in the custom resource YAML have specific naming requirements. When you supply parameter values in the YAML, keep the following requirements in mind:
    ## For the "document_processing" pattern, the following parameters below MUST remain unchanged and have the default value:
          ##   dc_os_label: "devos1" 
          ##   dc_common_os_datasource_name: "DEVOS1DS"
          ##   dc_common_os_xa_datasource_name: "DEVOS1DSXA"
    ***
    ## For the "document_processing" pattern, the following parameters below MUST remain unchanged and have the default value:
          ##   dc_os_label: "aeos"
          ##   dc_common_os_datasource_name: "AEOS"
          ##   dc_common_os_xa_datasource_name: "AEOSXA"
    Requirements for parameter values:
    • Authoring/Development environment: Datasource values as specified in the example must use the prepared default values.
    • Runtime environment: Datasource values for the devos1 object store can be updated to any value, but the values for the aeos object store must continue to use the prepared default values.
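The checks in steps 2 to 4 can be sketched as a few shell helpers that you run against the CR file before you apply it. This is a minimal sketch under stated assumptions: the helper names has_adp_pattern, adp_environment, and gpu_nodes are hypothetical, the text matching assumes that each parameter is on a single line as shown in the procedure, and the default label is the nvidia.com/gpu.present key from the example. The -l (label selector) and -o name options are standard oc options.

```shell
#!/bin/sh
# Hypothetical helpers for reviewing a CR file before it is applied.

# Step 2: succeed only if the pattern list includes document_processing.
has_adp_pattern() {
  grep -Eq '^[[:space:]]*sc_deployment_patterns:.*document_processing' "$1"
}

# Step 3: report which environment the optional components select.
adp_environment() {
  line=$(grep -E '^[[:space:]]*sc_optional_components:' "$1" | head -n 1)
  case "$line" in
    *document_processing_designer*) echo "authoring" ;;
    *document_processing_runtime*)  echo "runtime" ;;
    *) echo "unknown"; return 1 ;;
  esac
}

# Step 4: list only the nodes that carry the GPU label.
# Arguments: label key (default nvidia.com/gpu.present), label value (default true).
gpu_nodes() {
  oc get nodes -l "${1:-nvidia.com/gpu.present}=${2:-true}" -o name
}
```

For example, has_adp_pattern my_cr.yaml && adp_environment my_cr.yaml prints authoring or runtime when the pattern and optional components are set as described, and gpu_nodes lists the worker nodes whose label matches the nodelabel_key and nodelabel_value settings.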

What to do next

Continue to configure the other capabilities that are in your CR file, and make sure that you complete the last step, Validating the YAML in your custom resource file, before you apply the CR to the operator.