Creating access definitions

In IBM® Optim Archive, access definitions specify the scope of data to be extracted from source systems. Access definitions define the tables and related tables to be used during extract operations.

Access definitions can ensure consistent and accurate data extraction across your flows. You can use access definitions to identify and define the scope of data to be extracted from source connections or archives.

Who can complete this task?

Administrators or users can create access definitions. Standard users can see and work only with the access definitions that they created. Administrators can see and work with all access definitions in the system, regardless of who created them.

Before you begin

Before you define access conditions, ensure that there is a valid connection to the source database that you want to use in this flow.

Creating access definitions by using the flow builder

You can define access definitions from within the flow builder tool by dragging an Access definition node onto a flow builder canvas. Access definition nodes can follow a source Connection or Archive node in a flow or, if an access definition is already defined based on an existing source, then you can start with the access definition.

Figure 1. Access definitions are the second step in the flow of data in IBM Optim
Data flows from the source, through the access definition, then undergoes subsetting and masking before going to the target

To add an access definition to a flow:

  1. From the navigation menu, click Workflow icon Workflow > Flow builder, then open a flow builder canvas:
    • Load an existing flow from the My flows tab.
      OR
    • Click New flow and add a Connection node to the canvas.
  2. From the Nodes tab, drag an existing Access definition node onto the canvas or click Access definition > New access definition.
  3. Enter a name and an optional description for the access definition. Click Next.
  4. Browse the existing connections to select a base table for this access definition. Click Next.
  5. Optionally, select Include related tables, then configure whether you want this access definition to include related tables, and define the types of relationships you want to include.
  6. Optionally, select Include reference tables to include unrelated tables that contain data that is needed for context. Click Select reference tables, then choose the additional tables that you want to include. Click Next.
  7. Optionally, Select related tables and columns to exclude from each table in your access definition. For more details, see Managing column exclusion in access definitions.
  8. Optionally, you can create relationships in the Relationship Page.
  9. Review the access definition that you have configured, then click Create.
  10. Click Save to save the flow.
  11. Finish building the flow by adding more nodes as necessary. For more information about creating flows, see Building an IBM Optim Archive flow.

Managing column exclusion in access definitions

Column exclusion provides granular control over which columns are included in data extraction operations. You can selectively include or exclude specific columns from tables in access definitions to control which data is extracted and to improve data privacy compliance.

This capability helps organizations comply with data privacy requirements by excluding sensitive columns and reduces data exposure in non-production environments. You can manage column selections for source tables (base and reference tables) and related tables that are discovered through table dependencies.

You can exclude columns for base tables directly within the base table pane itself, without needing to navigate to a separate location. For the initial configuration, set up your base table (start table), and exclude columns directly in the base table configuration pane. After the initial setup, review and edit the excluded columns for the base table only in the Source Tables tab within the Exclude Columns pane.

To manage column exclusion in access definitions:

  1. Click the Source Tables tab to manage columns for your base table and any reference tables you have included.
  2. For each source table, click Select columns to open the column selection modal.
  3. In the column selection modal, review the list of available columns and deselect any columns that you want to exclude from the data extraction. Selected columns are included in the data extraction; deselected columns are excluded.
  4. Click Save to apply your column selections for the current table. The system displays the number of excluded columns next to the table name in expandable text format.
  5. Click the Related Tables tab to manage columns for tables discovered through relationship dependencies (parent and child tables).
  6. Repeat the column selection process for each related table by clicking Select columns, deselecting columns to exclude, and clicking Apply.
  7. Review the access definition summary to verify that the correct columns are excluded from each table. Excluded columns are tracked automatically and displayed in the access definition details view.
  8. Click Next to save the access definition with your column exclusion settings.

When you run jobs by using an access definition with column exclusion settings, only the selected columns are included in the data extraction, and excluded columns are omitted to protect sensitive data and reduce data set size.

Best practices for column exclusion:

  • Identify and exclude columns containing personally identifiable information (PII), financial data, or other sensitive information.
  • Maintain data utility by ensuring that excluded columns do not break referential integrity or remove data that is needed for operations.
  • Document which columns are excluded and why to support compliance audits and team collaboration.
  • Review column exclusions periodically to ensure that they align with current data privacy requirements and operational needs.

Creating access definitions by using the API

  • For API usage, ensure that the following prerequisites are in place:
    • You have a valid access token (<accessToken>). For more information, see Retrieving access tokens for API usage.
    • You have the connection profile name (<connProfileName>), schema name (<schemaName>), and table names.

To create an access definition by using the API:

  1. To create an access definition programmatically, send a POST request to https://VM_HOSTNAME:PORT/optim/v1/flow with a pipeline definition that includes source tables and relationships.
    Headers
    
    Authorization: Bearer <accessToken>
    Content-Type: application/json
          

    To obtain the source and target connection profiles (including encrypted passwords) required for the job JSON, use the following API endpoint:

    Replace {name} with the name of the connection profile. Be either the profile owner or an admin to retrieve this information.

    
    curl -k -X GET \
      -H "Content-Type: application/json" \
      -H "Authorization: Bearer <accessToken>" \
      "https://VM_HOSTNAME:7725/optim/v1/connprofiles/{name}"
    
    Sample request body
    
    {
      "id": "7e5ce0c7-bb3c-4c11-85d6-db34ea08ee8e",
      "version": "3.0",
      "app_data": {
        "optim": {
          "version": "1.2.0"
        }
      },
      "doc_type": "pipeline",
      "pipelines": [
        {
          "id": "e7fe76ec-662e-44e5-b514-bee48382d13c",
          "name": "Workflow-Sample",
          "nodes": [
            {
              "id": "11ec9ef1-2cc2-4dce-8008-a8ddb7a225c5",
              "op": "connection_profile",
              "type": "binding",
              "inputs": [],
              "outputs": [],
              "connection": {
                "ref": "f4dc15d1-25ee-4a20-800c-caa0d439e451",
                "name": "PgSQL"
              },
              "parameters": {
                "connection_type": "source"
              }
            },
            {
              "id": "28f22b9d-36cb-4eda-ad1a-4af5493d55e8",
              "op": "access_definition",
              "type": "execution_node",
              "app_data": {
                "source_tables": [
                  {
                    "id": "e30ed315-a011-4fd1-9712-a93ad792a7f9",
                    "schema": "public",
                    "database": "optim_test",
                    "table_name": "products",
                    "connection_profile": "f4dc15d1-25ee-4a20-800c-caa0d439e451"
                  }
                ]
              }
            },
            {
              "id": "57aa2b91-8284-4156-a097-1cf6be6369f2",
              "op": "connection_profile",
              "type": "binding",
              "inputs": [],
              "outputs": [],
              "connection": {
                "ref": "6a7b49eb-4de8-4390-9431-91bfe08fd0b6",
                "name": "DB2"
              },
              "parameters": {
                "connection_type": "target"
              }
            },
            {
              "id": "570e391b-2202-4648-a351-59a7c7755801",
              "op": "insert_definition",
              "type": "execution_node",
              "app_data": {
                "table_map": [
                  {
                    "id": "15b4b9a0-611a-4ac0-871f-cf9d8988fee1",
                    "source": {
                      "schema": "public",
                      "database": "optim_test",
                      "table_name": "products",
                      "source_table_id": "e30ed315-a011-4fd1-9712-a93ad792a7f9",
                      "connection_profile": "f4dc15d1-25ee-4a20-800c-caa0d439e451"
                    },
                    "target": {
                      "schema": "demo_target",
                      "database": "testdb",
                      "table_name": "products",
                      "connection_profile": "6a7b49eb-4de8-4390-9431-91bfe08fd0b6"
                    }
                  }
                ],
                "insert_parameters": {
                  "relational_integrity": {
                    "conflict_resolution": "merge"
                  }
                }
              }
            }
          ],
          "app_data": {
            "ui_data": {
              "comments": [],
              "worksheet_type": "accessDefinition"
            }
          },
          "runtime_ref": "optim-runtime:1.2.0"
        }
      ],
      "parameters": {},
      "json_schema": "https://api.dataplatform.ibm.com/schemas/common-pipeline/pipeline-flow/pipeline-flow-v3-schema.json",
      "primary_pipeline": "e7fe76ec-662e-44e5-b514-bee48382d13c"
    }
    
    

    An access definition is created and saved, specifying the tables and relationships for data extraction. You can retrieve, update, or preview the definition by using the corresponding API endpoints.

  2. To preview the access definition and verify relationships, send a GET request to https://VM_HOSTNAME:PORT/optim/v1/flow/preview/{flowId}
    
    curl -X GET https://<host>/optim/v1/flow/preview/{flowId} \
      -H "Authorization: Bearer <accessToken>" 
    

Excluding columns by using the API

By default, all columns are included. To include only specific columns, specify their names in the columns field of the access definition request body. If a column is excluded, no data is read from or written to it. As a result:

  • For database targets, the underlying database populates the column with its default value (typically NULL).
  • For file-based targets, the column is omitted from the generated Parquet files.

Excluded columns remain part of the schema by default, for all targets. To remove them entirely from the target schema, set the exclude_columns_on_create field to "true". This ensures the column is not created in the target schema.

Sample JSON request body:

{
    "id": "access-definition-node",
    "type": "execution_node",
    "op": "access_definition",
    "app_data": {
        "source_tables": [
            {
                "id": "emp-table-id-db2f",
                "database": "${dbName}",
                "table_name": "employee_data",
                "schema": "testschema_db2file_src",
                "connection_profile": "source-db-ref-002",
                "columns": ["emp_id", "first_name", "last_name"],
                "exclude_columns_on_create": "true"
            }
        ]
    }
    ...
}