DataStage repository

You can store DataStage job run metrics and indexed assets in a separate repository.

Prerequisites

PostgreSQL is the designated database for storing metrics data and is not intended as an operational data store. With PostgreSQL, you can run your own queries for insights into job performance and indexed assets. You can host the PostgreSQL database within the same IBM Cloud Pak® for Data environment, run it on a virtual machine, or use a managed PostgreSQL service.
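
As a starting point for writing your own queries, the following sketch lists the tables in the ds_metrics and indexer schemas, which are the two schema names used on this page. It relies only on the standard PostgreSQL information_schema and does not assume any particular table names. It returns rows only after the repository has been initialized, and only for tables that the connected user can access.

```sql
-- List the tables in the ds_metrics and indexer schemas.
-- No table names are assumed; the query discovers whatever exists.
select table_schema, table_name
from information_schema.tables
where table_schema in ('ds_metrics', 'indexer')
order by table_schema, table_name;
```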

Creating a connection

Under the Manage tab of your Cloud Pak for Data project, go to DataStage > Repository. You can configure the connection to manage the DataStage repository and to enable persisting metrics and asset indexing. Specify a connection type, configure properties and security details, and test the connection to verify that it works.

Note: If you want to enable ds-metrics for your database without create-schema or create-table permissions on non-FIPS clusters, you must first manually initialize the database. For more information, see Setting up ds-metrics on a FIPS-tolerant or FIPS-enabled cluster.

You must use either a clean database or one that was previously initialized by ds-metrics. To clear any previous DataStage job run metrics data from the ds_metrics schema, run the following commands:

drop schema if exists ds_metrics cascade;
drop table if exists public.databasechangelog;
drop table if exists public.databasechangeloglock;

To clear any previous DataStage indexed asset data from the indexer schema, run the following command:

drop schema if exists indexer cascade;
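
To confirm that a cleanup left nothing behind, you might run a check like the following sketch. It uses the PostgreSQL system catalog pg_namespace and the built-in to_regclass function, and it assumes only the schema and table names that appear in the drop commands on this page.

```sql
-- Schemas that still exist; an empty result means both were dropped.
select nspname
from pg_catalog.pg_namespace
where nspname in ('ds_metrics', 'indexer');

-- Each column is null when the corresponding table no longer exists.
select to_regclass('public.databasechangelog') as changelog,
       to_regclass('public.databasechangeloglock') as changelog_lock;
```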

The repository username that you specify must have permission both to create schemas in the database and to create tables in the public schema. To check the permissions, connect to the database as the specified repository username and run the following query. If both columns in the result are true, the necessary permissions exist:

select has_database_privilege(current_database(), 'create'), has_schema_privilege('public', 'create');
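
If either column is false, a database owner or superuser can grant the missing privilege. The following is a minimal sketch that uses standard PostgreSQL GRANT statements; the names ds_repo_db and ds_repo_user are hypothetical placeholders for your own database and repository username. Note that PostgreSQL 15 and later no longer grant CREATE on the public schema to all users by default, so the second statement is often needed on newer servers.

```sql
-- Hypothetical names: replace ds_repo_db and ds_repo_user with your own.
-- Run these as the database owner or a superuser.
grant create on database ds_repo_db to ds_repo_user;  -- allows creating schemas
grant create on schema public to ds_repo_user;        -- allows creating tables in public
```

After granting, reconnect as the repository username and rerun the permission check to confirm that both columns are true.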

Note: Changes to the repository connection or to the persisting metrics or asset indexing settings might take several minutes to take effect.

Learn more

Storing DataStage job run metrics in the DataStage repository