Storing and persisting metrics
You can use the metrics repository to store metrics in a database.
With the ds-metrics
service, you can set up metrics storage at the project level
for all your DataStage® flows. While metrics are enabled
by default for your flows, you must manually enable persisting metrics at a project level.
Creating a connection
Under the Manage tab of your Cloud Pak for Data project, go to DataStage > Metrics Repository and enable persisting metrics. Specify a connection type, configure properties and security details, and test the connection to verify that it works.
ds-metrics
. To ensure that a database is clean and to clear any previous
ds-metrics
data, run the following
commands.drop schema if exists ds_metrics cascade;
drop table if exists public.databasechangelog;
drop table if exists public.databasechangeloglock;
To use a database with
ds-metrics
, the database user must have both the permission to create schemas in
the database, and the permission to create tables in the public
schema. To check
for permissions to create tables and schemas, log in to the database and run the following query as
the same user who requires the
permissions.select has_database_privilege(current_database(), 'create'), has_schema_privilege('public', 'create');
if
you have create-schema and create-table permissions, the query returns true,
true
.Metrics retention
The Metrics retention field controls how frequently, and if at all,
ds-metrics
removes old data from the database. If you enable the Max
limit option, you can choose between the Days or
Runs settings. For example, if you set 30 days, metrics deletes all
information that is related to the job and to all its runs when a job is over 30 days old. If you
set 100 runs, metrics keeps the 100 most recent runs, and deletes the older runs. The
Metrics retention field applies only to metrics data related to the project
in which the settings are activated. If the database is shared with another project, the retention
settings do not impact the data from that project.
After you save the connection settings, it triggers ds-metrics
to clear its
connection cache. When you run a job in the project, ds-metrics
connects to the
database for the first time. After a connection is made, it populates the
ds-metrics
schema after it is initialized, or migrates it if the schema in the
database is an older version.
Database structure
In the metrics database, most tables are in the ds-metrics
schema.
public.databasechangelog
and public.databasechangeloglock
are
extra internal tables in the public schema that are used to track the database schema versions.
Table schemas
For tables, ds_metrics
schema is used as the main schema.
job
- Stores information about the job when a job run starts.
job_run
- Stores information about the job run while a job is running.
job_run_parameter
- Stores parameters for a job run when a job run starts.
job_run_log
- Stores non-info log lines from a job run when a job run ends.
job_run_stage_metrics
- Stores metrics for stages in a job run while a job is running.
job_run_link_metrics
- Stores metrics for links in a job run while a job is running.
version
- Stores the version of
ds-metrics
that last initialized the database, unless the database was set up manually.
instance
- Stores information about the px-runtime instances.
pod
- Stores information about the px-runtime conductor and compute pods.
pod_disk
- Stores information about disks mounted to the pods.
pod_metrics
- Stores CPU and memory metrics over time.
disk_metrics
- Stores disk size metrics over time.