Metric tracing tags

IBM® Cloud Pak for Network Automation microservices produce metric data that contains tracing tags that can be used to categorize intent requests.

Categorizing the metrics that are produced by intent operations can be useful if you need to troubleshoot or monitor the performance of your system. For example, if you have multiple client applications that are making REST requests to IBM Cloud Pak for Network Automation, you might want to group the intents that are sent from each client to check how much load each client is putting on the system.

Adding tag headers to the tracing context

The microservices produce timers, counters, and gauges, collectively known as metrics, for each intent request. If you want to categorize your metrics, you can add custom tag headers to the tracing context for your original REST API requests.
Note: Only the native IBM Cloud Pak for Network Automation APIs support the processing of tracing tags. If you add the tag headers to the tracing context for other APIs, such as SOL 005 APIs, the metric data that is produced does not contain tracing tags.

Any header in the tracing context that uses the "x-tracectx-tag-<name>": "<value>" format is written to the metrics for the intent request. You can add multiple tag headers to the tracing context. The trace tags are added to a single label in the metrics, named traceTags, and the tags are sorted alphabetically. The traceTags label contains a comma-separated list of the name-value pairs of the corresponding tracing context headers.
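
The header format can be sketched in Python as follows. This is a minimal illustration, not part of the product: the build_tag_headers function and the restriction on tag names are assumptions made for the example, and only the x-tracectx-tag- prefix comes from the format that is described here. The sketch builds the headers only; it does not send a request.

```python
import re

# Prefix of the tag headers, as described in the documentation.
TAG_HEADER_PREFIX = "x-tracectx-tag-"

# Assumption for this sketch: tag names contain only letters, digits,
# hyphens, and underscores. The product might accept other characters.
_NAME_PATTERN = re.compile(r"^[A-Za-z0-9_-]+$")


def build_tag_headers(tags):
    """Return a dict of tracing tag headers for the given name-value pairs.

    Hypothetical helper: one header is created per tag.
    """
    headers = {}
    for name, value in tags.items():
        if not _NAME_PATTERN.match(name):
            raise ValueError(f"unsupported tag name: {name!r}")
        headers[TAG_HEADER_PREFIX + name] = str(value)
    return headers


headers = build_tag_headers({"source": "portal", "reason": "order"})
print(headers)
```

The resulting dict can be merged into the headers of a REST request that is made with any HTTP client.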

In the following examples, tag headers are added to the tracing context so that the metrics can be categorized:
  • If you want to categorize the metrics for a particular operation by the request source and the reason for the request, you might add the following tag headers to the tracing context:
    "x-tracectx-tag-source": "portal"
    "x-tracectx-tag-reason": "order"
    For these tag headers, the value "source:portal,reason:order" is written to the traceTags label in the metrics. The metric output might look like this:
    assembly_create_seconds_sum{application="Galileo",instance="Galileo:cp4na-o-galileo-0",root="lm",
                                server="worker1.abc.example.com",stage="total",traceTags="source:portal,reason:order",} 5.0
  • If you want to categorize successful rollback requests by the reason for the rollback, you might add tag headers like this to the tracing context:
    "x-tracectx-tag-rollback-reason": "too_long_to_resolve"
    When the rollback is successful, the intents metric count is incremented and the following traceTags label is added to the metric:
    "tag":"traceTags","values":["","rollback-reason:too_long_to_resolve"]
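
If you process the metrics yourself, the traceTags values in the preceding examples can be split back into name-value pairs. The following Python sketch is one way to do that. The parse_trace_tags function is hypothetical, and it assumes that tag names contain no colons and that tag values contain no commas.

```python
def parse_trace_tags(label_value):
    """Split a traceTags label value into a dict of tag names and values.

    Assumptions for this sketch: pairs are separated by commas, and the
    first colon in each pair separates the name from the value.
    """
    tags = {}
    for pair in label_value.split(","):
        if not pair:
            continue  # skip empty entries, such as an empty label value
        name, _, value = pair.partition(":")
        tags[name] = value
    return tags


print(parse_trace_tags("source:portal,reason:order"))
print(parse_trace_tags("rollback-reason:too_long_to_resolve"))
```

Because the result is a dict, the sketch does not depend on the order in which the tags appear in the label.
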
Warning: Do not use unique IDs as tracing tags

Using a unique identifier, such as the transaction ID, as a tracing tag might cause your system to run out of memory and your microservices to restart.

A separate line is written to the Prometheus output for each combination of tags and values. If you use a unique ID as a tracing tag, every request adds a new line to the output, so the output keeps growing and might eventually exhaust the available memory.
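
The difference in growth can be illustrated with a small Python sketch. It simulates only the counting, not the microservices themselves. The tag names and values (source, txn, portal, cli, batch) and the request count are hypothetical.

```python
def count_output_lines(trace_tag_values):
    """Count the distinct traceTags values seen across a set of requests.

    In this simplified model, each distinct value corresponds to one
    line in the Prometheus output for a given metric.
    """
    return len(set(trace_tag_values))


REQUESTS = 10_000
SOURCES = ("portal", "cli", "batch")  # hypothetical client names

# A tag with a small, fixed set of values.
bounded = [f"source:{SOURCES[i % len(SOURCES)]}" for i in range(REQUESTS)]

# A tag that carries a unique ID for every request.
unique = [f"txn:{i}" for i in range(REQUESTS)]

print(count_output_lines(bounded))  # 3
print(count_output_lines(unique))   # 10000
```

With the bounded tag, the number of lines stays at three no matter how many requests are made. With the unique ID, the number of lines equals the number of requests.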

Accessing the metrics

You can access the metrics through the Prometheus endpoint on each microservice. To categorize the metrics by tracing tag, you can use a tool such as the Grafana cluster monitoring tool. See Categorizing metrics.
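
As an alternative illustration, the following Python sketch groups metric values by one tracing tag, working on Prometheus output that has already been retrieved as text. It is a simplified parser, not a replacement for Grafana or a Prometheus client library. The sum_by_tag function and the sample values are hypothetical, and the sketch assumes that each sample is on one line and that label values contain no closing braces.

```python
import re
from collections import defaultdict

# Matches one sample line: metric_name{labels} value
_SAMPLE = re.compile(
    r"^(?P<name>[A-Za-z_:][A-Za-z0-9_:]*)\{(?P<labels>.*)\}\s+(?P<value>\S+)$"
)
_TRACE_TAGS = re.compile(r'traceTags="([^"]*)"')


def sum_by_tag(metrics_text, metric_name, tag_name):
    """Sum the values of one metric, grouped by the value of one tracing tag."""
    totals = defaultdict(float)
    for line in metrics_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        sample = _SAMPLE.match(line)
        if sample is None or sample.group("name") != metric_name:
            continue
        found = _TRACE_TAGS.search(sample.group("labels"))
        if found is None:
            continue  # the sample has no traceTags label
        for pair in found.group(1).split(","):
            name, _, value = pair.partition(":")
            if name == tag_name:
                totals[value] += float(sample.group("value"))
    return dict(totals)


# Hypothetical sample output, modeled on the example metric in this topic.
SAMPLE_TEXT = """
assembly_create_seconds_sum{application="Galileo",stage="total",traceTags="source:portal,reason:order",} 5.0
assembly_create_seconds_sum{application="Galileo",stage="total",traceTags="source:cli,reason:order",} 2.5
assembly_create_seconds_sum{application="Galileo",stage="total",traceTags="source:portal,reason:repair",} 1.5
"""

print(sum_by_tag(SAMPLE_TEXT, "assembly_create_seconds_sum", "source"))
```

For the sample text, the totals are 6.5 for the portal source and 2.5 for the cli source.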