Interpreting OpenTelemetry traces

OpenTelemetry traces can be used to identify the system from which a problem originates and might also indicate the cause of a problem.

Applies to zosConnect-3.0.

OpenTelemetry spans indicate how long particular operations take. For example, if one or more z/OS® Connect API requests have a degraded response time, you might observe one of the following:

  • Long durations for the request-mapping or response-mapping spans. This can occur with large JSON payloads, complex JSONata expressions, or many entries in the request or response mapping file.
  • A long duration in the span that invokes a system of record, an API endpoint, or an access token server. This usually indicates either a problem in the connected system or a problem with the network connection to that system.
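
As a rough illustration of the comparison described above, the following Python sketch ranks the child spans of one trace by duration so that the slowest operation stands out. The span dictionary layout (name, parent_id, start_ms, end_ms) and the span names are hypothetical; a real trace back end exports its own field names, so adapt the keys accordingly.

```python
from typing import Dict, List, Tuple


def rank_child_spans(spans: List[Dict]) -> List[Tuple[str, int]]:
    """Return (name, duration_ms) for every non-root span, longest first.

    The root span is skipped because it always covers the whole request.
    """
    ranked = [
        (span["name"], span["end_ms"] - span["start_ms"])
        for span in spans
        if span.get("parent_id") is not None
    ]
    return sorted(ranked, key=lambda item: item[1], reverse=True)


# Hypothetical spans for a single trace; names and timings are made up.
trace = [
    {"name": "server", "parent_id": None, "start_ms": 0, "end_ms": 950},
    {"name": "request-mapping", "parent_id": "1", "start_ms": 5, "end_ms": 25},
    {"name": "sor-invoke", "parent_id": "1", "start_ms": 30, "end_ms": 900},
    {"name": "response-mapping", "parent_id": "1", "start_ms": 905, "end_ms": 940},
]

for name, duration in rank_child_spans(trace):
    print(f"{name}: {duration} ms")
```

In this made-up trace the span that invokes the system of record dominates, which would point toward the connected system or the network rather than the mapping steps.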

Error in z/OS Connect

In an error scenario, fewer z/OS Connect spans might be emitted than in a working scenario because the request does not complete its entire journey through z/OS Connect. The span in error is marked with the error=true attribute, and the exception.message attribute contains details of the error. To investigate, review the error message in the span, then find the error in messages.log at the same timestamp. Other messages that are logged at a similar time might help to identify the problem.
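
The investigation steps above can be sketched in Python as follows. This is a minimal sketch under stated assumptions: spans are assumed to be dictionaries with an attributes mapping, and the messages.log entries are assumed to have already been parsed into (timestamp, text) pairs. Only the error and exception.message attribute names come from this topic; the span names, helper names, and message texts are placeholders.

```python
from datetime import datetime, timedelta
from typing import Dict, List, Tuple


def is_error(span: Dict) -> bool:
    """True when the span carries error=true (boolean or string form)."""
    value = span.get("attributes", {}).get("error")
    return str(value).lower() == "true"


def error_details(spans: List[Dict]) -> List[Tuple[str, str]]:
    """Return (span name, exception.message) for each span in error."""
    return [
        (span["name"], span["attributes"].get("exception.message", ""))
        for span in spans
        if is_error(span)
    ]


def log_entries_near(entries: List[Tuple[datetime, str]],
                     when: datetime,
                     window_seconds: int = 5) -> List[str]:
    """Return log texts whose timestamp is within the window around `when`."""
    window = timedelta(seconds=window_seconds)
    return [text for stamp, text in entries if abs(stamp - when) <= window]


# Hypothetical data: span names and message texts are placeholders.
spans = [
    {"name": "server", "start": datetime(2024, 1, 1, 12, 0, 0), "attributes": {}},
    {"name": "request-mapping", "start": datetime(2024, 1, 1, 12, 0, 1),
     "attributes": {"error": True,
                    "exception.message": "placeholder mapping failure text"}},
]
log_entries = [
    (datetime(2024, 1, 1, 11, 59, 0), "unrelated earlier message"),
    (datetime(2024, 1, 1, 12, 0, 1), "placeholder error message"),
    (datetime(2024, 1, 1, 12, 0, 3), "placeholder follow-up message"),
]

print(error_details(spans))
print(log_entries_near(log_entries, spans[1]["start"]))
```

A window of a few seconds is an arbitrary choice for illustration; widen it if related messages are logged further apart.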

Figure 1. Example span output showing an error in z/OS Connect request mapping
Image to show the impact of an error in z/OS Connect Request mapping with no spans created after the error

How to correlate an OpenTelemetry trace to an SMF 123 record

An OpenTelemetry trace can be correlated to a specific SMF 123 record by using the zos.smf_correlator attribute in the Server span. The value of the zos.smf_correlator attribute matches the value of the SMF123S1_TRACKING_TOKEN or SMF123S2_TRACKING_TOKEN field when the SMF records are formatted with the provided Sample JCL to format SMF records.
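
The matching rule can be illustrated with a short Python sketch. It assumes that the spans and the formatted SMF 123 records have already been loaded into dictionaries; that loading step, the record layout, the helper names, and the token values shown are all hypothetical. Only the attribute name zos.smf_correlator and the two tracking token field names are taken from this topic.

```python
from typing import Dict, List, Optional

# Field names as they appear in the formatted SMF 123 output.
TOKEN_FIELDS = ("SMF123S1_TRACKING_TOKEN", "SMF123S2_TRACKING_TOKEN")


def smf_correlator(spans: List[Dict]) -> Optional[str]:
    """Return the zos.smf_correlator value from the first span that has one."""
    for span in spans:
        token = span.get("attributes", {}).get("zos.smf_correlator")
        if token:
            return token
    return None


def matching_smf_records(spans: List[Dict], records: List[Dict]) -> List[Dict]:
    """Return the records whose tracking token equals the trace correlator."""
    token = smf_correlator(spans)
    if token is None:
        return []
    return [
        record for record in records
        if any(record.get(field) == token for field in TOKEN_FIELDS)
    ]


# Hypothetical data; the token values are placeholders, not real tokens.
spans = [
    {"name": "server", "attributes": {"zos.smf_correlator": "TOKEN-A"}},
    {"name": "request-mapping", "attributes": {}},
]
records = [
    {"SMF123S1_TRACKING_TOKEN": "TOKEN-B"},
    {"SMF123S1_TRACKING_TOKEN": "TOKEN-A"},
    {"SMF123S2_TRACKING_TOKEN": "TOKEN-C"},
]

print(matching_smf_records(spans, records))
```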