Monitoring agent analytics

Important: Monitoring agent analytics is currently not supported in on-premises deployment.

Agent analytics in watsonx Orchestrate provides built-in monitoring and observability. It enables builders to track, analyze, and optimize the performance of their AI-driven agents. Key features include the agent activity dashboard, individual agent metrics, trace tables, and debug-level trace details. These tools help you to detect issues, optimize performance, and troubleshoot runtime behavior effectively.

Key purposes and benefits

Agent analytics provides metrics such as total and failed messages, average latency, and trace granularity. You can use the metrics to:

  • Monitor agent performance and usage trends
  • Detect errors and anomalies early
  • Optimize speed and latency
  • Support detailed observability and troubleshooting
  • Provide a data-driven foundation for tuning and evolving agents

Use case example

A builder notices a spike in failed messages and increased latency on the agent analytics page. By tracing individual message logs, they identify a misconfigured API call that is causing timeouts. After they correct the configuration and redeploy the agent, the metrics show improved response times and a drop in failure rates, which confirms the fix and validates the analytics as a troubleshooting tool.

Accessing agent analytics

You can access the agent analytics page in two ways to monitor performance and gain insights:

Option 1: From the main menu

  1. From the main menu, click Analyze to open the analytics page.

Option 2: From chat settings

  1. From the main menu, click Chat.
  2. Select Manage agents.
  3. On the Build agents and tools page, click View all under the dashboard.

Figure 1. Agent analytics dashboard on the Build agents and tools page.

After you click View all, you are redirected to the Agent analytics page.

Figure 2. Agent analytics page.

The next sections explain each component in detail.

Understanding the analytics dashboard

The dashboard displays key metrics that summarize overall agent performance:

  • Total messages: The total number of messages processed, including successful and failed responses.
  • Failed messages: The number of messages that returned an error, such as a timeout or 500-level response.
  • Latency average: The average time each agent takes to process a message. High latency might be a signal of configuration or model issues.

These metrics help you to:

  • Detect spikes in failed messages that might indicate errors or broken logic.
  • Monitor response times to identify performance slowdowns.
  • Track message volume to confirm that the agents are active and functioning.

These metrics cover all agents and include messages that were run in both the draft and live environments.
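
To make the relationship between the three headline numbers concrete, the following Python sketch computes the same figures from a handful of hypothetical per-message records. The record layout (`status`, `latency_ms`) and the values are assumptions for illustration only, not a data export that watsonx Orchestrate provides.

```python
# Minimal sketch: how the dashboard's headline metrics relate to per-message data.
# The record layout below is hypothetical, not a documented export format.

messages = [
    {"status": "Success", "latency_ms": 820},
    {"status": "Error", "latency_ms": 30000},  # for example, a timeout or 500-level response
    {"status": "Success", "latency_ms": 1150},
]

total_messages = len(messages)
failed_messages = sum(1 for m in messages if m["status"] == "Error")
latency_average_ms = (
    sum(m["latency_ms"] for m in messages) / total_messages if total_messages else 0.0
)

print(f"Total messages:  {total_messages}")
print(f"Failed messages: {failed_messages}")
print(f"Latency average: {latency_average_ms:.0f} ms")
```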

Viewing individual agent analytics

Below the dashboard, the agent analytics page lists all agents with their individual performance metrics. This view helps you identify which agents are reliable and which need attention.

The table includes:

  • Name: The agent name.
  • Description: A short description of the agent’s purpose.
  • Messages: The total number of processed messages.
  • Failed messages: The number of messages that resulted in errors.
  • Latency avg: The average message processing time.

These metrics support data-driven decisions about where to optimize or troubleshoot.

Use this data to:

  • Compare message volume across agents to understand usage trends.
  • Identify agents with high failure rates or long response times.
  • Confirm whether agent changes improve or degrade performance.
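
If you copy the per-agent figures out of the table, a short script can rank the agents for you. The following Python sketch shows one way to surface the agents that need attention first; the agent names and numbers are invented for illustration.

```python
# Hypothetical per-agent figures copied from the individual agent analytics table.
agents = [
    {"name": "hr_agent", "messages": 1200, "failed": 6, "latency_avg_ms": 900},
    {"name": "billing_agent", "messages": 340, "failed": 41, "latency_avg_ms": 4200},
    {"name": "faq_agent", "messages": 75, "failed": 0, "latency_avg_ms": 650},
]

def failure_rate(agent):
    """Share of processed messages that resulted in errors."""
    return agent["failed"] / agent["messages"] if agent["messages"] else 0.0

# Rank by failure rate, then by average latency, to see which agents need attention first.
for agent in sorted(agents, key=lambda a: (failure_rate(a), a["latency_avg_ms"]), reverse=True):
    print(f'{agent["name"]}: {failure_rate(agent):.1%} failed, {agent["latency_avg_ms"]} ms average latency')
```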

Viewing agent traces

Builders can view the entire conversation history to understand what the user and the agent said at each step. To see how an agent processes individual messages, click the agent in the Name column. You are redirected to the Traces table.

Figure 3. Trace table on the agent analytics page.

Each row includes:

  • Timestamp: The time the message was processed.
  • Trace ID: A unique ID for the message.
  • Status: Indicates whether the response was "Success" or "Error".
  • Model: The large language model used to generate the response.
  • Latency: Total processing time for the message.

The trace table gives you visibility into each message’s lifecycle, so you can spot patterns and narrow down issues at the message level. It helps you to:

  • Identify specific messages that failed and understand why.
  • Monitor how different models or prompts affect latency.
  • Validate expected behavior for test cases or changes.
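
When the trace table grows long, it can help to capture the rows and group them offline. The Python sketch below assumes you noted the trace rows as simple records (an assumption for illustration, not an export format) and shows one way to isolate failures and compare average latency by model.

```python
from collections import defaultdict

# Hypothetical rows captured from the Traces table.
traces = [
    {"timestamp": "2024-05-01T10:02:11Z", "trace_id": "a1b2", "status": "Success",
     "model": "granite-3-8b-instruct", "latency_ms": 780},
    {"timestamp": "2024-05-01T10:03:40Z", "trace_id": "c3d4", "status": "Error",
     "model": "granite-3-8b-instruct", "latency_ms": 30000},
    {"timestamp": "2024-05-01T10:05:02Z", "trace_id": "e5f6", "status": "Success",
     "model": "llama-3-70b-instruct", "latency_ms": 2100},
]

# Failed messages to investigate first.
for t in (t for t in traces if t["status"] == "Error"):
    print(f'Failed: {t["trace_id"]} at {t["timestamp"]} ({t["latency_ms"]} ms)')

# Average latency per model, to see how model choice affects response time.
by_model = defaultdict(list)
for t in traces:
    by_model[t["model"]].append(t["latency_ms"])
for model, latencies in by_model.items():
    print(f"{model}: {sum(latencies) / len(latencies):.0f} ms average over {len(latencies)} messages")
```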

Trace details

To view the full trace details, click a message status in the Traces table. Another window opens with a detailed trace of the message.

Figure 4. Trace details of the message.

This detailed view helps you to:

  • Trace the exact runtime path that the agent followed.
  • Confirm which model and prompt were used.
  • Understand what inputs, outputs, and results were passed between steps.

Use this information to evaluate whether the correct knowledge source was accessed, whether the response made sense, and whether performance was acceptable. It is especially useful for investigating unexpected or inconsistent behavior.

Debug knowledge runtime issues

The debug-level trace details provide deep visibility into the agent’s knowledge workflow. These details help you understand what happened during message processing and diagnose issues such as missing, incorrect, or incomplete responses.

Use trace data to:

  • Review the request and response during the retrieval phase
  • Examine the request and response during the answer generation phase
  • Assess the post-processing status and related metrics

This visibility supports accurate knowledge usage and faster issue resolution.
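
As one way to organize that review, the following Python sketch triages a knowledge debug payload by checking the retrieval results and the generated answer. The payload keys mirror the field names listed in the next sections, but the exact structure is an assumption for illustration, not a documented schema.

```python
def diagnose_knowledge_trace(debug: dict) -> str:
    """Rough triage of a knowledge (RAG) debug trace.

    Assumes a payload shaped like the debug fields described below:
    debug["callout"]["llm"]["request"]["search_results"] and
    debug["callout"]["llm"]["response"]["is_idk_response"].
    This shape is an assumption, not a documented schema.
    """
    llm = debug.get("callout", {}).get("llm", {})
    search_results = llm.get("request", {}).get("search_results", [])
    response = llm.get("response", {})

    if not search_results:
        return "Retrieval returned no documents: check the search index, query, and connection."
    if response.get("is_idk_response"):
        return ("Documents were retrieved but the model answered 'I don't know': "
                "review the retrieved passages and the wording of the question.")
    return "Retrieval and answer generation both produced output: review the response text for correctness."
```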

Accessing debug-level trace details

For an agent with knowledge enabled:

  1. In Service & Operation, go to wxo-server tools.task > Tags.
  2. Scroll to traceloop.entity.output.
  3. Expand artifact > debug.

Figure 5. Debug-level trace fields for runtime troubleshooting.

Key runtime fields

The debug view includes several categories:

  1. Callout

    • llm
      • generated_token_count: The number of tokens in the generated response.
      • input_token_count: The number of tokens in the input message.
      • model_id: The identifier for the model used.
      • request
        • search_results: The results returned from the knowledge search step.
      • response: The final output that is sent to the user.
        • is_idk_response: Indicates whether the assistant replied with an “I don’t know” answer.
        • response_type: The type of message returned, such as text or option.
        • text: The response message that is shown to the user.
      • success: Indicates whether the response was delivered successfully.
  2. Search

    • engine: The runtime search engine used.
    • index: The index targeted in the search.
    • query: The search query issued at run time.
    • request: The search request that is sent to the external service.
      • body: The full content of the request, such as the query and any filters.
      • method: The HTTP method used, such as POST or GET.
      • path: The endpoint path for the request.
      • port: The network port used in the request.
      • url: The full URL of the external service endpoint.
    • response: The output returned by the external service after the request is sent.
      • body: The full content returned in the response, such as retrieved documents or data.
  3. Metrics

    • answer_generation_time_ms: The time taken to generate the response.
    • search_time_ms: The time spent doing the knowledge search.
    • total_time_ms: The total processing time for the full RAG pipeline.
  4. Other

    • is_multi_turn: Indicates whether the previous turn's context is used in generating the current turn's answer.

These details help you pinpoint latency issues, troubleshoot failed searches, and validate model inputs.
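
For example, the timing fields under Metrics let you break the total processing time into its retrieval and answer-generation parts. The following Python sketch uses invented values together with the field names from the listing above; treat the structure as an assumption rather than a documented schema.

```python
# Hypothetical metrics block, using the field names from the listing above.
metrics = {
    "search_time_ms": 420,
    "answer_generation_time_ms": 1830,
    "total_time_ms": 2390,
}

search = metrics["search_time_ms"]
generation = metrics["answer_generation_time_ms"]
total = metrics["total_time_ms"]
other = total - search - generation  # post-processing and overhead

for label, value in [("Search", search), ("Answer generation", generation), ("Other", other)]:
    print(f"{label}: {value} ms ({value / total:.0%} of total)")
```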