Now GA: Monitor agents at runtime with watsonx Orchestrate

This release brings runtime visibility to your live agents, so you can confidently scale agentic automation across your enterprise.

Published 11 December 2025
By Suzanne Livingston

AI agents are moving from prototypes to live production workloads. But unlike traditional apps, agents are non-deterministic, multi-step and tool-driven, so “uptime” alone doesn’t tell you whether they’re behaving reliably, safely or cost-effectively.

That’s why we’re excited to announce the general availability (GA) of agent monitoring in IBM watsonx Orchestrate, powered by integration with IBM watsonx.governance.

What’s new with agent monitoring

Agent monitoring in watsonx Orchestrate allows builders to understand how users are interacting with the agent and how the agent is performing. At the center is the ability to monitor agents at runtime through an intuitive dashboard that surfaces the metrics that matter most once agents go live.

With monitoring, teams can:

  • See real-time agent performance at a glance across individual agents or your entire agent catalog.
  • Track KPIs such as success rate, latency, token usage and cost drivers.
  • Drill into transaction-level detail to understand exactly what happened in a specific user-agent interaction.
  • Detect issues early using trace-backed signals that highlight where flows break, tools misfire or performance degrades.
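To make these KPIs concrete, here is a minimal sketch of how success rate, p95 latency, token usage and estimated cost could be computed from a batch of interaction records. It does not use the watsonx Orchestrate API. The Interaction schema, the agent name and the per-token prices are all assumptions made for illustration.

```python
from dataclasses import dataclass
from math import ceil


@dataclass
class Interaction:
    """One user-agent interaction (hypothetical schema, not the product's data model)."""
    agent: str
    succeeded: bool
    latency_ms: float
    input_tokens: int
    output_tokens: int


# Assumed prices in USD per 1,000 tokens; substitute your model's real rates.
PRICE_PER_1K_INPUT = 0.50
PRICE_PER_1K_OUTPUT = 1.50


def percentile(values, pct):
    """Nearest-rank percentile of a non-empty list."""
    ordered = sorted(values)
    rank = max(1, ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


def summarize(interactions):
    """Compute KPIs for a non-empty list of interactions."""
    n = len(interactions)
    input_tokens = sum(i.input_tokens for i in interactions)
    output_tokens = sum(i.output_tokens for i in interactions)
    cost = (input_tokens * PRICE_PER_1K_INPUT
            + output_tokens * PRICE_PER_1K_OUTPUT) / 1000
    return {
        "success_rate": sum(i.succeeded for i in interactions) / n,
        "p95_latency_ms": percentile([i.latency_ms for i in interactions], 95),
        "total_tokens": input_tokens + output_tokens,
        "estimated_cost_usd": cost,
    }


# Illustrative sample data, not real measurements.
sample = [
    Interaction("hr-helper", True, 800.0, 1000, 200),
    Interaction("hr-helper", True, 1200.0, 2000, 400),
    Interaction("hr-helper", False, 4000.0, 3000, 600),
    Interaction("hr-helper", True, 1000.0, 2000, 800),
]
kpis = summarize(sample)
print(kpis)
```

In this sample the single failed interaction is also the slowest one, which is the kind of correlation a transaction-level drill-down is meant to surface.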

Why monitoring matters for agents

Traditional monitoring assumes deterministic systems. Agents are different in 3 ways:

  • They chain LLM calls, retrieval steps and tools, so errors can appear anywhere in the reasoning path.
  • Their output quality can drift over time as data, prompts or underlying models change.
  • Their costs can spike unexpectedly depending on conversation paths and tool usage.

Production monitoring gives you “runtime truth,” not just uptime. You can verify that agents remain accurate, safe, and efficient while they’re serving real users, and you can respond quickly when they don’t.

How runtime monitoring works

Once an agent is deployed to production, watsonx Orchestrate automatically captures rich telemetry from every interaction. The runtime monitoring UI organizes that data into 3 layers:

  1. Agent-level health: A summary view across deployed agents helps you quickly spot which agents are thriving and which need attention—based on usage volume, success/failure rates and latency patterns.
  2. Message- and tool-level insights: Time-series charts show how performance evolves across turns and tool calls—so you can pinpoint whether slowdowns or failures come from the LLM, retrieval, or a downstream system.
  3. Conversation-level insights: See how agents perform across full multi-turn conversations—not just single interactions. Aggregate views highlight where users drop off and which turns drive failures or escalations. This helps teams spot recurring friction patterns, measure end-to-end success and prioritize improvements that raise real-world task completion.
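As an illustration of the conversation-level layer, the sketch below groups per-turn records into conversations and reports how many conversations finished without a problem and at which turn the first failure or escalation occurred. The record layout and the status labels ("ok", "failed", "escalated") are hypothetical. The actual telemetry format captured by watsonx Orchestrate is not described here.

```python
from collections import Counter, defaultdict

# Hypothetical turn records: (conversation_id, turn_index, status).
turns = [
    ("c1", 1, "ok"), ("c1", 2, "ok"), ("c1", 3, "ok"),
    ("c2", 1, "ok"), ("c2", 2, "failed"),
    ("c3", 1, "ok"), ("c3", 2, "escalated"),
    ("c4", 1, "ok"), ("c4", 2, "ok"),
]


def conversation_report(turns):
    """Summarize end-to-end outcomes for a non-empty list of turn records."""
    by_conversation = defaultdict(list)
    for conversation_id, turn_index, status in turns:
        by_conversation[conversation_id].append((turn_index, status))

    completed = 0
    # Turn index -> number of conversations whose first problem occurred there.
    problem_turns = Counter()
    for conversation_turns in by_conversation.values():
        conversation_turns.sort()
        bad = [idx for idx, status in conversation_turns if status != "ok"]
        if bad:
            problem_turns[bad[0]] += 1
        else:
            completed += 1

    return {
        "conversations": len(by_conversation),
        "completion_rate": completed / len(by_conversation),
        "problem_turns": dict(problem_turns),
    }


report = conversation_report(turns)
print(report)
```

Counting only the first problem turn per conversation keeps the aggregate focused on where things initially go wrong, rather than on follow-on errors later in the same conversation.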

Agent Ops to manage the full agent lifecycle

Runtime monitoring is a key piece of our Agent Ops strategy and a critical part of the agent development lifecycle. Agent Ops brings the ability to observe, evaluate and optimize secure, compliant agents—built anywhere, running everywhere.

  • Observability: Customers need clear visibility into how their agents behave (what steps are executed, which tools are invoked and what responses are produced), both during development and in production.
  • Evaluation: Pre-production stress testing is still manual and brittle. Teams need an easy, automated way to test their agents against a variety of use cases and across an array of metrics, from faithfulness to tool-call relevancy to token count.
  • Optimization: Organizations struggle to turn all this data into insight. To scale at speed, they need AI-driven issue detection, root-cause analysis and actionable recommendations to improve agent performance and cost efficiency.

Get started

Scale agentic automation across your enterprise confidently and securely with monitoring for your agents in production.

