Known issues and limitations

Get a quick overview of the known issues and limitations for IBM® watsonx Assistant for Z. As watsonx Assistant for Z is powered by watsonx Orchestrate, this section includes links to known issues of the underlying technology.

Note: For known issues and limitations specific to IBM Software Hub, see the IBM Software Hub documentation.

zRAG agent fails to access watsonx Orchestrate credentials and filters

When you deploy the zRAG agent in watsonx Orchestrate, it fails to access connection credentials or dynamically update filters at runtime.

Topology service requests fail with 403 when z/OSMF CSRF protection is enabled

Topology service requests to z/OSMF fail with an HTTP 403 response when z/OSMF Cross-Site Request Forgery (CSRF) protection is enabled, which is the default configuration. The requests do not include the required CSRF headers (X-CSRF-ZOSMF-HEADER or Referer), so z/OSMF rejects them with an authorization failure. As a result, topology service interactions currently succeed only when CSRF protection is disabled.
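For reference, a request that passes z/OSMF CSRF protection carries one of the headers named above. The following minimal Python sketch builds such a header set. The function name, the use of basic authentication, and the header value "true" are assumptions made for illustration; the sketch does not change how the topology service behaves.

```python
import base64


def build_zosmf_headers(user: str, password: str) -> dict:
    """Return HTTP headers for a z/OSMF REST request with the CSRF header set."""
    # Basic authentication is assumed here for illustration only.
    credentials = f"{user}:{password}".encode("utf-8")
    token = base64.b64encode(credentials).decode("ascii")
    return {
        "Authorization": f"Basic {token}",
        # Custom header checked by z/OSMF CSRF protection; the value "true"
        # is an arbitrary placeholder chosen for this sketch.
        "X-CSRF-ZOSMF-HEADER": "true",
        "Accept": "application/json",
    }


# Hypothetical credentials, used only to show the resulting header names.
headers = build_zosmf_headers("ibmuser", "secret")
print(sorted(headers))
```

A client that sends these headers with each z/OSMF request would not be rejected for a missing CSRF header, which is the condition that the topology service currently does not meet.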

Incomplete responses in multi-turn conversations

Multi‑turn conversations in watsonx Assistant for Z are natural, back‑and‑forth interactions where the assistant collects information step by step, instead of gathering all details in a single form.

During multi-turn conversations with the zRAG agent, users may see the message: “This message was not completed. Please try again.” This typically occurs on the Assistant interface after multiple follow-up queries.

This behavior is due to a current limitation of the Granite 3.3 model when handling extended multi-turn interactions. As the conversation progresses, the model may invoke tools inconsistently, especially tools that require structured inputs. From the third query onward, this can result in skipped or incomplete tool calls.

Because this limitation is inherent to the model’s instruction-following and context-resolution behavior within the existing controller flow, it cannot be reliably resolved through prompt improvements alone.

Workaround:

Start a new conversation session in the Assistant UI and continue your interaction there.

Token limit for conversation context

When the total conversation context exceeds approximately 32,000 tokens (the model maximum is 32,768), watsonx Assistant for Z returns an HTTP 400 (Bad Request) error with the following message: “Input Tokens Exceed Model Maximum (32768)”.

This limitation is observed in environments that use Spyre or GPU accelerators with the granite‑3.3‑8b‑instruct model.

Workaround:
When the token limit is exceeded, the current conversation cannot recover because the accumulated message history continues to exceed the maximum context length. To continue working:
  • Close the current chat and start a new one.
  • Limit each session to one or two job investigations to keep the conversation context below the token limit.
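To judge when a session is approaching the limit, a client-side estimate can help. The following minimal sketch assumes roughly four characters per token, which is a common rule of thumb and not the tokenizer of the granite-3.3-8b-instruct model. The function names, the 80% safety margin, and the sample job name are hypothetical.

```python
MODEL_MAX_TOKENS = 32768  # limit quoted in the error message
CHARS_PER_TOKEN = 4       # rough heuristic, not the model's tokenizer


def estimate_tokens(messages):
    """Roughly estimate the token count of a list of message strings."""
    total_chars = sum(len(message) for message in messages)
    return total_chars // CHARS_PER_TOKEN


def should_start_new_session(messages, safety_margin=0.8):
    """Return True once the estimate reaches the chosen share of the limit."""
    return estimate_tokens(messages) >= MODEL_MAX_TOKENS * safety_margin


# Hypothetical history: one short question plus a long pasted job log.
history = ["Why did job PAYROLL1 abend?", "x" * 120_000]
print(should_start_new_session(history))
```

Because the estimate is approximate, the margin is deliberately conservative; the HTTP 400 response remains the authoritative signal that the limit has been reached.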

Content ingestion

The following are the recommended limits for ingesting files.

Component                          Limitation
File size                          20 MB per file
File types for remote S3 source    PDF, HTML, DOCX, CSV, XLS, XLSX, and PPTX
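
The limits above can be checked before a file is submitted for ingestion. The following minimal sketch is not part of the product; the function name is hypothetical, 1 MB is assumed to mean 1024 × 1024 bytes, and the file-type list is the one documented for the remote S3 source.

```python
from pathlib import PurePath

MAX_FILE_SIZE_BYTES = 20 * 1024 * 1024  # assumes 1 MB = 1024 * 1024 bytes
ALLOWED_EXTENSIONS = {".pdf", ".html", ".docx", ".csv", ".xls", ".xlsx", ".pptx"}


def check_ingestible(filename: str, size_bytes: int) -> list:
    """Return a list of problems; an empty list means the file is within limits."""
    problems = []
    extension = PurePath(filename).suffix.lower()
    if extension not in ALLOWED_EXTENSIONS:
        problems.append(f"unsupported file type: {extension or '(none)'}")
    if size_bytes > MAX_FILE_SIZE_BYTES:
        problems.append(f"file exceeds 20 MB: {size_bytes} bytes")
    return problems


# Hypothetical file: a 25 MB text file fails both checks.
print(check_ingestible("notes.txt", 25 * 1024 * 1024))
```

Returning a list of problems instead of a single boolean lets a caller report every violated limit at once.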

High ingestion time for large files (≥20 MB)

Content ingestion for large files (≥20 MB) can take significantly longer than expected. In some cases, ingestion takes more than 5 hours to complete.

Although the ingestion process eventually completes successfully, the prolonged duration falls outside acceptable performance limits and indicates a performance limitation within the ingestion workflow.

401 Unauthorized error in provider or content-ingestion pods

In some cases, the provider pod in the shared namespace (wxa4z-zad) or the content-ingestion pod in the tenant namespace may encounter a 401 Unauthorized error. This error typically indicates a problem with the authentication credentials that the pods use.

Workaround:

To resolve this issue, do the following:
  1. Delete the wxa4z-ingestion-credentials secret in the secrets section.

    The operator recreates it automatically.

  2. Delete the affected pod, either the provider pod in the shared namespace or the content-ingestion pod in the tenant namespace, depending on where the issue occurs.
  3. Allow the pod to restart.

    The new pod starts without issues.
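
The recovery steps for the 401 Unauthorized error (delete the wxa4z-ingestion-credentials secret, then delete the affected pod) can also be scripted. The following minimal sketch assumes that kubectl is available and logged in to the cluster; the function names and the namespace and pod placeholders are hypothetical, and the namespace that holds the secret is not stated here, so it is passed in as a parameter.

```python
import subprocess

SECRET_NAME = "wxa4z-ingestion-credentials"


def build_recovery_commands(secret_namespace: str, pod_name: str, pod_namespace: str) -> list:
    """Build the kubectl commands for the two deletion steps."""
    return [
        # Step 1: delete the secret; the operator recreates it automatically.
        ["kubectl", "delete", "secret", SECRET_NAME, "-n", secret_namespace],
        # Step 2: delete the affected pod so that it restarts.
        ["kubectl", "delete", "pod", pod_name, "-n", pod_namespace],
    ]


def run_recovery(commands: list, dry_run: bool = True) -> None:
    """Print each command and, unless dry_run is set, execute it."""
    for command in commands:
        print(" ".join(command))
        if not dry_run:
            subprocess.run(command, check=True)


# Placeholders only; replace them with the values from your cluster.
commands = build_recovery_commands("<secret-namespace>", "<affected-pod>", "<pod-namespace>")
run_recovery(commands)  # dry run: prints the commands without executing them
```

The dry-run default is a deliberate choice so that the commands can be reviewed before anything is deleted.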

Agent fails to start while waiting for wxa4z-client-ingestion

Agents that use the Content Ingestion capability may get stuck during startup after install or upgrade. The agent blocks during startup while waiting for the wxa4z-client-ingestion service to respond; if the ingestion service is not ready or is in an inconsistent state, the agent remains stuck and does not complete startup.

In this state, the agent pod appears in a Running state, but the application server does not start. The agent is not reachable from AI Assistant, and pod logs show ingestion starting but not completing.
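
The behavior described above amounts to an unbounded wait on a dependency. As a diagnostic aid only, the following minimal sketch shows a bounded polling loop that reports whether a dependency became ready within a time limit. The probe callable is hypothetical; how to query the readiness of wxa4z-client-ingestion is not documented here, so the probe is left for the caller to supply. This is not how the agent is implemented.

```python
import time


def wait_until_ready(probe, timeout_seconds: float, interval_seconds: float = 1.0) -> bool:
    """Call probe() until it returns True or the timeout elapses."""
    deadline = time.monotonic() + timeout_seconds
    while True:
        if probe():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval_seconds)


# Stub probe that reports ready on the third attempt.
attempts = iter([False, False, True])
print(wait_until_ready(lambda: next(attempts), timeout_seconds=5, interval_seconds=0.01))
```

A False result after the timeout points to the ingestion service rather than the agent as the component to investigate.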

zChatOps MCP server

When you deploy the zchatops-mcp-server pod, it initially appears in the CreateContainerConfigError state because the required secrets are not yet present. See Deploying zChatOps MCP Server on your cluster.

zRAG Agent bootstrap job fails during deployment

When deploying the zRAG agent Helm chart, the bootstrap job fails, and the agent is not imported into watsonx Orchestrate. In the bootstrap job logs (and WxO MCP logs), you may see errors such as:

Traceback (most recent call last):
  File "/app/server-lite.py", line 1127, in execute_remote_mcp
    raise HTTPException(status_code=500, detail=f"TaskGroup error occurred: {str(error_message)}")
fastapi.exceptions.HTTPException: 500: TaskGroup error occurred: All connection attempts failed

Impact:
  • The bootstrap job does not complete successfully.
  • The zRAG agent is not created or imported into watsonx Orchestrate, even though Kubernetes resources may be deployed.
  • Any functionality that relies on the auto-imported agent will not work until the agent is manually created.

Workaround:

Turn off the bootstrap job in the Helm values. Then, manually add the zRAG agent through the watsonx Orchestrate UI. See zRAG Agent bootstrap job fails during deployment - Workaround for detailed instructions.