Available AI models

Learn about the AI models available for your agents, including regional availability and provider options.

watsonx Orchestrate supports both IBM-hosted and third-party AI models. Model availability varies by cloud provider, region, and deployment type (AWS, IBM Cloud, or AWS GovCloud (US)).

Note:
  • This page covers AI models for AWS, IBM Cloud, and AWS GovCloud (US) deployments. For On-premises deployment information, see Foundation Models.
  • Deprecated models remain functional until their removal date but may display removal notifications in the UI. Switch to supported models before removal to avoid disruption.

    If you're using deprecated models, follow the steps in Migration guidance to migrate to available models.

Model availability by region

Unless otherwise specified, regions use the GPT-OSS 120B (OpenAI) model as the default. You can extend the available models by connecting to custom models.

AWS deployments

Region Region code Available models
N. Virginia us-east-1 GPT-OSS 120B (default)
Frankfurt eu-central-1 GPT-OSS 120B (default)
Singapore ap-southeast-1 GPT-OSS 120B (default)
Mumbai ap-south-1 GPT-OSS 120B (default)

IBM Cloud deployments

Region Region code Available models
Dallas us-south GPT-OSS 120B (default)
Toronto ca-tor GPT-OSS 120B (default)
London eu-gb GPT-OSS 120B (default)
Frankfurt eu-de GPT-OSS 120B (default)
Sydney au-syd GPT-OSS 120B (default)
Tokyo jp-tok GPT-OSS 120B (default)

AWS GovCloud (US) deployments

AWS GovCloud (US) deployments use watsonx.ai models for specialized AI tasks, including document retrieval, content moderation, and time series analysis. AWS GovCloud (US) deployments are available in the US-East (us-gov-east-1) region.

Important: GPT-OSS 120B is not available in AWS GovCloud (US) deployments.
Note: Configure watsonx.ai models for AWS GovCloud (US) deployments by using the ADK only. For instructions on importing models to your instance, see CLI reference.
Table 1. Available models from watsonx.ai
Model name Purpose
ibm/granite-3-3-8b-instruct General-purpose instruction following
ibm/granite-guardian-3-8b Content moderation and safety
meta-llama/llama-4-maverick-17b-128e-instruct Advanced reasoning with mixture-of-experts
meta-llama/llama-3-2-90b-vision-instruct Multimodal (text + images)
meta-llama/llama-3-3-70b-instruct High-performance text generation
ibm/slate-30m-english-rtrvr-v2 Lightweight retrieval
ibm/slate-125m-english-rtrvr-v2 Enhanced retrieval
intfloat/multilingual-e5-large Multilingual embeddings
cross-encoder/ms-marco-minilm-l-12-v2 Ranking and reranking
ibm/granite-embedding-278m-multilingual Multilingual embeddings
ibm/granite-ttm-1024-96-r2 Time series modeling
ibm/granite-ttm-1536-96-r2 Time series modeling
ibm/granite-ttm-512-96-r2 Time series modeling
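As an illustration, Table 1 can be treated as a purpose-to-model lookup when deciding which watsonx.ai model to import for a given task. The model IDs below restate the table; the mapping keys and the helper function are hypothetical and are not part of any watsonx Orchestrate API:

```python
# Illustrative purpose-to-model lookup built from Table 1.
# The purpose keys and the helper function are assumptions for this
# sketch; only the model IDs come from the table above.

MODELS_BY_PURPOSE = {
    "instruction": "ibm/granite-3-3-8b-instruct",
    "moderation": "ibm/granite-guardian-3-8b",
    "reasoning": "meta-llama/llama-4-maverick-17b-128e-instruct",
    "multimodal": "meta-llama/llama-3-2-90b-vision-instruct",
    "text-generation": "meta-llama/llama-3-3-70b-instruct",
    "retrieval-light": "ibm/slate-30m-english-rtrvr-v2",
    "retrieval-enhanced": "ibm/slate-125m-english-rtrvr-v2",
    "multilingual-embeddings": "ibm/granite-embedding-278m-multilingual",
    "reranking": "cross-encoder/ms-marco-minilm-l-12-v2",
    "time-series": "ibm/granite-ttm-512-96-r2",
}

def model_for(purpose: str) -> str:
    """Return the model ID for a task purpose, or raise if none is listed."""
    try:
        return MODELS_BY_PURPOSE[purpose]
    except KeyError:
        raise ValueError(f"No model listed for purpose: {purpose}")
```

A lookup such as `model_for("moderation")` then resolves to `ibm/granite-guardian-3-8b`, the content moderation model from the table.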

GPT-OSS 120B model

GPT-OSS 120B is a high-performance model optimized for speed, tool calling, and multilingual support. This model is available through two providers: GroqCloud (default) and AWS Bedrock.

Important:
  • For the GPT-OSS 120B — OpenAI (via Groq) model, inference workloads route through GroqCloud LPU infrastructure for optimal performance. EU workloads process exclusively within EU data centers for GDPR compliance.
  • This model is governed by a third-party license. By using the model, you agree to comply with the license terms. Read the terms.
  • Customer data is never stored, accessed, or used for model training.

Model specifications

Property Details
Creator OpenAI
Providers GroqCloud (default), AWS Bedrock
Model ID gpt-oss-120b
Modality Text only
Context window 131,072 tokens (input + output combined)
Supported languages English (primary), multilingual support

Provider selection

GroqCloud (default)

  • Fastest response times with LPU infrastructure.
  • EU workloads processed within EU data centers.
  • Optimized for real-time interactions.

AWS Bedrock

  • Alternative for AWS infrastructure requirements.
  • Regional availability alignment with AWS services.
  • Comparable model capabilities and performance.

Results from the two providers might not be identical because each provider has its own operational characteristics.

Use cases

  • Real-time chat and conversational AI
  • High-volume automation workflows
  • Intent detection and routing
  • Domain classification
  • Lightweight agents with tool usage

Key capabilities

  • Speed: Fast response times with LPU infrastructure
  • Accuracy: Strong routing and classification performance
  • Cost efficiency: Optimized for large-scale deployments
  • Flexibility: Two provider options for infrastructure needs

Note: GPT-OSS behaves differently from other models. Agent styles selected in the UI do not affect behavior. Define tone, structure, and response patterns explicitly in agent instructions. For details, see GPT-OSS model behavior and instruction guidelines.

Add custom models

Extend available models by connecting your own providers through AI Gateway. For instructions, see Adding AI models through AI Gateway.

Migration guidance

If you're using deprecated models, follow these steps to migrate:

  1. Identify affected agents: Review which agents use deprecated models
  2. Test with GPT-OSS 120B: Evaluate performance with the recommended alternative
  3. Update agent instructions: GPT-OSS requires explicit instruction formatting (see GPT-OSS model behavior and instruction guidelines)
  4. Monitor performance: Compare response quality and latency
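Step 4 (comparing response quality and latency) can be sketched with a small harness. The two model callables below are stubs standing in for real inference calls; nothing here uses an actual watsonx Orchestrate API:

```python
import time
from typing import Callable

def measure_latency(model: Callable[[str], str],
                    prompt: str, runs: int = 3) -> float:
    """Average wall-clock seconds per call over several runs."""
    start = time.perf_counter()
    for _ in range(runs):
        model(prompt)
    return (time.perf_counter() - start) / runs

def compare(old_model: Callable[[str], str],
            new_model: Callable[[str], str],
            prompts: list[str]) -> list[dict]:
    """Per-prompt latency for a deprecated model vs. its replacement."""
    return [
        {
            "prompt": p,
            "old_s": measure_latency(old_model, p),
            "new_s": measure_latency(new_model, p),
        }
        for p in prompts
    ]

# Stubs standing in for real inference calls against each model:
old = lambda p: "deprecated model answer"
new = lambda p: "GPT-OSS 120B answer"
results = compare(old, new, ["Classify this ticket", "Route this request"])
```

Swapping the stubs for real calls to the deprecated model and to GPT-OSS 120B gives a like-for-like latency baseline; response quality still needs side-by-side human or automated review.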

For detailed deprecation information, see Deprecated and withdrawn models.