Available AI models
Learn about the AI models available for your agents, including regional availability and provider options.
watsonx Orchestrate supports both IBM-hosted and third-party AI models. Model availability varies by cloud provider, region, and deployment type (AWS, IBM Cloud, or AWS GovCloud (US)).
- This page covers AI models for AWS, IBM Cloud, and AWS GovCloud (US) deployments. For On-premises deployment information, see Foundation Models.
- Deprecated models remain functional until their removal date but may display removal notifications in the UI. Switch to supported models before removal to avoid disruption.
If you're using deprecated models, follow the steps in Migration guidance to migrate to available models.
Model availability by region
Most regions use the GPT-OSS 120B — OpenAI model as the default unless otherwise specified. You can extend the available models by connecting to custom models.
AWS deployments
| Region | Region code | Available models |
|---|---|---|
| N. Virginia | us-east-1 | GPT-OSS 120B (default) |
| Frankfurt | eu-central-1 | GPT-OSS 120B (default) |
| Singapore | ap-southeast-1 | GPT-OSS 120B (default) |
| Mumbai | ap-south-1 | GPT-OSS 120B (default) |
IBM Cloud deployments
| Region | Region code | Available models |
|---|---|---|
| Dallas | us-south | GPT-OSS 120B (default) |
| Toronto | ca-tor | GPT-OSS 120B (default) |
| London | eu-gb | GPT-OSS 120B (default) |
| Frankfurt | eu-de | GPT-OSS 120B (default) |
| Sydney | au-syd | GPT-OSS 120B (default) |
| Tokyo | jp-tok | GPT-OSS 120B (default) |
AWS GovCloud (US) deployments
AWS GovCloud (US) deployments use watsonx.ai models for specialized AI tasks, including document retrieval, content moderation, and time series analysis. AWS GovCloud (US) deployments are available in the US-East (us-gov-east-1) region.
| Model name | Purpose |
|---|---|
| ibm/granite-3-3-8b-instruct | General-purpose instruction following |
| ibm/granite-guardian-3-8b | Content moderation and safety |
| meta-llama/llama-4-maverick-17b-128e-instruct | Advanced reasoning with mixture-of-experts |
| meta-llama/llama-3-2-90b-vision-instruct | Multimodal (text + images) |
| meta-llama/llama-3-3-70b-instruct | High-performance text generation |
| ibm/slate-30m-english-rtrvr-v2 | Lightweight retrieval |
| ibm/slate-125m-english-rtrvr-v2 | Enhanced retrieval |
| intfloat/multilingual-e5-large | Multilingual embeddings |
| cross-encoder/ms-marco-minilm-l-12-v2 | Ranking and reranking |
| ibm/granite-embedding-278m-multilingual | Multilingual embeddings |
| ibm/granite-ttm-1024-96-r2 | Time series modeling |
| ibm/granite-ttm-1536-96-r2 | Time series modeling |
| ibm/granite-ttm-512-96-r2 | Time series modeling |
GPT-OSS 120B model
GPT-OSS 120B is a high-performance model optimized for speed, tool calling, and multilingual support. This model is available through two providers: GroqCloud (default) and AWS Bedrock.
- For the GPT-OSS 120B — OpenAI (via Groq) model, inference workloads route through GroqCloud LPU infrastructure for optimal performance. EU workloads are processed exclusively within EU data centers for GDPR compliance.
- This model is governed by a third-party license. By using this model, you agree to comply with the license terms. Read the terms.
- Customer data is never stored, accessed, or used for model training.
Model specifications
| Property | Details |
|---|---|
| Creator | OpenAI |
| Providers | GroqCloud (default), AWS Bedrock |
| Model ID | gpt-oss-120b |
| Modality | Text only |
| Context window | 131,072 tokens (input + output combined) |
| Supported languages | English (primary), multilingual support |
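Because the 131,072-token context window covers input and output combined, a long prompt reduces the space left for generation. A minimal sketch of that budgeting, using plain arithmetic; the 4-characters-per-token estimate is a rough heuristic for illustration, not the model's actual tokenizer:

```python
CONTEXT_WINDOW = 131_072  # GPT-OSS 120B: input + output share this budget


def remaining_output_budget(prompt: str, chars_per_token: int = 4) -> int:
    """Estimate how many output tokens remain after the prompt.

    chars_per_token is a rough heuristic, not the model's real tokenizer.
    """
    est_prompt_tokens = -(-len(prompt) // chars_per_token)  # ceiling division
    return max(CONTEXT_WINDOW - est_prompt_tokens, 0)


# A 400,000-character prompt (~100,000 estimated tokens) leaves ~31,072 tokens.
print(remaining_output_budget("x" * 400_000))  # → 31072
```

In practice, measure token counts with the provider's tokenizer rather than a character-based estimate.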
Provider selection
GroqCloud (default)
- Fastest response times with LPU infrastructure.
- EU workloads processed within EU data centers.
- Optimized for real-time interactions.
AWS Bedrock
- Alternative for AWS infrastructure requirements.
- Regional availability alignment with AWS services.
- Delivers comparable model capabilities and performance.
Results from the two providers might not be identical, because each provider has its own operational characteristics.
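The provider choice above reduces to a simple rule: default to GroqCloud unless your deployment must stay on AWS infrastructure. A sketch of that decision logic as described here; this is illustrative, not a product API:

```python
def choose_provider(requires_aws_infrastructure: bool) -> str:
    """Pick a GPT-OSS 120B provider per the guidance above."""
    # GroqCloud is the default and the fastest option (LPU-backed);
    # AWS Bedrock aligns with AWS regional and infrastructure requirements.
    return "AWS Bedrock" if requires_aws_infrastructure else "GroqCloud"


print(choose_provider(False))  # → GroqCloud
```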
Use cases
- Real-time chat and conversational AI
- High-volume automation workflows
- Intent detection and routing
- Domain classification
- Lightweight agents with tool usage
Key capabilities
- Speed: Fast response times with LPU infrastructure
- Accuracy: Strong routing and classification performance
- Cost efficiency: Optimized for large-scale deployments
- Flexibility: Two provider options for infrastructure needs
Add custom models
Extend available models by connecting your own providers through AI Gateway. For instructions, see Adding AI models through AI Gateway.
Migration guidance
If you're using deprecated models, follow these steps to migrate:
- Identify affected agents: Review which agents use deprecated models
- Test with GPT-OSS 120B: Evaluate performance with the recommended alternative
- Update agent instructions: GPT-OSS requires explicit instruction formatting (see GPT-OSS model behavior and instruction guidelines)
- Monitor performance: Compare response quality and latency
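The first step above amounts to scanning your agent inventory for deprecated model IDs. A hypothetical sketch, assuming agent definitions are available as dicts with `name` and `model` fields; the structure and the model IDs shown are illustrative, not the product's API or real deprecation list:

```python
# Hypothetical deprecated IDs; see Deprecated and withdrawn models for the real list.
DEPRECATED_MODELS = {"example/legacy-model-v1"}


def find_affected_agents(agents: list[dict]) -> list[str]:
    """Return names of agents configured with a deprecated model."""
    return [a["name"] for a in agents if a.get("model") in DEPRECATED_MODELS]


agents = [
    {"name": "support-router", "model": "example/legacy-model-v1"},
    {"name": "faq-agent", "model": "gpt-oss-120b"},
]
print(find_affected_agents(agents))  # → ['support-router']
```

Each agent this returns should then be retested against GPT-OSS 120B and have its instructions updated before the removal date.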
For detailed deprecation information, see Deprecated and withdrawn models.