IBM watsonx.ai | Pricing

watsonx.ai pricing

Explore the pricing tiers for the Free, Essentials, and Standard plans on IBM® watsonx.ai®. For model pricing, see the sections on IBM foundation models, embedding models, and third-party foundation models.

Foundation models from IBM

Includes pay-as-you-go pricing per million tokens and hourly rates for on-demand model hosting and deployment.


Embedding models

Includes IBM and third-party models available for USD 0.10 per million tokens.


Third-party foundation models

Includes third-party models from Meta, Google, DeepSeek, Mistral, and more, with pay-as-you-go pricing per million tokens and hourly options for on-demand hosting and deployment.


Use case specific pricing

Includes use case-based pricing for machine learning, text extraction, and model customization, with Essential and Standard package options.


Pricing plans — multiple payment options available

Free

Toolbox playground

Foundation Models: Up to 300,000 tokens per month

Machine Learning Tools: Up to 20 Compute Usage Hours (CUH) per month

Text Extraction: Up to 100 documents per month
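The Free plan limits above can be checked against a month's usage with a small helper. This is a minimal sketch: the metric keys (`tokens`, `cuh`, `documents`) and the function name are hypothetical, and only the three limits themselves come from this page.

```python
# Monthly Free plan limits as listed on this page.
FREE_LIMITS = {
    "tokens": 300_000,   # foundation model tokens
    "cuh": 20,           # machine learning CUH
    "documents": 100,    # text extraction documents
}


def over_free_limits(usage: dict) -> dict:
    """Return, per metric, how far the usage exceeds the Free plan limit.

    Metrics that are within their limit are left out of the result.
    """
    return {
        metric: usage.get(metric, 0) - limit
        for metric, limit in FREE_LIMITS.items()
        if usage.get(metric, 0) > limit
    }


# Hypothetical month: 250,000 tokens, 25 CUH, 100 documents.
print(over_free_limits({"tokens": 250_000, "cuh": 25, "documents": 100}))
# {'cuh': 5}
```

An empty result means the month's usage fits within the Free plan.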


Essentials (Pay-as-you-go)

Production deployments

Starting at USD 0/month*

Model price breakdown***

Feature-specific price breakdown**

Standard (Pay-as-you-go)

Enterprise production

Starting at USD 1,110/month*

Model price breakdown***

Feature-specific price breakdown**

Playground UI

Inferencing

Open source models

IBM watsonx® models

Work with foundation models (PromptLab)

Supports retrieval augmented generation (RAG)

Work with agents (AgentLab)

Synthetic data generator

ML functionality**

Text extraction**

LoRA/QLoRA Fine-tuning*

Custom foundation models***

Model hosting***

Deploy on-demand models***

Support

watsonx community and online chatbot

Basic support included: 24x7 access to tech support through cases

Options available

Advanced support with SLAs available starting at USD 200 per month

*Prices shown are indicative, may vary by country, exclude any applicable taxes and duties, and are subject to product offering availability in a locale.
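One way to compare the two paid plans for machine learning workloads is a break-even calculation. The sketch below uses the "starting at" prices from this section and the per-Capacity-Unit-Hour rates from the feature-specific pricing (USD 0.55 on Essentials, USD 0.45 on Standard). It assumes the "starting at" amounts behave like flat monthly base fees, which this page does not state, so treat the result as a rough illustration only.

```python
from decimal import Decimal

# Figures from this page. Treating the "starting at" prices as flat monthly
# base fees is an assumption made for this sketch.
ESSENTIALS_BASE = Decimal("0")
STANDARD_BASE = Decimal("1110")
ESSENTIALS_CUH_RATE = Decimal("0.55")  # USD per Capacity Unit-Hour
STANDARD_CUH_RATE = Decimal("0.45")    # USD per Capacity Unit-Hour


def monthly_ml_cost(cuh: int, base: Decimal, rate: Decimal) -> Decimal:
    """Base fee plus metered machine learning usage for one month."""
    return base + cuh * rate


def break_even_cuh() -> Decimal:
    """CUH per month at which both plans cost the same under the assumption."""
    return (STANDARD_BASE - ESSENTIALS_BASE) / (ESSENTIALS_CUH_RATE - STANDARD_CUH_RATE)


print(int(break_even_cuh()))  # 11100
```

Under that assumption, Standard only becomes cheaper for machine learning alone above 11,100 CUH per month; features that Essentials does not offer would change the comparison.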


IBM Foundation Models

Model name

Pay as you go (per million tokens)

Model hosting/Deploy on demand (per hour; prices based on GPU configuration)

granite-4-h-small

USD 0.06 per 1M tokens input / USD 0.25 per 1M tokens output

Not available

granite-vision-3-3-2b

Not available

granite-vision-3-2-2b¹

USD 0.10

Not available

granite-3-2b-instruct (v3.1)¹

USD 0.10

Not available

granite-guardian-3-2b (v3.1)¹ (Deprecated)

USD 0.10

Not available

granite-guardian-3-8b (v3.1)¹

USD 0.20

Not available

granite-timeseries-ttm-r2¹

USD 0.38

Not available

granite-13b-instruct¹ (Deprecated)

USD 0.60

Not available

granite-3-8b-instruct (v3.1)

USD 0.20

Not available

granite-8b-code-instruct

USD 0.20

granite-3-2-8b-instruct

USD 0.20

granite-3-1-8b-base

Not available

granite-20b-code-base-sql-gen¹

Not available

granite-20b-code-base-schema-linking¹

Not available

granite-3-8b-base¹

Not available

granite-7b-lab¹

Not available

granite-8b-japanese¹

Not available

granite-4-1-3b

Not available

granite-4-1-8b

Not available

granite-4-1-30b

Not available

granite-vision-4-1-4b

Not available

granite-4-h-small

USD 0.0636 per 1M tokens input

USD 0.265 per 1M tokens output

granite-4-h-tiny

Not available

granite-4-h-micro

Not available

granite-vision-3-3-2b

Not available

granite-3-3-8b-instruct

Not available

granite-3-3-2b-instruct

Not available

granite-3-1-8b-base

Not available

granite-3-8b-base

Not available

granite-7b-lab

Not available

granite-guardian-3-8b

USD 0.212

granite-timeseries-ttm-r2

USD 0.1378 per 1M data points input

USD 0.4028 per 1M data points output

granite-8b-code-instruct

USD 0.636

granite-8b-japanese

Not available

granite-20b-multilingual

Not available

granite-3b-code-instruct

Not available

granite-8b-code-instruct

Not available

granite-20b-code-instruct

Not available

granite-34b-code-instruct

Not available

granite-20b-code-base-schema-linking

Not available

granite-20b-code-base-sql-gen

Not available

granite-13b-chat-v2

Not available
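Where a model lists separate input and output rates, the cost of a workload is the sum of both sides. The following is a minimal sketch using the granite-4-h-small rates from the first listing above (USD 0.06 input and USD 0.25 output per 1M tokens); the function name and the token counts are hypothetical.

```python
from decimal import Decimal

TOKENS_PER_MILLION = Decimal(1_000_000)


def inference_cost(input_tokens: int, output_tokens: int,
                   input_rate: Decimal, output_rate: Decimal) -> Decimal:
    """Cost in USD, given rates quoted per 1M input and per 1M output tokens."""
    total = input_tokens * input_rate + output_tokens * output_rate
    return total / TOKENS_PER_MILLION


# Hypothetical workload: 2M input tokens and 500k output tokens.
cost = inference_cost(2_000_000, 500_000, Decimal("0.06"), Decimal("0.25"))
print(cost)
```

For this workload the total is USD 0.245: USD 0.12 for input plus USD 0.125 for output.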


Embedding models

All embedding models are USD 0.106 per million tokens. This includes IBM models (granite-embedding-278m-multilingual, slate-125m-english-rtrvr-v2, slate-30m-english-rtrvr-v2) and third-party models (all-mini-l6-v2 and multilingual-e5-large). The reranking model (ms-marco-minilm-l-12-v2) is USD 0.005 per million tokens.
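As a rough illustration of the flat embedding rate, the sketch below estimates the cost of embedding a document corpus once. The chunk count and average chunk size are hypothetical inputs; only the USD 0.106 and USD 0.005 rates come from the paragraph above.

```python
from decimal import Decimal

EMBEDDING_RATE = Decimal("0.106")  # USD per 1M tokens, all embedding models
RERANK_RATE = Decimal("0.005")     # USD per 1M tokens, ms-marco-minilm-l-12-v2
TOKENS_PER_MILLION = Decimal(1_000_000)


def token_cost(total_tokens: int, rate_per_million: Decimal) -> Decimal:
    """Cost in USD for a token volume at a per-1M-token rate."""
    return total_tokens * rate_per_million / TOKENS_PER_MILLION


def corpus_embedding_cost(num_chunks: int, avg_tokens_per_chunk: int) -> Decimal:
    """Estimated one-time cost of embedding every chunk in a corpus."""
    return token_cost(num_chunks * avg_tokens_per_chunk, EMBEDDING_RATE)


# Hypothetical corpus: 100,000 chunks averaging 500 tokens each (50M tokens).
print(corpus_embedding_cost(100_000, 500))
```

At these rates the hypothetical 50M-token corpus costs USD 5.30 to embed, while reranking the same token volume would cost USD 0.25.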

Third-party foundation models

Model name

Provider

Pay as you go (per million tokens)

Model hosting/Deploy on demand^ (per hour; prices based on GPU configuration)

llama-4-maverick-17b-128e-instruct-int4

Meta

Not available

llama-4-maverick-17b-128e-instruct-fp8

Meta

USD 0.35 per 1M tokens input

USD 1.40 per 1M tokens output

Not available

llama-3-2-1b-instruct

Meta

USD 0.10

Not available

llama-3-2-3b-instruct

Meta

USD 0.15

Not available

llama-3-2-90b-vision-instruct

Meta

USD 2.00

Not available

llama-3-405b-instruct

Meta

USD 5.00 per 1M tokens input

USD 16.00 per 1M tokens output

Not available

llama-guard-3-11b-vision

Meta

USD 0.35

Not available

mistral-medium-2505

Mistral AI

USD 3.00 per 1M tokens input

USD 10.00 per 1M tokens output

Not available

mistral-large-2² (Deprecated)

Mistral AI

USD 3.00 per 1M tokens input

USD 10.00 per 1M tokens output

Not available

mistral-small-3-1-24b-instruct-2503²

Mistral AI

USD 0.10 per 1M tokens input

USD 0.30 per 1M tokens output

Not available

pixtral-12b² (Deprecated)

Mistral AI

USD 0.35

Not available

llama-3-3-70b-instruct

Meta

USD 0.71

flan-t5-xl-3b (Deprecated)

Google

USD 0.60

allam-1-13b-instruct

SDAIA

USD 1.80

gpt-oss-120b

OpenAI

USD 0.15 per 1M tokens input

USD 0.60 per 1M tokens output

llama-3-2-11b-vision-instruct

Meta

USD 0.35

llama-3-13b-chat (Deprecated)

Meta

USD 0.0006 per 1,000 tokens for input and output

deepseek-r1-distill-llama-70b

DeepSeek

Not available

deepseek-r1-distill-llama-8b

DeepSeek

Not available

eurollm-1-7b-instruct

Utter Project

Not available

eurollm-9b-instruct

Utter Project

Not available

llama-2-70b-chat

Meta

Not available

llama-3-1-70b

Meta

Not available

llama-3-1-8b

Meta

Not available

llama-3-3-70b-instruct-hf

Meta

Not available

mistral-large-instruct-2411²

Mistral AI

Not available

mistral-nemo-instruct-2407²

Mistral AI

Not available

mixtral-8x7b-base²

Mistral AI

Not available

poro-34b-chat

LumiOpen

Not available

allam-1-13b-instruct

SDAIA

USD 1.908

codellama-34b-instruct-hf

Code Llama

Not available

codestral-2501

Mistral AI

Not available

deepseek-r1-distill-llama-70b

DeepSeek

Not available

deepseek-r1-distill-llama-8b

DeepSeek

Not available

eurollm-1-7b-instruct

Utter Project

Not available

eurollm-9b-instruct

Utter Project

Not available

gpt-oss-120b

OpenAI

USD 0.1590 per 1M tokens input

USD 0.636 per 1M tokens output

gpt-oss-20b

OpenAI

Not available

llama-2-70b-chat

Meta

Not available

llama-3-1-405b-instruct-fp8

Meta

Not available

llama-3-1-70b

Meta

Not available

llama-3-1-70b-gptq

Meta

Not available

llama-3-1-70b-instruct

Meta

Not available

llama-3-1-8b

Meta

Not available

llama-3-1-8b-instruct

Meta

Not available

llama-3-1-nemotron-ultra-253b-v1-fp8

NVIDIA

Not available

llama-3-2-11b-vision-instruct

Meta

USD 0.371

llama-3-2-90b-vision-instruct

Meta

Not available

llama-3-3-70b-instruct

Meta

USD 0.7526

llama-3-3-70b-instruct-hf

Meta

Not available

llama-3-70b-instruct

Meta

Not available

llama-3-8b-instruct

Meta

Not available

llama-4-maverick-17b-128e-instruct-fp8

Meta

USD 0.371 per 1M tokens input

USD 1.484 per 1M tokens output

llama-4-maverick-17b-128e-instruct-int4

Meta

Not available

llama-4-scout-17b-16e-instruct-fp8-dynamic

Meta

Not available

llama-guard-3-11b-vision

Meta

USD 0.371

ministral-3b-instruct-2512

Mistral AI

Not available

ministral-8b-instruct-2410

Mistral AI

Not available

ministral-8b-instruct-2512

Mistral AI

Not available

mistral-large-2512

Mistral AI

USD 0.636 per 1M tokens input

USD 1.908 per 1M tokens output

mistral-large-instruct-2407

Mistral AI

Not available

mistral-large-instruct-2411

Mistral AI

Not available

mistral-medium-2505

Mistral AI

USD 3.37 per 1M tokens input

USD 10.07 per 1M tokens output

mistral-medium-2508

Mistral AI

Not available

mistral-nemo-instruct-2407

Mistral AI

Not available

mistral-small-3-1-24b-instruct-2503

Mistral AI

USD 0.106 per 1M tokens input

USD 0.318 per 1M tokens output

mixtral-8x7b-base

Mistral AI

Not available

mixtral-8x7b-instruct-v01

Mistral AI

Not available

mt0-xxl-13b

BigScience

Not available

nvidia-nemotron-3-super-120b-a12b-fp8

NVIDIA

Not available

nvidia-nemotron-nano-12b-v2-vl-fp8

NVIDIA

Not available

pixtral-12b

Mistral AI

Not available

poro-34b-chat

LumiOpen

Not available
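The listings above mix units: most rates are quoted per 1M tokens, but llama-3-13b-chat is quoted per 1,000 tokens. Rescaling to a common unit makes rates comparable. This is a minimal sketch with a hypothetical helper name.

```python
from decimal import Decimal


def per_million(rate: Decimal, per_tokens: int) -> Decimal:
    """Rescale a price quoted per `per_tokens` tokens to a per-1M-token price."""
    return rate * Decimal(1_000_000) / Decimal(per_tokens)


# llama-3-13b-chat is listed at USD 0.0006 per 1,000 tokens.
llama_13b_rate = per_million(Decimal("0.0006"), 1_000)
print(llama_13b_rate)
```

The rescaled llama-3-13b-chat rate equals USD 0.60 per 1M tokens, which can then be compared directly with the other entries.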


Feature-specific pricing

Machine learning models

Essentials plan: 0.55 USD / Capacity Unit-Hour

Standard plan: 0.45 USD / Capacity Unit-Hour

Text extraction³

Essentials plan: 0.0403 USD / Page

Standard plan: 0.0318 USD / Page

LoRA fine-tuning

Essentials plan: Not available

Standard plan:

NVIDIA 1 x A100 GPU: 6.3 USD / Hour

NVIDIA 1 x H100 GPU: 14.85 USD / Hour

Model hosting/Deploy on demand

Essentials plan: Not available

Standard plan (per GPU configuration):

NVIDIA 1 x L40S GPU: 4.43 USD / Hour

NVIDIA 2 x L40S GPU: 8.86 USD / Hour

NVIDIA 1 x A100 GPU: 5.8 USD / Hour

NVIDIA 2 x A100 GPU: 11.6 USD / Hour

NVIDIA 4 x A100 GPU: 23.2 USD / Hour

NVIDIA 8 x A100 GPU: 46.4 USD / Hour

NVIDIA 1 x H100 GPU: 14.5 USD / Hour

NVIDIA 2 x H100 GPU: 29 USD / Hour

NVIDIA 4 x H100 GPU: 58 USD / Hour

NVIDIA 8 x H100 GPU: 116 USD / Hour

NVIDIA 1 x H200 GPU: 16 USD / Hour

NVIDIA 2 x H200 GPU: 32 USD / Hour

NVIDIA 4 x H200 GPU: 64 USD / Hour

NVIDIA 8 x H200 GPU: 128 USD / Hour
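Because hosting is billed per hour, a monthly estimate depends mostly on how long a deployment stays up. The sketch below looks up the listed model hosting rates by GPU type and count; the function name and the 730-hour month are illustrative assumptions, and the LoRA fine-tuning rates are deliberately left out.

```python
from decimal import Decimal

# Model hosting/Deploy on demand rates in USD per hour, keyed by
# (GPU type, GPU count), as listed above.
HOSTING_RATES = {
    ("L40S", 1): Decimal("4.43"), ("L40S", 2): Decimal("8.86"),
    ("A100", 1): Decimal("5.8"), ("A100", 2): Decimal("11.6"),
    ("A100", 4): Decimal("23.2"), ("A100", 8): Decimal("46.4"),
    ("H100", 1): Decimal("14.5"), ("H100", 2): Decimal("29"),
    ("H100", 4): Decimal("58"), ("H100", 8): Decimal("116"),
    ("H200", 1): Decimal("16"), ("H200", 2): Decimal("32"),
    ("H200", 4): Decimal("64"), ("H200", 8): Decimal("128"),
}


def hosting_cost(gpu: str, count: int, hours: int) -> Decimal:
    """Cost in USD of keeping one listed GPU configuration up for `hours`."""
    try:
        rate = HOSTING_RATES[(gpu, count)]
    except KeyError:
        raise ValueError(f"{count} x {gpu} is not a listed configuration") from None
    return rate * hours


# Hypothetical always-on deployment: 2 x H100 for a 730-hour month.
print(hosting_cost("H100", 2, 730))  # 21170
```

Every listed multi-GPU price is the single-GPU price multiplied by the GPU count, so the table could also be derived from four base rates; the explicit lookup is used here so that unlisted configurations are rejected.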



Footnotes

¹For foundation model inference, charges are based on a Resource Unit (RU) metric equivalent to 1000 tokens (including both input and output tokens). 

² Mistral commercial models have a GPU hosting fee and a model access fee. For more information, view the documentation.

* Prices shown are indicative, may vary by country, exclude any applicable taxes and duties, and are subject to product offering availability in a locale.

^ Capacity Unit Hour pricing depends on the environment and tools utilized within a billing month.

³ Unless otherwise specified under Software pricing, all features, capabilities, and potential updates refer exclusively to SaaS. IBM makes no representation that SaaS and software features and capabilities will be the same.
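Footnote ¹ defines a Resource Unit (RU) as 1,000 tokens, counting input and output together, so a rate quoted per 1M tokens corresponds to 1,000 RUs. The conversion can be sketched as follows; whether partial RUs are billed fractionally or rounded up is not stated on this page, and fractional billing is assumed here.

```python
from decimal import Decimal

TOKENS_PER_RU = 1000  # footnote 1: one Resource Unit equals 1,000 tokens


def resource_units(input_tokens: int, output_tokens: int) -> Decimal:
    """RUs consumed by a request; input and output tokens both count.

    Assumption: partial RUs are billed fractionally, not rounded up.
    """
    return Decimal(input_tokens + output_tokens) / TOKENS_PER_RU


def ru_cost(input_tokens: int, output_tokens: int,
            rate_per_million: Decimal) -> Decimal:
    """Cost in USD, converting a per-1M-token rate to a per-RU rate."""
    rate_per_ru = rate_per_million / 1000  # 1M tokens = 1,000 RUs
    return resource_units(input_tokens, output_tokens) * rate_per_ru


# Hypothetical request: 1,500 input and 500 output tokens at USD 0.10 per 1M.
print(resource_units(1_500, 500))
print(ru_cost(1_500, 500, Decimal("0.10")))
```

The hypothetical request consumes 2 RUs and costs USD 0.0002 at a USD 0.10 per 1M tokens rate.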