What's new and changed in watsonx.ai
IBM watsonx.ai updates can include new features, bug fixes, and security updates. Releases are listed in reverse chronological order so that the latest release is at the beginning of the topic.
You can see a list of the new features for the platform and all of the services at What's new in IBM Cloud Pak for Data.
Installing or upgrading watsonx.ai
Ready to install or upgrade watsonx.ai?
- To install watsonx.ai along with the other Cloud Pak for Data services, see Installing Cloud Pak for Data.
- To upgrade watsonx.ai along with the other Cloud Pak for Data services, see Upgrading Cloud Pak for Data.
- To install or upgrade watsonx.ai independently, see watsonx.ai.

Remember: All of the Cloud Pak for Data components associated with an instance of Cloud Pak for Data must be installed at the same version.
Cloud Pak for Data Version 5.0.3
A new version of watsonx.ai was released in September 2024 with Cloud Pak for Data 5.0.3.
Operand version: 9.3.0
This release includes the following changes:
- New features
- This release of watsonx.ai includes the following features:
- Introducing Mistral AI with IBM, a premium add-on for watsonx.ai that enables the on-premises deployment of Mistral AI foundation models (Mistral Large 2 currently available)
- Mistral Large 2 is fluent in and understands the grammar and cultural context of English,
French, Spanish, German, and Italian. It also understands dozens of other languages. Mistral Large
is effective at programmatic tasks, such as generating, reviewing, and commenting on code. For
example, Mistral Large can generate results in JSON format and can make function calls. Mistral AI
with IBM is available for purchase as a premium add-on capability to existing customers of the
IBM watsonx.ai platform, or to first-time
customers who acquire the platform and the add-on capability together. To get started using Mistral
AI with IBM, contact your IBM sales representative.
For details about the Mistral Large model, see Supported foundation models.
- Fine tune a foundation model to create a new model that is customized for a task
- From the Tuning Studio, you can choose to fine tune a foundation model as an alternative method to prompt tuning. Both methods help you to customize a foundation model through supervised learning from training data that you provide. While prompt tuning adjusts only the prompt that is submitted to the foundation model without modifying the underlying model, fine tuning adjusts the parameter weights of the underlying foundation model. The result of fine tuning is a new foundation model that is customized to meet your needs. For details, see Methods for tuning foundation models.
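The difference between the two methods can be illustrated with a deliberately tiny numeric sketch (not the watsonx.ai implementation): fine tuning updates the model weight itself, while prompt tuning freezes the weight and learns only an additive input prefix.

```python
# Toy illustration: a "model" with a single weight w, target behavior y = 3x.
# Fine tuning adjusts w; prompt tuning keeps w frozen and learns a prefix p.

def model(w, x):
    return w * x  # stand-in for a frozen foundation model

def fine_tune(w, data, lr=0.01, steps=200):
    # Fine tuning: gradient descent on the model weight w itself.
    for _ in range(steps):
        grad = sum(2 * (model(w, x) - y) * x for x, y in data) / len(data)
        w -= lr * grad
    return w

def prompt_tune(w, data, lr=0.01, steps=200):
    # Prompt tuning: w stays frozen; only the input prefix p is learned.
    p = 0.0
    for _ in range(steps):
        grad = sum(2 * (model(w, x + p) - y) * w for x, y in data) / len(data)
        p -= lr * grad
    return p

data = [(1.0, 3.0), (2.0, 6.0)]  # samples of the target behavior y = 3x
w_ft = fine_tune(1.0, data)      # converges near 3.0
p_pt = prompt_tune(1.0, data)    # converges near 3.0, but w stays 1.0
```

In this toy setup, fine tuning recovers the target behavior exactly, while prompt tuning can only shift the inputs to the frozen model, which is why it is lighter-weight but sometimes less expressive.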
- Use training data from Microsoft Azure Databricks in Tuning Studio
- You can now train your foundation models in Tuning Studio by importing training data from a
separate Microsoft Azure Databricks data source. With
the addition of Microsoft Azure Databricks, you can
choose from the following data connection types:
- Microsoft Azure Databricks connection
- Presto connection
- IBM watsonx.data connection
- IBM Cloud Object Storage connection
- Use NVIDIA Multi-Instance GPU (MIG) to partition GPUs for provided foundation models
- You can now enable Multi-Instance GPU (MIG) support to scale workloads more efficiently on NVIDIA GPUs that are used for inferencing IBM-provided foundation models. For details about the foundation models that can be inferenced from a multi-instance GPU, see Foundation models in IBM watsonx.
- Work with new foundation models in Prompt Lab
- You can now use the following foundation models for inferencing from the API and the Prompt Lab in watsonx.ai:
- Llama 3.1 models in 8 billion, 70 billion, and 405 billion parameter sizes
- Work with new embedding models for text matching and retrieval tasks
- You can now use the following embedding models to vectorize text in watsonx.ai:
- all-minilm-l6-v2
- multilingual-e5-large
- Updates
- The following updates were introduced in this release:
- New version of the granite-8b-code-instruct foundation model
- The granite-8b-code-instruct foundation model was modified to version 2.0.0. The latest modification can handle larger prompts: the context window length (input + output) increased from 8,192 to 128,000 tokens. For details, see Supported foundation models.
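Because the context window covers both input and output tokens, the room left for generated output shrinks as the prompt grows. A minimal sketch of that budget arithmetic (the function name is illustrative, not a watsonx.ai API):

```python
# The 128,000-token context window is shared by input and output tokens.
CONTEXT_WINDOW = 128_000

def max_output_tokens(input_tokens, context_window=CONTEXT_WINDOW):
    # Tokens left for generation after the prompt is accounted for.
    return max(context_window - input_tokens, 0)

budget = max_output_tokens(8_192)  # a prompt at the old 8,192-token limit
```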
- New versions of the slate-30m-english-rtrvr and slate-125m-english-rtrvr foundation models
- The slate-30m-english-rtrvr and slate-125m-english-rtrvr foundation models were modified to version 2.0.1. The new Slate models show significant improvements over their version 1 counterparts. For details, see Supported embedding models.
- Security issues fixed in this release
- The following security issues were fixed in this release:
CVE-2024-4032, CVE-2024-5321, CVE-2024-6345, CVE-2024-21131, CVE-2024-21138, CVE-2024-21140, CVE-2024-21144, CVE-2024-21145, CVE-2024-21147, CVE-2024-22018, CVE-2024-22020, CVE-2024-24789, CVE-2024-28863, CVE-2024-35235, CVE-2024-35870, CVE-2024-36137, CVE-2024-37891, CVE-2024-38372, CVE-2024-38579, CVE-2024-39338, CVE-2024-39689, CVE-2024-39705, CVE-2024-41041, CVE-2024-41110, CVE-2024-41818, CVE-2024-42110, CVE-2024-42367, CVE-2024-42459, CVE-2024-42460, CVE-2024-42461
CVE-2023-6597, CVE-2023-45288, CVE-2023-52903
CVE-2021-46908
- Deprecated features
- The following features were deprecated in this release:
- Deprecated foundation models
- The following models are now deprecated and will be withdrawn in a future release:
- llama-2-70b-chat
- merlinite-7b
- Withdrawn foundation models
- The following models are now withdrawn from the watsonx.ai service:
mixtral-8x7b-instruct-v01-q
Cloud Pak for Data Version 5.0.1
A new version of watsonx.ai was released in July 2024 with Cloud Pak for Data 5.0.1.
Operand version: 9.1.0
This release includes the following changes:
- New features
- This release of watsonx.ai includes the following features:
- Deploy custom foundation models with NVIDIA L40S hardware specification
- You can now deploy custom foundation models with NVIDIA L40S hardware specification, which provides 48 GB of GPU memory. For details, see Planning to deploy a custom foundation model.
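As a rough way to see why 48 GB of GPU memory matters, a model's weights alone need approximately parameter count times bytes per parameter. A back-of-the-envelope sketch (an assumption-level rule of thumb, not an IBM sizing formula):

```python
# Rough rule of thumb (an assumption, not a watsonx.ai sizing formula):
# model weights need ~ params * bytes_per_param of GPU memory, before
# accounting for the KV cache and activation overhead.

def weight_memory_gb(n_params_billion, bytes_per_param=2):  # 2 bytes = fp16
    return n_params_billion * bytes_per_param

# A 13-billion-parameter model in fp16 needs ~26 GB for weights alone,
# which fits within the 48 GB of an NVIDIA L40S with headroom left over.
fits_on_l40s = weight_memory_gb(13) < 48
```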
- New watsonx.ai lightweight engine
- You can now install a lightweight watsonx.ai engine that has a smaller footprint. The lightweight engine uses an optimized runtime that can be embedded in generative AI applications. For details, see Choosing an IBM watsonx.ai installation mode.
- Work with new foundation models in Prompt Lab
- You can now use the following foundation models for inferencing from the Prompt Lab in watsonx.ai:
- Granite code models
- Foundation models from the IBM Granite family. The Granite code foundation models are instruction-following models fine-tuned using various code instruction data sets and are available in multiple sizes.
- Updates
- The following updates were introduced in this release:
- Use image groups in watsonx.ai to enable targeted installation of a subset of models
- You can now create and use image groups when mirroring a watsonx.ai image to an air-gapped cluster. Image groups provide a more selective approach to model installation, enabling you to install only the models that you require. For details, see Mirroring images directly to the private container registry.
- Shard large models into smaller units that are processed across multiple GPUs
- Use sharding to partition large models into smaller units, known as shards, that can be processed across multiple GPU processors in parallel. Sharding lets you run large models on GPUs that have less memory. For details, see Adding foundation models.
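The idea can be illustrated with a toy tensor-parallel sketch: split a weight matrix column-wise so that each device multiplies only its shard, then concatenate the partial results (a pure-Python illustration, not the watsonx.ai runtime):

```python
# Toy tensor-parallel sharding: each "GPU" holds a column slice of the
# weight matrix, computes its partial output, and the slices are merged.

def matmul(A, B):
    # Naive matrix multiply: (n x k) @ (k x m).
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def shard_columns(W, n_shards):
    # Split W (k x m) into n_shards along the output (column) dimension.
    cols = list(zip(*W))
    size = len(cols) // n_shards
    shards = [cols[i * size:(i + 1) * size] for i in range(n_shards)]
    return [list(map(list, zip(*s))) for s in shards]  # back to row-major

X = [[1, 2], [3, 4]]              # activations (2 x 2)
W = [[1, 0, 2, 1], [0, 1, 1, 2]]  # weights (2 x 4), sharded across 2 devices
shards = shard_columns(W, 2)
partials = [matmul(X, w) for w in shards]            # each device's slice
merged = [sum(rows, []) for rows in zip(*partials)]  # concatenate columns
assert merged == matmul(X, W)  # sharded result matches the full multiply
```

Each shard needs only a fraction of the weight memory, which is why sharding lets large models run on GPUs with less memory.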
- Security issues fixed in this release
- The following security issues were fixed in this release:
CVE-2024-0450, CVE-2024-0553, CVE-2024-0985, CVE-2024-2961, CVE-2024-3651, CVE-2024-4453, CVE-2024-5206, CVE-2024-5642, CVE-2024-6239, CVE-2024-6501, CVE-2024-22667, CVE-2024-24786, CVE-2024-24789, CVE-2024-26130, CVE-2024-28182, CVE-2024-33599, CVE-2024-33600, CVE-2024-33601, CVE-2024-33602, CVE-2024-34062, CVE-2024-35195, CVE-2024-37891
CVE-2023-3164, CVE-2023-4408, CVE-2023-4641, CVE-2023-5455, CVE-2023-5868, CVE-2023-5869, CVE-2023-5870, CVE-2023-5981, CVE-2023-6004, CVE-2023-6277, CVE-2023-6597, CVE-2023-6918, CVE-2023-7104, CVE-2023-22745, CVE-2023-24056, CVE-2023-25433, CVE-2023-25434, CVE-2023-28322, CVE-2023-29499, CVE-2023-31486, CVE-2023-32611, CVE-2023-32665, CVE-2023-38546, CVE-2023-39327, CVE-2023-39329, CVE-2023-39742, CVE-2023-42843, CVE-2023-42950, CVE-2023-42956, CVE-2023-46218, CVE-2023-48161, CVE-2023-48231, CVE-2023-48232, CVE-2023-48233, CVE-2023-48234, CVE-2023-48235, CVE-2023-48236, CVE-2023-48237, CVE-2023-48706, CVE-2023-48795, CVE-2023-49083, CVE-2023-49290, CVE-2023-50387, CVE-2023-50782, CVE-2023-50868, CVE-2023-52424, CVE-2023-52426
CVE-2022-33068, CVE-2022-37050, CVE-2022-37051, CVE-2022-37052, CVE-2022-48622, CVE-2022-48564, CVE-2022-48560, CVE-2022-48468, CVE-2022-48339, CVE-2022-48338, CVE-2022-48337, CVE-2022-41862, CVE-2022-40897, CVE-2022-3094, CVE-2022-2923, CVE-2022-2182
CVE-2021-32815, CVE-2021-34334, CVE-2021-34335, CVE-2021-35937, CVE-2021-35938, CVE-2021-35939, CVE-2021-37615, CVE-2021-37616, CVE-2021-37620, CVE-2021-37621, CVE-2021-37622, CVE-2021-37623, CVE-2021-39537, CVE-2021-43618
CVE-2020-17049, CVE-2020-19188, CVE-2020-20703, CVE-2020-23922, CVE-2020-26137, CVE-2020-28241
CVE-2019-11236
CVE-2017-6519
- Deprecated features
- The following features were deprecated in this release:
- Withdrawn foundation models
- The following models are now withdrawn from the watsonx.ai service:
- granite-13b-chat-v1
- granite-13b-instruct-v1
- gpt-neox-20b
- mpt-7b-instruct2
- starcoder-15.5b
Cloud Pak for Data Version 5.0.0
A new version of watsonx.ai was released in June 2024 with Cloud Pak for Data 5.0.0.
Operand version: 9.0.0
This release includes the following changes:
- New features
- This release of watsonx.ai includes the following features:
- Red Hat® OpenShift® AI is now a prerequisite for watsonx.ai
- Watsonx.ai now requires Red Hat OpenShift AI to be installed as a prerequisite foundation layer on the cluster. Red Hat OpenShift AI provides enhanced support for serving generative AI models and improves the efficiency of prompt tuning.
- IBM text embedding support for enhanced text matching and retrieval
- You can now use the IBM watsonx.ai text embeddings API and IBM embedding models to transform input text into vectors so that you can more accurately compare and retrieve similar text. You can use the following IBM Slate embedding models:
- slate-125m-english-rtrvr: A foundation model provided by IBM that generates embeddings for various inputs such as queries, passages, or documents. The training objective is to maximize the cosine similarity between a query and a passage.
- slate-30m-english-rtrvr: A foundation model provided by IBM that is trained to maximize the cosine similarity between two text inputs so that embeddings can be evaluated based on similarity later. The slate-30m-english-rtrvr model is a distilled version of the slate-125m-english-rtrvr model.
For details, see Text embedding generation.
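A minimal sketch of the retrieval pattern these models are trained for: embed a query and passages as vectors, then rank passages by cosine similarity. The vectors below are invented for illustration; real embeddings come from the text embeddings API.

```python
# Rank passages against a query by cosine similarity of their embeddings.
import math

def cosine_similarity(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm

# Made-up 3-dimensional "embeddings"; real Slate embeddings have hundreds
# of dimensions and are produced by the embeddings API.
query = [0.9, 0.1, 0.3]
passages = {
    "passage_a": [0.8, 0.2, 0.4],  # points in nearly the same direction
    "passage_b": [0.1, 0.9, 0.2],  # points elsewhere
}
ranked = sorted(passages, key=lambda p: cosine_similarity(query, passages[p]),
                reverse=True)  # most similar passage first
```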
- Use training data from connected data sources in Tuning Studio
- You can now train your foundation models in Tuning Studio by importing training data from a
separate data source by using a data connection asset. You can use the following data connection types:
- Presto connection
- IBM watsonx.data connection
- IBM Cloud Object Storage connection
- Work with new foundation models in Prompt Lab
- You can now use the following foundation models for inferencing from the Prompt Lab in watsonx.ai:
- allam-1-13b-instruct: A bilingual large language model for Arabic and English provided by the National Center for Artificial Intelligence and supported by the Saudi Authority for Data and Artificial Intelligence. You can use the allam-1-13b-instruct foundation model for general-purpose tasks in the Arabic language, such as classification, extraction, question answering, and for language translation between Arabic and English.
- granite-7b-lab: A foundation model from the IBM Granite family that is tuned with a novel alignment tuning method from IBM Research.
- llama-3-8b-instruct: An accessible, open large language model provided by Meta that contains 8 billion parameters and is instruction fine-tuned to support various use cases.
- llama-3-70b-instruct: An accessible, open large language model provided by Meta that contains 70 billion parameters and is instruction fine-tuned to support various use cases.
- merlinite-7b: A foundation model provided by Mistral AI and tuned by IBM. The merlinite-7b foundation model is a derivative of the Mistral-7B-v0.1 model that is tuned with a novel alignment tuning method from IBM Research.
- mixtral-8x7b-instruct-v01: A foundation model that is a pretrained generative sparse mixture-of-experts network provided by Mistral AI.
- Work with InstructLab foundation models in Prompt Lab
- InstructLab is an open-source initiative by Red
Hat and IBM that provides a platform for augmenting
the capabilities of a foundation model. The following foundation models support knowledge and skills
that are contributed from InstructLab:
- New models:
- granite-7b-lab
- merlinite-7b
- Existing models:
- granite-13b-chat-v2
- granite-20b-multilingual
- Create detached deployments for external prompt templates
- You can now deploy a prompt template for an LLM hosted by a third-party provider, such as Google Vertex AI, Azure OpenAI, or AWS Bedrock. Use the deployment to explore evaluations for the output generated by the detached prompt template. You can also track the detached deployment and detached prompt template in an AI use case as part of your governance solution. See Creating a detached deployment for an external prompt.
- Use the Node.js SDK to add generative AI functions to your applications
- This beta release of the Node.js SDK helps you to do many generative AI tasks programmatically, including inferencing foundation models. For more information, see Node.js SDK.
- Updates
- The following updates were introduced in this release:
- Improved storage options for deploying your own custom foundation model
- You can now use storage volumes in addition to configuring YAML files to set up the storage required to deploy custom foundation models in watsonx.ai. For more information, see Setting up storage and uploading the model.
- New version of the granite-20b-multilingual foundation model
- The 1.1.0 version of the granite-20b-multilingual foundation model includes improvements that were gained by applying a novel AI alignment technique to the version 1.0 model using InstructLab. AI alignment involves using fine-tuning and reinforcement learning techniques to guide the model to return outputs that are as helpful, truthful, and transparent as possible. For details, see Supported foundation models.
- View the full text of a prompt in Prompt Lab
- You can now review the full prompt text that is submitted to a foundation model. This capability is useful when your prompt includes prompt variables or when you work in structured mode or chat mode. For details, see Prompt Lab.
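To see what the full prompt text means when prompt variables are involved, here is a sketch of a template whose variable is resolved before the prompt is submitted (the template text and the `review` variable name are illustrative, not a watsonx.ai API):

```python
# A prompt template with a {review} prompt variable; the full prompt text
# is the template with every variable substituted, which is what the
# foundation model actually receives.
template = (
    "Classify the sentiment of the following review as Positive or Negative.\n"
    "Review: {review}\n"
    "Sentiment:"
)
full_prompt = template.format(review="The battery life is excellent.")
```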
- New shortcut to start working on common tasks
- You can now start a common task in your project by clicking a tile in the Start working section on the Overview tab. Use these shortcuts to start adding collaborators and data, and to experiment with and build models. Click View all to jump to a selection of tools.

- Issues fixed in this release
- The following issue was fixed in this release:
- Cannot add a model because an earlier installation was not removed fully
- Issue: When you try to add a model to the watsonx_ai_ifm service, a CrashLoopBackOff error is displayed.
- Resolution: All Persistent Volume Claims (PVCs) associated with a previously uninstalled model are now removed.
- Security issues fixed in this release
- The following security issues were fixed in this release:
CVE-2024-3154, CVE-2024-3568, CVE-2024-3727, CVE-2024-21011, CVE-2024-21068, CVE-2024-21085, CVE-2024-21094, CVE-2024-21890, CVE-2024-21891, CVE-2024-21892, CVE-2024-21896, CVE-2024-22017, CVE-2024-22019, CVE-2024-22025, CVE-2024-23206, CVE-2024-23213, CVE-2024-25629, CVE-2024-27289, CVE-2024-27304, CVE-2024-27306, CVE-2024-27982, CVE-2024-27983, CVE-2024-28757, CVE-2024-28835, CVE-2024-28863, CVE-2024-29041, CVE-2024-30251, CVE-2024-30260, CVE-2024-30261, CVE-2024-34064, CVE-2024-34069
CVE-2024-3727, CVE-2024-4603, CVE-2024-30203, CVE-2024-30204, CVE-2024-30205, CVE-2024-32002, CVE-2024-32004, CVE-2024-32020, CVE-2024-32021, CVE-2024-32465, CVE-2024-34459, CVE-2024-35195
CVE-2023-2975, CVE-2023-3138, CVE-2023-3618, CVE-2023-4752, CVE-2023-5517, CVE-2023-5679, CVE-2023-6228, CVE-2023-6516, CVE-2023-25193, CVE-2023-29491, CVE-2023-32359, CVE-2023-37328, CVE-2023-38469, CVE-2023-38470, CVE-2023-38471, CVE-2023-38472, CVE-2023-38473, CVE-2023-39928, CVE-2023-40414, CVE-2023-40745, CVE-2023-41175, CVE-2023-41983, CVE-2023-42282, CVE-2023-42852, CVE-2023-42883, CVE-2023-42890, CVE-2023-43785, CVE-2023-43786, CVE-2023-43787, CVE-2023-45288, CVE-2023-46809, CVE-2023-47038
CVE-2022-0413, CVE-2022-1674, CVE-2022-33065, CVE-2022-40090, CVE-2022-48554
CVE-2021-3903, CVE-2021-23337, CVE-2021-27290, CVE-2021-29390
CVE-2020-36024
CVE-2017-1000383
CVE-2014-1745
- Deprecated features
- The following features were deprecated in this release:
- Deprecated foundation models
- The following models are now deprecated and will be withdrawn in a future release:
mixtral-8x7b-instruct-v01-q