Choosing an IBM watsonx.ai installation mode

Depending on the size and scale of your deployment, you can install the full IBM watsonx.ai service or the watsonx.ai™ lightweight engine.

Installation modes

You can install IBM watsonx.ai in one of the following modes:
Full service
Installs the full IBM watsonx.ai service, which is a studio of integrated tools for experimenting, building, tuning, and evaluating generative AI solutions securely.
Lightweight engine
Installs the watsonx.ai lightweight engine, which is a service for inferencing foundation models programmatically.

Architecture comparison

The recommended watsonx.ai lightweight engine installation follows the standard platform architecture that includes the following components for high availability:
  • Load balancer: 1
  • Control plane nodes: 3
  • Worker nodes: At least 3, plus any nodes that are required to host the installed large language models. Each worker node has a minimum of 8 CPU cores, and the number of extra nodes is sized per foundation model.
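The worker-node requirement can be expressed as simple arithmetic: the high-availability baseline plus the extra nodes that each hosted model needs. The following sketch is illustrative only; the per-model node counts are hypothetical placeholders, and the real figures are sized per foundation model.

```python
# Illustrative worker-node sizing for a highly available deployment.
# The per-model node counts below are hypothetical examples; consult the
# sizing guidance for each foundation model for real figures.
BASE_WORKER_NODES = 3        # HA minimum from the standard platform architecture
MIN_CORES_PER_WORKER = 8     # minimum CPU cores per worker node

def required_worker_nodes(extra_nodes_per_model: dict) -> int:
    """Base HA workers plus the extra nodes needed to host each model."""
    return BASE_WORKER_NODES + sum(extra_nodes_per_model.values())

# Hypothetical example: two models that need 1 and 2 dedicated nodes.
models = {"model-a": 1, "model-b": 2}
print(required_worker_nodes(models))  # → 6
```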

You can also install watsonx.ai in either the full service or lightweight engine mode on a single-node OpenShift® (SNO) cluster if high availability is not a requirement for the deployment.

The following diagram describes the components that comprise the full IBM watsonx.ai service versus the watsonx.ai lightweight engine.

As shown in the diagram, both installation modes include the following components:
  • Red Hat® OpenShift AI platform layer that supports hosting and inferencing foundation models.
  • IBM Software Hub platform, which provides capabilities such as a command-line installer, auditing, licensing, logging, and more.
  • Generative AI APIs for inferencing foundation models and vectorizing text programmatically.
  • AI guardrails for filtering personally identifiable information and hate, abuse, and profanity.
A full IBM watsonx.ai service installation includes the following extra components:
  • Common core services that help you manage your work and data with projects, deployment spaces, job management, and more.
  • Generative AI tools such as the Prompt Lab and Tuning Studio.
  • Watson Studio and Watson Machine Learning services for tuning and deploying foundation models, deploying prompt templates, building and running Python notebooks, and more.
  • Integration with IBM watsonx.governance™.
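Because both installation modes expose the generative AI APIs, you can inference a hosted foundation model over REST in either mode. The following is a minimal sketch of building a request body for the watsonx.ai text generation endpoint; the host name, token, and model ID are placeholders, and your installation may require additional fields (such as a project or deployment identifier), so check the API reference for your release.

```python
import json
from urllib import request

# Placeholders -- substitute values from your own installation.
WATSONX_HOST = "https://cpd-host.example.com"   # hypothetical cluster host
API_TOKEN = "PASTE-YOUR-BEARER-TOKEN"           # obtained from your platform

def build_generation_request(model_id: str, prompt: str,
                             max_new_tokens: int = 50) -> dict:
    """Build the JSON body for a text generation request."""
    return {
        "model_id": model_id,
        "input": prompt,
        "parameters": {"max_new_tokens": max_new_tokens},
    }

def generate_text(body: dict) -> dict:
    """POST the request; requires network access to a live deployment."""
    req = request.Request(
        f"{WATSONX_HOST}/ml/v1/text/generation?version=2023-05-29",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {API_TOKEN}",
        },
    )
    with request.urlopen(req) as resp:
        return json.load(resp)

# Build and inspect a request body without sending it.
body = build_generation_request("ibm/granite-13b-instruct-v2", "Summarize: ...")
print(json.dumps(body, indent=2))
```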

Feature comparison

The following table shows the features that are supported by each installation mode. A check mark (✓) indicates that the feature that is named in the first column is supported by the installation mode that is listed in the column header.
Table 1. Feature comparison between installation modes

Feature | Full IBM watsonx.ai service | watsonx.ai lightweight engine
Prompt Lab for prompting foundation models and creating prompt templates | ✓ |
Tuning Studio for prompt tuning and fine tuning foundation models | ✓ |
Foundation model hosting | ✓ | ✓
Foundation model inferencing with the text generation API | ✓ | ✓
Apply the hate, abuse, and profanity (HAP) filter to inference requests with AI guardrails | ✓ | ✓
Custom foundation model inferencing with the text generation API | ✓ | ✓
Vectorizing text with the text embeddings API | ✓ | ✓
IBM Software Hub Platform UI | ✓ | ✓
Foundational services (licensing, auditing, and logging) | ✓ | ✓
Command-line installer, backup and restore, user authentication | ✓ | ✓
Common core services (projects, deployment spaces) | ✓ |
Access control | ✓ |
Use Python SDK and sample Python notebooks | ✓ | ✓
Watson Studio, including Runtime, and Watson Machine Learning resources | ✓ |
Compatible with data science and machine learning tools that are installed separately | ✓ |
Evaluate and track prompt templates with IBM watsonx.governance | ✓ |
Use a detached prompt template in a separate IBM watsonx.governance deployment to evaluate foundation models hosted in the watsonx.ai lightweight engine | | ✓

What to do next

Install the full IBM watsonx.ai service to experiment, build, tune, and test your generative AI solution. When you're ready to deploy one or more dedicated foundation or embedding models for use by your enterprise, install the watsonx.ai lightweight engine to power your solution: it hosts and inferences AI models from an on-premises installation with a smaller footprint.

The steps that you follow to install both modes are the same, with one exception: for the lightweight engine mode, you specify an extra installation option by setting the lite_install property to true when you run the installation command. For details, see Installing IBM watsonx.ai.
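As a sketch of what that option might look like, the following install-options fragment sets the property. The custom_spec and watsonx_ai nesting shown here is an assumption about the options-file layout; verify the exact format against Installing IBM watsonx.ai for your release.

```yaml
# Sketch of an install options file that enables lightweight engine mode.
# The surrounding keys are assumptions -- confirm them in the installation
# documentation for your release before use.
custom_spec:
  watsonx_ai:
    lite_install: true
```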