Choosing an IBM watsonx.ai installation mode

Depending on the size and scale of your deployment, you can install the full IBM watsonx.ai service or the watsonx.ai™ lightweight engine.

Installation modes

You can install IBM watsonx.ai in one of the following modes:
Full service
Installs the full IBM watsonx.ai service, which is a studio of integrated tools for experimenting, building, tuning, and evaluating generative AI solutions securely.
Lightweight engine
Installs the watsonx.ai lightweight engine, which is a service for inferencing foundation models programmatically.

Architecture comparison

The recommended watsonx.ai lightweight engine installation follows the standard platform architecture that includes the following components for high availability:
  • Load balancer: 1
  • Control plane nodes: 3
  • Worker nodes: At least 3, plus any nodes that are required to host the installed large language models. Each worker node has a minimum of 8 CPU cores, and the number of extra nodes is sized per foundation model.
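The worker-node requirement can be expressed as simple arithmetic: the high-availability baseline plus the extra nodes that each hosted model needs. The following sketch is illustrative only; the per-model node counts are hypothetical placeholders, and the real figures are sized per foundation model.

```python
# Illustrative worker-node sizing for a highly available deployment.
# The per-model node counts below are hypothetical examples; consult the
# sizing guidance for each foundation model for real figures.
BASE_WORKER_NODES = 3        # HA minimum from the standard platform architecture
MIN_CORES_PER_WORKER = 8     # minimum CPU cores per worker node

def required_worker_nodes(extra_nodes_per_model: dict) -> int:
    """Base HA workers plus the extra nodes needed to host each model."""
    return BASE_WORKER_NODES + sum(extra_nodes_per_model.values())

# Hypothetical example: two models that need 1 and 2 dedicated nodes.
models = {"model-a": 1, "model-b": 2}
print(required_worker_nodes(models))  # → 6
```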

You can also install watsonx.ai in either the full service or lightweight engine mode on a single-node OpenShift® (SNO) cluster if high availability is not a requirement for the deployment.

The following diagram describes the components that comprise the full IBM watsonx.ai service versus the watsonx.ai lightweight engine.

As shown in the diagram, both installation modes include the following components:
  • Red Hat® OpenShift AI platform layer that supports hosting and inferencing foundation models.
  • IBM Software Hub platform, which provides capabilities such as a command-line installer, auditing, licensing, logging, and more.
  • Generative AI APIs for inferencing foundation models and vectorizing text programmatically.
  • AI guardrails for filtering personally identifiable information and hate, abuse, and profanity.
A full IBM watsonx.ai service installation includes the following extra components:
  • Common core services that help you manage your work and data with projects, deployment spaces, job management, and more.
  • Generative AI tools such as the Prompt Lab and Tuning Studio.
  • Watson Studio and Watson Machine Learning services for tuning and deploying foundation models, deploying prompt templates, building and running Python notebooks, and more.
  • Integration with IBM watsonx.governance™.
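Because both installation modes expose the generative AI APIs, you can inference a hosted foundation model over REST in either mode. The following is a minimal sketch of building a request body for the watsonx.ai text generation endpoint; the host name, token, and model ID are placeholders, and your installation may require additional fields (such as a project or deployment identifier), so check the API reference for your release.

```python
import json
from urllib import request

# Placeholders -- substitute values from your own installation.
WATSONX_HOST = "https://cpd-host.example.com"   # hypothetical cluster host
API_TOKEN = "PASTE-YOUR-BEARER-TOKEN"           # obtained from your platform

def build_generation_request(model_id: str, prompt: str,
                             max_new_tokens: int = 50) -> dict:
    """Build the JSON body for a text generation request."""
    return {
        "model_id": model_id,
        "input": prompt,
        "parameters": {"max_new_tokens": max_new_tokens},
    }

def generate_text(body: dict) -> dict:
    """POST the request; requires network access to a live deployment."""
    req = request.Request(
        f"{WATSONX_HOST}/ml/v1/text/generation?version=2023-05-29",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {API_TOKEN}",
        },
    )
    with request.urlopen(req) as resp:
        return json.load(resp)

# Build and inspect a request body without sending it.
body = build_generation_request("ibm/granite-13b-instruct-v2", "Summarize: ...")
print(json.dumps(body, indent=2))
```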

Feature comparison

The following table shows the features that are supported by each installation mode. A check mark (✓) indicates that the feature that is named in the first column is supported by the installation mode that is listed in the column header.
Table 1. Feature comparison between installation modes

Feature | Full IBM watsonx.ai service | watsonx.ai lightweight engine
Prompt Lab for prompting foundation models and creating prompt templates | ✓ |
Tuning Studio for prompt tuning and fine tuning foundation models | ✓ |
Foundation model hosting | ✓ | ✓
Foundation model inferencing with the text generation API | ✓ | ✓
Apply the hate, abuse, and profanity (HAP) filter to inference requests with AI guardrails | ✓ | ✓
Custom foundation model inferencing with the text generation API | ✓ | ✓
Vectorizing text with the text embeddings API | ✓ | ✓
IBM Software Hub Platform UI | ✓ | ✓
Foundational services (licensing, auditing, and logging) | ✓ | ✓
Command-line installer, backup and restore, user authentication | ✓ | ✓
Common core services (projects, deployment spaces) | ✓ |
Access control | ✓ |
Use Python SDK and sample Python notebooks | ✓ | ✓
Watson Studio, including Runtime, and Watson Machine Learning resources | ✓ |
Compatible with data science and machine learning tools that are installed separately | ✓ |
Evaluate and track prompt templates with IBM watsonx.governance | ✓ |
Use a detached prompt template in a separate IBM watsonx.governance deployment to evaluate foundation models hosted in the watsonx.ai lightweight engine | | ✓

What to do next

Install the full IBM watsonx.ai service to experiment, build, tune, and test your generative AI solution. When you're ready to deploy one or more dedicated foundation or embedding models for use by your enterprise, install the watsonx.ai lightweight engine to power your solution: it hosts and inferences AI models from an on-premises installation with a smaller footprint.

The steps that you follow to install both modes are the same, with one exception: for the lightweight engine mode, you specify an extra installation option by setting the lite_install property to true when you run the installation command. For details, see Installing IBM watsonx.ai.
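As a sketch of what that option might look like, the following install-options fragment sets the property. The custom_spec and watsonx_ai nesting shown here is an assumption about the options-file layout; verify the exact format against Installing IBM watsonx.ai for your release.

```yaml
# Sketch of an install options file that enables lightweight engine mode.
# The surrounding keys are assumptions -- confirm them in the installation
# documentation for your release before use.
custom_spec:
  watsonx_ai:
    lite_install: true
```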