Choosing an IBM watsonx.ai installation mode
Installation modes
- Full service
- Installs the full IBM watsonx.ai service, which is a studio of integrated tools for experimenting, building, tuning, and evaluating generative AI solutions securely.
- Lightweight engine
- Installs the watsonx.ai lightweight engine, which is a service for inferencing foundation models programmatically.
Architecture comparison
- Load balancer: 1
- Control plane nodes: 3
- Worker nodes: At least 3, plus any additional nodes that are required to host the installed large language models; each worker node needs a minimum of 8 CPU cores. The number of extra nodes depends on which foundation models you install and is sized per model.
The following diagram shows the components that make up the full IBM watsonx.ai service versus the watsonx.ai lightweight engine.
- Red Hat® OpenShift® AI platform layer that supports hosting and inferencing AI models.
- Cloud Pak for Data platform, which provides capabilities such as a command-line installer, auditing, licensing, logging, and more.
- Generative AI APIs for inferencing foundation models and vectoring text programmatically.
- AI guardrails for filtering personally identifiable information and hate, abuse, and profanity.
- Common core services that help you manage your work and data with projects, deployment spaces, job management, and more.
- Generative AI tools such as the Prompt Lab and Tuning Studio.
- Watson Studio and Watson Machine Learning services for tuning and deploying foundation models, deploying prompt templates, building and running Python notebooks, and more.
- Integration with IBM watsonx.governance.
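As an illustration of the generative AI APIs listed above, the following sketch assembles the JSON body for a text generation request, including an optional AI guardrails moderation setting. The endpoint path, field names, and parameters shown here are assumptions for illustration only; confirm them against the watsonx.ai API reference for your installed release.

```python
# Sketch: building a text generation request body for the watsonx.ai REST API.
# Field names, the moderations structure, and the endpoint path in the comment
# below are illustrative assumptions; verify them against the API reference.

def build_generation_request(model_id, prompt, project_id,
                             max_new_tokens=200, guardrails=False):
    """Assemble the JSON body for a text generation call.

    Setting guardrails=True illustrates asking the service to apply the
    hate, abuse, and profanity (HAP) filter to the request (assumed field).
    """
    body = {
        "model_id": model_id,
        "input": prompt,
        "parameters": {"max_new_tokens": max_new_tokens},
        "project_id": project_id,
    }
    if guardrails:
        # Enable HAP filtering on both the prompt and the generated output.
        body["moderations"] = {
            "hap": {"input": {"enabled": True}, "output": {"enabled": True}}
        }
    return body

# To send the request (requires the `requests` package and a bearer token):
#   requests.post(f"{base_url}/ml/v1/text/generation?version=<api-version>",
#                 json=build_generation_request(...),
#                 headers={"Authorization": f"Bearer {token}"})
```

Because the text generation API is available in both installation modes, the same request shape would apply whether you install the full service or the lightweight engine.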
Feature comparison
| Feature | Full IBM watsonx.ai service | watsonx.ai lightweight engine |
| --- | --- | --- |
| Prompt Lab for prompting foundation models and creating prompt templates | ✓ | |
| Tuning Studio for prompt tuning foundation models | ✓ | |
| Foundation model hosting | ✓ | ✓ |
| Foundation model inferencing with the text generation API | ✓ | ✓ |
| Apply the hate, abuse, and profanity (HAP) filter to inference requests with AI guardrails | ✓ | ✓ |
| Custom foundation model inferencing with the text generation API | ✓ | ✓ |
| Vectorizing text with the text embeddings API | ✓ | ✓ |
| Cloud Pak Platform UI | ✓ | ✓ |
| Foundational services (licensing, auditing, and logging) | ✓ | ✓ |
| Command-line installer, backup and restore, user authentication | ✓ | ✓ |
| Common core services (projects, deployment spaces) | ✓ | |
| Access control | ✓ | |
| Use Python SDK and sample Python notebooks | ✓ | ✓ |
| Watson Studio, including Runtime, and Watson Machine Learning resources | ✓ | |
| Compatible with data science and machine learning tools that are installed separately | ✓ | |
| Evaluate and track prompt templates with IBM watsonx.governance | ✓ | ✓ |
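The text embeddings API in the table above is likewise available in both modes. The following sketch assembles a request body for vectorizing a batch of texts; the field names and the endpoint path in the comment are assumptions, so check the API reference for your release before relying on them.

```python
# Sketch: assembling a text embeddings request body for the watsonx.ai REST
# API. Field names and the endpoint path below are illustrative assumptions;
# verify them against the API reference for your installed release.

def build_embeddings_request(model_id, texts, project_id):
    """Assemble the JSON body for vectorizing a batch of texts."""
    return {
        "model_id": model_id,
        "inputs": list(texts),  # one embedding vector is returned per input
        "project_id": project_id,
    }

# The request would be POSTed to {base_url}/ml/v1/text/embeddings with a
# bearer token, in both the full service and the lightweight engine.
```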
What to do next
Install the full IBM watsonx.ai service to experiment, build, tune, and test your generative AI solutions. When you're ready to deploy one or more dedicated foundation or embedding models for use by your enterprise, install the watsonx.ai lightweight engine, which hosts and inferences AI models from an on-premises installation with a smaller footprint.
The steps that you follow to install both modes are the same, with one exception: for lightweight engine mode, you specify an extra installation option by setting the lite_install property to true when you run the installation command. For details, see Installing IBM watsonx.ai.
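As a sketch, the extra option might be supplied in an install options file like the following. The file name and property nesting shown here are assumptions for illustration; follow Installing IBM watsonx.ai for the authoritative format and the exact installation command.

```yaml
# Illustrative install options fragment (file name and nesting are
# assumptions; see Installing IBM watsonx.ai for the exact format).
custom_spec:
  watsonx_ai:
    # Install the lightweight engine instead of the full service.
    lite_install: true
```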