FAQ

Frequently asked questions about AI Services

What is AI Services?
IBM’s AI Services, part of the Open-Source AI Foundation for Power, are pre-built, optimized solutions for rapidly deploying LLMs and advanced inferencing on IBM Spyre™ infrastructure. They help enterprises scale AI and accelerate digital transformation.
Who can benefit from using AI Services?
Enterprises, developers, and organizations using IBM Power can benefit from AI Services to scale AI workloads, accelerate digital transformation, and deploy advanced models quickly.
What are the key features of AI Services?
IBM’s AI Services are pre-built, self-contained units designed for rapid deployment and easy installation, enabling enterprises to accelerate digital transformation. Optimized for IBM Spyre™ on Power infrastructure, these services integrate seamlessly with inferencing solutions like Red Hat AI Inference Server and deliver advanced inferencing capabilities. They support a wide range of AI models, including large language models (LLMs), embedding models, and re-ranker models, while offering flexible consumption options for scalable LLM inferencing. Built for simplicity, speed, and performance, AI Services help businesses streamline processes and boost productivity with minimal complexity.
What subscriptions are required to run AI Services?
Two subscriptions are required: Red Hat AI Inference Server (RHAIIS) and RHEL for IBM Power Little Endian.
How do I obtain a RHAIIS subscription?
A RHAIIS subscription is obtained under a Red Hat subscription agreement. You must have a valid Red Hat subscription to download and use RHAIIS container images from registry.redhat.io. To log in to the registry, run the following command in your terminal:
podman login registry.redhat.io
How do I obtain a RHEL for IBM Power Little Endian subscription?
Refer to the official Red Hat documentation for detailed instructions.
How do I get started with AI Services?
To set up AI Services, refer to the installation guide.
How many Spyre cards are required to run AI Services?
The number of Spyre cards required depends on the application. For example, the RAG application uses 4 cards for the Granite instruct model and 1 card for the re-ranker model.
What does “NUMA-aligned CPUs” mean?
“NUMA-aligned CPUs” means that the CPUs are all chosen from the same NUMA node, so they share that node's local memory. This avoids cross-node memory access, which reduces latency and improves performance on multi-socket systems. To see which CPUs belong to each NUMA node, run the following command in your terminal:
lscpu | grep NUMA
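To check whether a specific set of CPUs falls within a single NUMA node, the NUMA lines reported by lscpu can be parsed programmatically. The following is a minimal Python sketch, not part of AI Services. The function names and the SAMPLE text are hypothetical and for illustration only, and the sketch assumes the usual "NUMA nodeN CPU(s): <list>" line format printed by lscpu.

```python
import re
import subprocess

# Matches lines such as "NUMA node0 CPU(s):   0-7" (assumed lscpu format).
NODE_LINE = re.compile(r"^NUMA node(\d+) CPU\(s\):\s*(.*)$")


def parse_cpu_list(spec):
    """Expand a CPU list such as '0-3,8,10-11' into a set of integers."""
    cpus = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue  # a node with no CPUs has an empty list
        if "-" in part:
            low, high = part.split("-")
            cpus.update(range(int(low), int(high) + 1))
        else:
            cpus.add(int(part))
    return cpus


def numa_nodes_from_lscpu(text):
    """Return {node_id: set_of_cpu_ids} parsed from lscpu output text."""
    nodes = {}
    for line in text.splitlines():
        match = NODE_LINE.match(line.strip())
        if match:
            nodes[int(match.group(1))] = parse_cpu_list(match.group(2))
    return nodes


def nodes_for_cpus(nodes, cpus):
    """List the NUMA nodes that contain at least one of the given CPUs."""
    wanted = set(cpus)
    return sorted(n for n, members in nodes.items() if members & wanted)


def is_numa_aligned(nodes, cpus):
    """True if every given CPU belongs to one and the same NUMA node."""
    wanted = set(cpus)
    return bool(wanted) and any(wanted <= members for members in nodes.values())


def read_lscpu():
    """Run lscpu on the local machine and return its output as text."""
    return subprocess.run(
        ["lscpu"], capture_output=True, text=True, check=True
    ).stdout


# Hypothetical sample text for illustration; real systems will differ.
SAMPLE = """\
NUMA node(s):        2
NUMA node0 CPU(s):   0-7
NUMA node1 CPU(s):   8-15
"""

if __name__ == "__main__":
    nodes = numa_nodes_from_lscpu(SAMPLE)
    print(is_numa_aligned(nodes, [0, 1, 2, 3]))  # all CPUs on node 0
    print(nodes_for_cpus(nodes, [6, 7, 8]))      # CPUs span two nodes
```

On a real system, pass the result of read_lscpu() to numa_nodes_from_lscpu() instead of SAMPLE. A CPU set that spans more than one node is not NUMA-aligned.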
Which file formats are supported for processing?
Currently, only PDF files can be processed.
Can AI Services be used in an air-gapped environment?
Yes, AI Services can be used in an air-gapped environment. Refer to the air-gapped installation tutorial.
Are there any limitations to using AI Services?
Yes, the current architecture and design of AI Services have certain limitations. Refer to limitations for details.
Where can I report a bug or request support?
You can report a bug, request a feature, or get support by creating an issue here.