Installing IBM Cloud Pak for Data
A Red Hat® OpenShift® Container Platform cluster administrator and project administrator can work together to prepare the cluster and install IBM® Cloud Pak for Data.
Before you begin
- Ensure that you review the following information before you install Cloud Pak for Data:
- Planning
- System requirements
Ensure that you install the software on a system that has sufficient resources and that aligns with the guidance in System requirements. For example, if you do not meet the specified disk requirements, you might encounter out-of-memory errors.
- Determine which services you want to install.
Some of the pre-installation tasks, such as creating catalog sources and operator subscriptions, include steps for the services as well as for the Cloud Pak for Data platform. If you know which services you plan to install, you can streamline your installation by batching these tasks.
- Use the following information to ensure that you complete the appropriate tasks for your environment.
1. Do you have an existing Red Hat OpenShift Container Platform cluster?
Cloud Pak for Data is installed on a Red Hat OpenShift Container Platform Version 4.6 or Version 4.8 cluster.
Options | What to do |
---|---|
You already have an OpenShift 4.6 or 4.8 cluster | |
You have an older version of OpenShift |
|
You don't have an OpenShift cluster |
|
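
The version requirement in this step can be checked mechanically. The following Python sketch is illustrative only: the function name `is_supported_openshift` is hypothetical, and it assumes that the cluster version is available as a dotted string (for example, copied from the output of `oc version`).

```python
# Minor releases that this topic names as supported for Cloud Pak for Data.
SUPPORTED_OPENSHIFT_RELEASES = {(4, 6), (4, 8)}


def is_supported_openshift(version: str) -> bool:
    """Return True if a dotted version string is a 4.6.x or 4.8.x release.

    Hypothetical helper: only the major and minor components are compared.
    """
    parts = version.strip().split(".")
    try:
        major, minor = int(parts[0]), int(parts[1])
    except (IndexError, ValueError):
        # Not enough components, or non-numeric components.
        return False
    return (major, minor) in SUPPORTED_OPENSHIFT_RELEASES


if __name__ == "__main__":
    for candidate in ("4.6.42", "4.7.30", "4.8.14"):
        print(candidate, is_supported_openshift(candidate))
```

A result of `False` corresponds to the "older version" or "no cluster" rows in the table, not to a specific remediation.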
2. Where do you want to host your Cloud Pak for Data installation?
You can deploy Cloud Pak for Data on-premises or on the cloud. Your deployment environment determines how you can install Red Hat OpenShift Container Platform:
Options | What to do |
---|---|
You want to deploy Cloud Pak for Data on-premises |
|
You want to deploy Cloud Pak for Data on cloud |
|
3. Do you already have supported persistent storage on your cluster?
The following storage options and versions are supported:
- Red Hat OpenShift Container Storage
- Version: 4.6 or later fixes
- IBM Spectrum® Fusion
- Version: 2.1.2 or later fixes
- IBM Spectrum Scale Container Native storage
- IBM Spectrum Scale Container Native Storage Access Version: 5.1.1.3 or later fixes
- Network File System (NFS)
- Version: 4
- Portworx
- Version: 2.7.0 or later fixes
- IBM Cloud File Storage
- Version: Not applicable
Ensure that you have storage that works with the services that you plan to install.
Options | What to do |
---|---|
You have the supported storage |
|
You don't have supported storage |
|
4. Do you have the required OpenShift projects on your cluster?
- Separate the IBM Cloud Pak® foundational services operators from the Cloud Pak for Data operators
- Install multiple instances of Cloud Pak for Data on the cluster
Options | What to do |
---|---|
You know which projects you plan to use when you install the software |
|
You don't know which projects you plan to use when you install the software |
|
5. Do you have your API key?
The Cloud Pak for Data software images are hosted on the IBM Entitled Registry. To access the images, you must have your IBM entitlement API key.
Options | What to do |
---|---|
You have your API key | |
You don't have your API key |
|
6. How are you going to access the required software images?
Cloud Pak for Data images are accessible from the IBM Entitled Registry. In most situations, it is strongly recommended that you mirror the necessary software images from the IBM Entitled Registry to a private container registry.
The only situation in which you might consider pulling images directly from the IBM Entitled Registry is when your cluster is not air-gapped, your network is extremely reliable, and latency is not a concern. However, for predictable and reliable performance, you should mirror the images to a private container registry.
You must mirror the images to a private container registry in the following situations:
- Your cluster is air-gapped (also called an offline or disconnected cluster)
- Your cluster uses an allowlist to permit direct access by specific sites and the allowlist does not include the IBM Entitled Registry
- Your cluster uses a blocklist to prevent direct access by specific sites and the blocklist includes the IBM Entitled Registry
Options | What to do |
---|---|
You are pulling images from the IBM Entitled Registry | |
You are mirroring images to a private container registry |
|
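
The decision in this step can be summarized as a small function. This is a sketch under one stated assumption: it treats the three listed conditions (air-gapped cluster, allowlist that omits the registry, blocklist that includes the registry) as the cases in which the cluster cannot pull directly from the IBM Entitled Registry. The function and parameter names are hypothetical.

```python
def must_mirror_images(
    air_gapped: bool,
    allowlist_omits_entitled_registry: bool = False,
    blocklist_includes_entitled_registry: bool = False,
) -> bool:
    """Return True when the cluster cannot reach the IBM Entitled Registry.

    Assumption: any one of the three conditions is enough to rule out
    direct pulls, so mirroring to a private registry is the only option.
    """
    return (
        air_gapped
        or allowlist_omits_entitled_registry
        or blocklist_includes_entitled_registry
    )


if __name__ == "__main__":
    if must_mirror_images(air_gapped=False, allowlist_omits_entitled_registry=True):
        print("Mirror the images to a private container registry.")
    else:
        print("Direct pulls are possible, but mirroring is still recommended.")
```

Even when the function returns `False`, the guidance in this step still recommends mirroring for predictable and reliable performance.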
7. Configuring your cluster to pull software images
You must ensure that your cluster is configured to pull the software images from the appropriate location.
Options | What to do |
---|---|
You are pulling images from the IBM Entitled Registry |
|
You are pulling images from a private container registry |
|
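
As an illustration of the first option, the following sketch builds a registry credentials entry in the Docker config JSON format that Kubernetes image pull secrets use. It assumes that the IBM Entitled Registry host name is `cp.icr.io` and that the user name paired with the entitlement API key is `cp`. Neither value appears in this topic, so confirm both against the linked task before you use them. The function name is hypothetical.

```python
import base64
import json


def entitled_registry_auth(
    api_key: str,
    registry: str = "cp.icr.io",  # assumed registry host name
    username: str = "cp",         # assumed user name for entitlement keys
) -> dict:
    """Build a Docker config 'auths' entry for one registry.

    The 'auth' field is the Base64 encoding of 'username:password'.
    """
    token = base64.b64encode(f"{username}:{api_key}".encode("utf-8")).decode("ascii")
    return {"auths": {registry: {"auth": token}}}


if __name__ == "__main__":
    # Placeholder key for demonstration; never hard-code a real key.
    print(json.dumps(entitled_registry_auth("EXAMPLE-KEY"), indent=2))
```

The resulting entry would typically be merged into the cluster pull secret rather than replacing it, so that existing registry credentials are preserved.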
8. Creating catalog sources
You must create catalog sources to ensure that your cluster uses the correct software images for your environment.
Options | What to do |
---|---|
You are pulling images from the IBM Entitled Registry |
|
You are pulling images from a private container registry |
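
To make the idea of a catalog source concrete, the following sketch generates a manifest for the Operator Lifecycle Manager `CatalogSource` resource. The resource kind and field names come from the OLM API. The catalog name, display name, and image reference are hypothetical placeholders, and the `openshift-marketplace` namespace is an assumption about where catalog sources are created. Use the values from the linked task for a real installation.

```python
import json


def catalog_source(
    name: str,
    image: str,
    display_name: str,
    namespace: str = "openshift-marketplace",  # assumed namespace
) -> dict:
    """Return a CatalogSource manifest that points OLM at a catalog image."""
    return {
        "apiVersion": "operators.coreos.com/v1alpha1",
        "kind": "CatalogSource",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "displayName": display_name,
            "image": image,
            "sourceType": "grpc",
        },
    }


if __name__ == "__main__":
    manifest = catalog_source(
        name="example-catalog",                               # hypothetical
        image="registry.example.com/example/catalog:latest",  # hypothetical
        display_name="Example Catalog",                       # hypothetical
    )
    print(json.dumps(manifest, indent=2))
```

The only difference between the two rows of the table is the `image` value: it points either at the IBM Entitled Registry or at your private container registry.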
9. Are the IBM Cloud Pak foundational services already installed on your cluster?
The IBM Cloud Pak foundational services are a prerequisite for Cloud Pak for Data. However, in some situations the IBM Cloud Pak for Data platform operator can automatically install the IBM Cloud Pak foundational services operators and services on the cluster.
For information about supported versions of IBM Cloud Pak foundational services, see the Cloud Pak for Data platform software requirements.
Options | What to do |
---|---|
IBM Cloud Pak foundational services Version 3.18.0 or later is already installed | |
An earlier version of IBM Cloud Pak foundational services is installed |
|
IBM Cloud Pak foundational services is not installed and you are using the express installation method | With the express installation method, all of the operators are in the same OpenShift project and the IBM Cloud Pak for Data platform operator can automatically install IBM Cloud Pak foundational services. |
IBM Cloud Pak foundational services is not installed and you are using the specialized installation method | With the specialized installation method, the IBM Cloud Pak foundational services operators and the Cloud Pak for Data operators are in separate OpenShift projects. To ensure that IBM Cloud Pak foundational services is installed in the correct project, you must manually install it. |
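
The four rows of the preceding table can be expressed as a classification function. This is a minimal sketch: the function name and the returned labels are hypothetical, and the labels only identify which row applies. They do not replace the actions that the table describes.

```python
from typing import Optional

# Minimum foundational services version named in the table.
MINIMUM_VERSION = (3, 18, 0)


def classify_foundational_services(
    installed_version: Optional[str], install_method: str
) -> str:
    """Map the cluster state to one of the four table rows.

    installed_version: dotted version string, or None if not installed.
    install_method: "express" or "specialized".
    """
    if installed_version is not None:
        current = tuple(int(part) for part in installed_version.split("."))
        # Pad short versions such as "3.18" so that tuples compare correctly.
        current += (0,) * (3 - len(current))
        if current >= MINIMUM_VERSION:
            return "already-installed"
        return "earlier-version-installed"
    if install_method == "express":
        return "installed-automatically-by-platform-operator"
    if install_method == "specialized":
        return "install-manually"
    raise ValueError(f"unknown installation method: {install_method!r}")


if __name__ == "__main__":
    print(classify_foundational_services(None, "specialized"))
```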
10. Creating operator subscriptions
An operator subscription tells the cluster where to install a given operator and gives information about the operator to Operator Lifecycle Manager (OLM).
- Complete the appropriate steps for your environment in Creating operator subscriptions.
- Go to 11. Do you plan to install services that require custom SCCs?
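
As a rough illustration of what an operator subscription contains, the following sketch generates a manifest for the OLM `Subscription` resource. The resource kind and field names come from the OLM API. Every value passed in the example (subscription name, package, channel, catalog source, and project) is a hypothetical placeholder, and the default `openshift-marketplace` source namespace is an assumption. Take the real values from Creating operator subscriptions.

```python
import json


def operator_subscription(
    name: str,
    package: str,
    channel: str,
    source: str,
    namespace: str,
    source_namespace: str = "openshift-marketplace",  # assumed namespace
    approval: str = "Automatic",
) -> dict:
    """Return a Subscription manifest.

    metadata.namespace is the project where the operator is installed;
    spec.source names the catalog source that provides the operator package.
    """
    return {
        "apiVersion": "operators.coreos.com/v1alpha1",
        "kind": "Subscription",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "channel": channel,
            "installPlanApproval": approval,
            "name": package,
            "source": source,
            "sourceNamespace": source_namespace,
        },
    }


if __name__ == "__main__":
    manifest = operator_subscription(
        name="example-operator-subscription",  # hypothetical
        package="example-operator",            # hypothetical
        channel="stable",                      # hypothetical
        source="example-catalog",              # hypothetical
        namespace="example-operators",         # hypothetical
    )
    print(json.dumps(manifest, indent=2))
```

The `metadata.namespace` field corresponds to "where to install a given operator", and the `spec` fields carry the information that OLM needs to locate the operator package.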
11. Do you plan to install services that require custom SCCs?
- Data Virtualization
- Db2®
- Db2 Big SQL
- Db2 Warehouse
- OpenPages®
- Watson™ Knowledge Catalog
Options | What to do |
---|---|
You plan to install one or more of these services |
|
You don't plan to install any of these services |
12. Do you plan to install services that require specific node settings?
- Data Virtualization
- Db2
- Db2 Big SQL
- Db2 Warehouse
- Jupyter Notebooks with Python 3.7 for GPU
- OpenPages
- Watson Discovery
- Watson Knowledge Catalog
- Watson Machine Learning Accelerator
- Watson Studio
You might also need to adjust some node settings if you are working with large data sets or you have slower network speeds.
Options | What to do |
---|---|
You plan to install one or more of these services, you have large data sets, or you have slower network speeds |
|
You don't plan to install any of these services |
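
If you already know which services you plan to install, the service lists in steps 11 and 12 can be applied with a simple lookup. The following sketch copies the service names from those lists. The function name and the keys of the returned dictionary are hypothetical.

```python
from typing import Dict, Iterable

# Services listed in step 11 (custom SCCs).
CUSTOM_SCC_SERVICES = frozenset({
    "Data Virtualization", "Db2", "Db2 Big SQL", "Db2 Warehouse",
    "OpenPages", "Watson Knowledge Catalog",
})

# Services listed in step 12 (specific node settings).
NODE_SETTING_SERVICES = frozenset({
    "Data Virtualization", "Db2", "Db2 Big SQL", "Db2 Warehouse",
    "Jupyter Notebooks with Python 3.7 for GPU", "OpenPages",
    "Watson Discovery", "Watson Knowledge Catalog",
    "Watson Machine Learning Accelerator", "Watson Studio",
})


def cluster_prep_needed(
    planned_services: Iterable[str],
    large_data_or_slow_network: bool = False,
) -> Dict[str, object]:
    """Report which planned services trigger steps 11 and 12."""
    planned = set(planned_services)
    needs_sccs = sorted(planned & CUSTOM_SCC_SERVICES)
    needs_node_settings = sorted(planned & NODE_SETTING_SERVICES)
    return {
        "custom_sccs_for": needs_sccs,
        "node_settings_for": needs_node_settings,
        # Step 12 also applies to large data sets or slower networks.
        "change_node_settings": bool(needs_node_settings)
        or large_data_or_slow_network,
    }


if __name__ == "__main__":
    print(cluster_prep_needed(["Db2", "Watson Studio"]))
```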
13. Do you need to install the scheduling service?
The scheduling service is required if you plan to install Watson Machine Learning Accelerator.
Even if you do not plan to install Watson Machine Learning Accelerator, it is strongly recommended that you install the scheduling service so that you can programmatically enforce the quotas that you set on the platform and on individual services.
Options | What to do |
---|---|
You need to install the scheduling service |
|
You don't plan to install the scheduling service |
14. Installing Cloud Pak for Data
15. Completing post-installation tasks
- Complete the appropriate tasks for your environment in Post-installation tasks.
- Go to 16. Installing services.
16. Installing services
- Instructions for installing IBM services are available in Services.