Preparing to install Data Virtualization

Plan and prepare to install the Data Virtualization service.

Data Virtualization pods

The Data Virtualization service instance has two types of pods:

Head pod

The head pod is c-db2u-dv-db2u-0.

The Data Virtualization head pod runs the Data Virtualization head component (also known as the engine).
Worker pods

Worker pods are c-db2u-dv-db2u-x where x is 1 or greater.

In Data Virtualization, the term worker pod refers to a pod that runs one Data Virtualization worker component. You can allocate multiple worker components to the Data Virtualization service instance; each worker component runs in its own c-db2u-dv-db2u-x pod.

You must not confuse Data Virtualization worker pods with compute nodes, which are the physical nodes that make up the Red Hat® OpenShift® cluster. For more information about cluster compute nodes, see Architecture for IBM Software Hub.
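As a sketch, the head pod and worker pods can be distinguished by the numeric suffix of the pod name. The classification logic below assumes the standard pod names that are shown above; the namespace placeholder in the comment is an assumption for your deployment.

```shell
# Sketch: classify a Data Virtualization pod name as head or worker by its
# numeric suffix. On a live cluster, you would first list the pods, for example:
#   oc get pods -n <project> | grep c-db2u-dv-db2u
# (<project> is the namespace of your service instance; left as a placeholder.)
dv_pod_role() {
  case "$1" in
    c-db2u-dv-db2u-0)       echo head ;;
    c-db2u-dv-db2u-[1-9]*)  echo worker ;;
    *)                      echo unknown ;;
  esac
}

dv_pod_role c-db2u-dv-db2u-0   # head
dv_pod_role c-db2u-dv-db2u-3   # worker
```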

System requirements

Ensure that you meet the service requirements that are listed in System requirements. The Data Virtualization service runs on x86_64 hardware only.

The Data Virtualization service is provisioned to any compute node in the IBM Software Hub cluster that has the specified resources (cores, memory, and ephemeral storage) available.

Additionally, the IBM Software Hub cluster must accommodate the initial provisioning request for Data Virtualization service pods.

Important: If you try to provision a Data Virtualization service instance without sufficient resources, the provisioning fails.

Scaling

You can scale the Data Virtualization service up and down at any time after you provision it. For more information, see Scaling Data Virtualization.

Work with IBM Sales to get a more accurate sizing that is based on your expected workload.

IBM Sales helps you estimate the total demand for Data Virtualization; the service then redistributes resources internally when you provision it.

Storage requirements

Ensure that you meet the storage requirements that are listed in Storage requirements.

At a minimum, the persistent storage must meet the following requirements for Data Virtualization:
  • The persistent volume for the Data Virtualization engine node must be at least 50 Gi.
  • The persistent volume for Data Virtualization caching must be at least 100 Gi.
  • The persistent volumes must be XFS formatted.
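One way to confirm the XFS requirement is to check the filesystem type of the mounted volume from inside the head pod. In the following sketch, the namespace and mount path in the comment are assumptions, and the df -T output is illustrative sample data; the parsing works the same way on real output.

```shell
# Sketch: verify that a persistent volume is XFS formatted. On a live cluster,
# you might run df -T inside the head pod (namespace and mount path are
# assumptions):
#   oc exec -n <project> c-db2u-dv-db2u-0 -- df -T <mount-path>
# The sample output below is illustrative.
df_out='Filesystem      Type 1K-blocks    Used Available Use% Mounted on
/dev/mapper/dv  xfs   52403200 1024000  51379200   2% /mnt/data'
fs_type=$(printf '%s\n' "$df_out" | awk 'NR==2 {print $2}')
if [ "$fs_type" = xfs ]; then
  echo "volume is XFS formatted"
else
  echo "unexpected filesystem type: $fs_type"
fi
```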

If you are using an NFS storage class, you must ensure that the NFS export is set to no_root_squash before you install. For more information, see Setting up NFS storage. To troubleshoot issues, see SQL6031N error in the db2nodes.cfg file in Data Virtualization.
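For reference, an NFS export with no_root_squash is declared in /etc/exports on the NFS server. The export path and client subnet in this fragment are illustrative assumptions; see Setting up NFS storage for the details of your environment.

```
/export/dv-storage  192.0.2.0/24(rw,sync,no_root_squash)
```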

External libraries

External libraries (that is, libraries that are not included in the Data Virtualization service) are stored on a persistent volume. Data Virtualization automatically creates persistent volume claims during the provisioning process.

This storage is the same as the persistent volume for the Data Virtualization head pod.

The persistent volume claim for external libraries must have at least 50 GB available.

Cache storage
A data cache holds temporary data that is used frequently. By using a data cache, you can reduce processing and loading time when you use this data.
Audit log storage
The Db2 audit facility (db2audit) is enabled by default when you install Data Virtualization. You set the file storage class parameters for storing the audit logs.

Kernel parameter settings

To ensure that Data Virtualization can run correctly, you must verify the kernel parameters. For more information, see Changing kernel parameter settings.
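A minimal sketch of such a verification follows. The parameter and threshold shown (vm.max_map_count, 262144) are illustrative assumptions; use the values that are documented in Changing kernel parameter settings.

```shell
# Sketch: check that a kernel parameter meets a required minimum. The example
# values are assumptions; take the real minimums from the linked topic.
meets_min() {   # usage: meets_min <current-value> <minimum-value>
  if [ "$1" -ge "$2" ]; then echo ok; else echo "below minimum"; fi
}

# On a cluster node, you would compare the live value, for example:
#   meets_min "$(sysctl -n vm.max_map_count)" 262144
meets_min 262144 262144   # ok
meets_min 65530 262144    # below minimum
```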

CRI-O container settings

Complete the steps in Changing CRI-O container settings to set the pids_limit parameter.

Ensure that the pids_limit parameter is equal to or greater than the minimum value that is a prerequisite for IBM Software Hub.
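The check above can be sketched as follows. The configuration snippet and the 16384 threshold are illustrative assumptions; compare against the minimum value that IBM Software Hub actually requires, and edit the setting by following Changing CRI-O container settings.

```shell
# Sketch: extract pids_limit from a CRI-O configuration snippet and compare it
# to a minimum value. Both the snippet and the threshold are illustrative.
crio_conf='[crio.runtime]
pids_limit = 16384'
current=$(printf '%s\n' "$crio_conf" | awk -F' *= *' '/^pids_limit/ {print $2}')
if [ "$current" -ge 16384 ]; then
  echo "pids_limit OK ($current)"
else
  echo "pids_limit too low ($current)"
fi
```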