Provisioning the service (Data Virtualization)
Before you can use Data Virtualization, you must provision an instance of the service in IBM Cloud Pak for Data.
Before you begin
About this task
The Data Virtualization service is provisioned to any compute node in the Cloud Pak for Data cluster that has the specified resources (cores and memory) available.
To provision the Data Virtualization service:
- From the main menu, click the .
- From the list of instances, locate the Data Virtualization service, click the action menu, and select Provision instance.
- If you manually set the kernel semaphore parameter, select the "You must check this box if you updated the kernel semaphore parameter" checkbox. If the Linux® kernel version on the cluster nodes is earlier than 4.6, you must update the kernel semaphore parameter. For details, see Preparing to install the service.
If you manually update the kernel semaphore parameter and do not select the corresponding checkbox, provisioning of the Data Virtualization service fails.
To configure the service, specify the resources that you want to allocate to the Data Virtualization worker nodes in the
You can scale the Data Virtualization service up and down at any time after you provision it. For details, see Scaling services.
- Specify the number of Data Virtualization worker nodes to allocate to the service. Recommended: One worker node is sufficient for many workloads.
To understand the difference between compute nodes and worker nodes, see Preparing to install the service.
- Specify the number of cores to allocate per node. You are constrained by the total number of available cores on the OpenShift® compute nodes.
- Specify the amount of memory in GB to allocate per node. You are constrained by the total amount of memory on the OpenShift compute nodes.
- In the Storage step, specify the storage classes and persistent volume sizes that you want to use for the service nodes and caching storage.
If you use Portworx for your storage class, select portworx-dv-shared-gp3 for the Storage class option. For more information, see Storage considerations.
- In the Head storage section, select the storage class and specify the amount of storage to allocate to the head node.
In Data Virtualization, a head node corresponds to a dv-engine pod that runs on your Red Hat® OpenShift cluster.
- In the Worker storage section, select the storage class and specify the amount of storage to allocate to your worker nodes.
The term worker node in Data Virtualization refers to the worker service component that runs on each dv-worker pod. You can allocate multiple worker nodes, which are effectively multiple dv-worker pods, to the Data Virtualization service instance.
- In the Caching storage section, select the storage class and specify the amount of storage to allocate to your data caches.
Note: Part of the total cache storage space is used for refreshing active caches that have a periodic refresh schedule. This refresh schedule affects the storage space that is available for creating new cache entries.
- Click Next.
- Ensure that the summary is correct and click Configure. Wait for the service to be provisioned.
- Optional: If you want to use Cloud Pak for Data while you wait for the Data Virtualization provisioning process to complete, click Home.
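Two of the checks in the steps above (the kernel version threshold of 4.6 and the core and memory limits on the compute nodes) can be sketched as a small pre-flight script. This is only an illustrative sketch, not part of the product: the function names kernel_older_than_4_6, requested_totals, and fits_cluster are hypothetical, and the sizing check assumes a simple comparison of the summed worker request against cluster-wide free totals. It does not model the head node or the requirement that each pod fit on a single compute node.

```shell
#!/bin/sh
# Hypothetical pre-flight helpers; the names are illustrative only.

# Returns 0 (true) when the given kernel release is older than 4.6.
kernel_older_than_4_6() (
  release="$1"               # e.g. "3.10.0-1160.el7.x86_64"
  major="${release%%.*}"     # text before the first dot
  rest="${release#*.}"       # text after the first dot
  minor="${rest%%[!0-9]*}"   # leading digits of the remainder
  [ "$major" -lt 4 ] || { [ "$major" -eq 4 ] && [ "$minor" -lt 6 ]; }
)

# Prints "<total cores> <total memory in GB>" requested across all workers.
# Arguments: worker count, cores per node, memory in GB per node.
requested_totals() (
  workers="$1"; cores_per_node="$2"; mem_gb_per_node="$3"
  echo "$((workers * cores_per_node)) $((workers * mem_gb_per_node))"
)

# Returns 0 when the summed request fits the available cores and memory.
# Simplifying assumption: cluster-wide totals, no per-node placement check.
fits_cluster() (
  workers="$1"; cores_per_node="$2"; mem_gb_per_node="$3"
  avail_cores="$4"; avail_mem_gb="$5"
  set -- $(requested_totals "$workers" "$cores_per_node" "$mem_gb_per_node")
  [ "$1" -le "$avail_cores" ] && [ "$2" -le "$avail_mem_gb" ]
)

# uname -r reports the kernel of the machine where the script runs,
# so this check would need to be run on each cluster node.
if kernel_older_than_4_6 "$(uname -r)"; then
  echo "Kernel is older than 4.6: update the kernel semaphore parameter."
fi

# Example with hypothetical values: 2 workers with 4 cores and 16 GB each,
# checked against 16 free cores and 64 GB of free memory.
if fits_cluster 2 4 16 16 64; then
  echo "Request fits"
fi
```

Because the service can be scaled after provisioning, a conservative first request (for example, one worker node) is a reasonable starting point for a check like this.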
What to do next
Checking for available patches
Determine whether there are any patches available for the version of Data Virtualization that you installed:
- Clusters connected to the internet
- Run the following command to check for available patches:
  ./cpd-cli status \
    --repo ./repo.yaml \
    --namespace Project \
    --assembly dv \
    --patches \
    --available-updates
- Air-gapped clusters
- See the list of Available patches for Data Virtualization.
If you need to apply patches to the service, follow the guidance in Applying patches.
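For connected clusters, the patch check can be wrapped so that the project name is supplied as an argument. The sketch below assumes that Project in the command above is a placeholder for the project (namespace) where the service is installed. The function name patch_check_command is hypothetical, it uses only the flags that appear in that command, and it prints the command rather than running it.

```shell
#!/bin/sh
# Hypothetical helper: prints the patch-check command for a given project.
# Arguments: project (namespace), optional repository file (default ./repo.yaml).
# Only the flags shown in the documented command are used; nothing is executed.
patch_check_command() (
  project="$1"
  repo_file="${2:-./repo.yaml}"
  printf './cpd-cli status --repo %s --namespace %s --assembly dv --patches --available-updates\n' \
    "$repo_file" "$project"
)

# Example with a hypothetical project name.
patch_check_command my-project
```

Printing the command first makes it easy to review the project name and repository file before you run it.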
- When you provision the Data Virtualization service, you are automatically assigned the Data Virtualization Admin role. After you provision the service, you must give other users access to the service. For more information, see Managing users in Data Virtualization.
- To connect to the Data Virtualization service, use the JDBC URL that is provided in the Connection details page for the service. Additionally, if you have a load balancer, you must open the port in your load balancer and your firewall. For more information, see Network requirements for Data Virtualization.