Release Notes
Abstract
Db2 Warehouse 11.5.9 BETA
Content
The Db2 Warehouse 11.5.9 Beta release supports the use of DATALAKE tables, providing users with access to data that is stored in open data formats like PARQUET and ORC.
For more information, see https://www.ibm.com/docs/en/db2woc?topic=using-datalake-tables
The Db2 Warehouse 11.5.9 Beta release also supports Native Cloud Object Storage, allowing users to store traditional Db2 column-organized tables in object storage, in Db2's native format, while maintaining the existing SQL support and performance through a tiered storage architecture.
For more information, see https://www.ibm.com/docs/en/db2woc?topic=native-cloud-object-storage-support
How to install Db2U 11.5.9 Standalone Beta
Before you Begin
- Install the Db2 Operator using the IBM Cloud Pak CLI tool.
- Install the ibm-pak tool.
Procedure:
WARNING:
Installing the 11.5.9 beta on a cluster that already has Db2U Standalone installed overwrites the existing CatalogSource in the openshift-marketplace namespace. Both operators should function correctly as long as they are installed in different namespaces. The catalog itself must be installed in openshift-marketplace.
WARNING:
It's recommended to install the 11.5.9 operator in a separate namespace from any existing db2u-operator install.
Set up the environment variables used in the following steps, including the namespace into which the operator will be installed (replace the placeholder with your namespace):
export CASE_VERSION="5.4.0+20230815.111832.10188.6946"
export CASE_CACHE_IMAGE_DIGEST="sha256:d1ff1a4ac6d1a135ad2378f9116fba97034865ac60e4f035602dbea576da07fe"
export NS="<db2u-operator-namespace>"
Use ibm-pak to pull the CASE locally:
oc ibm-pak get docker://icr.io/db2u/ibm-db2uoperator-case-cache@${CASE_CACHE_IMAGE_DIGEST}
This creates a directory under ibm-pak's home directory, typically:
${HOME}/.ibm-pak/data/cases/<CASE_NAME>/<CASE_VERSION>
For example:
/root/.ibm-pak/data/cases/ibm-db2uoperator/5.4.0+20230815.111832.10188.6946
Change to this newly created directory; you should see that the CASE bundle and images CSV have been downloaded:
# cd ${HOME}/.ibm-pak/data/cases/ibm-db2uoperator/${CASE_VERSION}
# ls -latr
total 332
-rw-r--r-- 1 root root 78 Aug 24 07:40 caseDependencyMapping.csv
-rw-r--r-- 1 root root 32 Aug 24 07:40 ibm-db2uoperator-5.4.0+20230815.111832.10188.6946-charts.csv
-rw-r--r-- 1 root root 3896 Aug 24 07:40 ibm-db2uoperator-5.4.0+20230815.111832.10188.6946-airgap-metadata.yaml
drwxr-xr-x 2 root root 6 Aug 24 07:40 charts
drwxr-xr-x 2 root root 50 Aug 24 07:40 resourceIndexes
-rw-r--r-- 1 root root 310896 Aug 24 07:40 ibm-db2uoperator-5.4.0+20230815.111832.10188.6946.tgz
-rw-r--r-- 1 root root 4439 Aug 24 07:40 ibm-db2uoperator-5.4.0+20230815.111832.10188.6946-images.csv
drwx------ 4 root root 4096 Aug 24 07:40 .
drwxr-xr-x 4 root root 90 Aug 24 07:40 ..
-rw-r--r-- 1 root root 276 Aug 24 07:40 component-set-config.yaml
Note: If you are installing the Db2 Operator in an air-gapped environment, copy the contents of the ${HOME}/.ibm-pak/data/cases/ibm-db2uoperator/${CASE_VERSION} directory to your offline directory before mirroring the images.
Now we can install the catalog and operator.
1. Install the Db2 catalog:
oc ibm-pak launch \
--case ibm-db2uoperator-${CASE_VERSION}.tgz \
--namespace openshift-marketplace \
--inventory db2uOperatorSetup \
--action installCatalog \
--tolerance=1
2. Install the Db2 operator:
oc ibm-pak launch \
--case ibm-db2uoperator-${CASE_VERSION}.tgz \
--namespace ${NS} \
--inventory db2uOperatorWhOnlySetup \
--action installOperator \
--tolerance=1
Deploying Db2 Warehouse with the Db2uInstance Custom Resource
The Db2uInstance Custom Resource (CR) provides the interface required to deploy Db2 Warehouse and enable support for DATALAKE tables and Native Cloud Object Storage. For more information, see https://www.ibm.com/docs/en/db2-warehouse?topic=resource-deploying-db2-warehouse-using-db2uinstance-custom
Enabling and disabling support for DATALAKE tables
The addOns.opendataformats.enabled Db2uInstance CR variable is used to enable or disable DATALAKE table support. The default value is true, as shown in the example below. Setting this value to false disables DATALAKE table support.
apiVersion: db2u.databases.ibm.com/v1
kind: Db2uInstance
metadata:
name: db2wh-mpp
spec:
version: s11.5.9.0
nodes: 2
addOns:
opendataformats:
enabled: true
Configuring memory distribution for DATALAKE tables
Memory distribution for DATALAKE tables can be set through the workloadProfile parameter. Two options are available:
- The default option is intended for workloads where most query access is to native Db2 tables.
- The balanced option gives more memory to DATALAKE tables and less to Db2 tables. This option handles workloads with larger data sets and higher concurrency requirements on DATALAKE access.
The following example shows the syntax for setting a balanced memory distribution:
spec:
version: s11.5.9.0
nodes: 2
addOns:
opendataformats:
enabled: true
workloadProfile: balanced
Enabling support for Native Cloud Object Storage
The following example shows how to enable support for Native Cloud Object Storage through the Db2uInstance CR. In this example, there are three new elements:
- The advanced option advOpts.enableCos: "true" is the toggle that enables the functionality.
- The advanced option advOpts.cosProvider: "<cos-provider>" indicates the target Cloud Object Storage service.
- A new storage entry named cachingtier defines the Local Storage for the Caching Tier, which is a mandatory component for enabling this support.
The requirements for the configuration of both the COS Provider and the Caching Tier storage are discussed in more detail in the following sections:
apiVersion: db2u.databases.ibm.com/v1
kind: Db2uInstance
metadata:
name: db2wh-mpp
spec:
version: s11.5.9.0
...
advOpts:
enableCos: "true"
cosProvider: "aws"
...
storage:
...
- name: cachingtier
spec:
accessModes:
- ReadWriteOnce
resources:
requests:
storage: 700Gi
storageClassName: local-device
type: template
...
Configuring Instance memory and CPU consumption
The following example shows the syntax for setting the CPU and memory limits:
spec:
podTemplate:
db2u:
resource:
db2u:
limits:
cpu: 8
memory: 24Gi
Configuring default database settings
Do not override the following default database settings:
- dftTableOrg: "COLUMN"
- dftPageSize: "32768"
- encrypt: "YES"
- codeset: "UTF-8"
- territory: "US"
- collation: "IDENTITY"
Note: Oracle compatibility features are not supported for DATALAKE tables.
These default values are applied even when they are not specified in the YAML file. Use these defaults for database configuration parameters and registry variables.
The following example shows the syntax for setting the following values:
- The database name.
- Database settings that should not be altered.
- The database configuration.
- LDAP authentication enablement.
spec:
environment:
dbType: db2wh
databases:
- name: BLUDB
settings:
dftTableOrg: "COLUMN"
dftPageSize: "32768"
dbConfig:
LOGPRIMARY: "20"
LOGSECOND: "30"
authentication:
ldap:
enabled: true
Setting up database partitions
The following example provides a 2-node Db2 Warehouse MPP instance with a total of 4 database partitions (multiple logical nodes).
spec:
version: s11.5.9.0
nodes: 2
environment:
partitionConfig:
total: 4
Note: spec.environment.partitionConfig.volumePerPartition is set to 'true' by default if dbType is 'db2wh'.
Example of a complete Db2uInstance CR
The CR below creates a Db2 Warehouse instance with the following configuration:
- Database name: BLUDB.
- 14 CPUs.
- 56 Gi of memory.
- 2 nodes with 4 multiple logical nodes.
- Open Data Formats enabled.
- Native COS enabled.
- AWS S3 as the object storage provider.
- Default memory configuration.
- LDAP enabled.
- 7 storage volumes (meta, archivelogs, data, cachingtier, tempts, etcd, and blulocal).
apiVersion: db2u.databases.ibm.com/v1
kind: Db2uInstance
metadata:
name: db2wh-mpp
spec:
version: s11.5.9.0
nodes: 2
addOns:
opendataformats:
enabled: true
advOpts:
enableCos: "true"
cosProvider: "aws"
podTemplate:
db2u:
resource:
db2u:
limits:
cpu: 14
memory: 56Gi
environment:
dbType: db2wh
databases:
- name: BLUDB
partitionConfig:
total: 4
volumePerPartition: true
authentication:
ldap:
enabled: true
license:
accept: true
storage:
- name: meta
spec:
accessModes:
- ReadWriteMany
resources:
requests:
storage: 10Gi
storageClassName: ocs-storagecluster-cephfs
type: create
- name: archivelogs
type: create
spec:
accessModes:
- ReadWriteMany
resources:
requests:
storage: 25Gi
storageClassName: ocs-storagecluster-cephfs
- name: data
type: template
spec:
accessModes:
- ReadWriteOnce
resources:
requests:
storage: 300Gi
storageClassName: ocs-storagecluster-ceph-rbd
- name: cachingtier
spec:
accessModes:
- ReadWriteOnce
resources:
requests:
storage: 100Gi
storageClassName: ocs-storagecluster-ceph-rbd
type: template
- name: tempts
spec:
accessModes:
- ReadWriteOnce
resources:
requests:
storage: 50Gi
storageClassName: ocs-storagecluster-ceph-rbd
type: template
- name: etcd
type: template
spec:
storageClassName: ocs-storagecluster-ceph-rbd
accessModes:
- ReadWriteOnce
resources:
requests:
storage: 10Gi
- name: blulocal
spec:
accessModes:
- ReadWriteOnce
resources:
requests:
storage: 70Gi
storageClassName: ocs-storagecluster-ceph-rbd
type: template
Connecting DATALAKE tables to your Db2 Warehouse instance
Always use the db2u-head-engn-svc service, which points to the catalog node; do not use db2u-engn-svc, which can route connections to any engine pod.
In the example below, node port 30920 can be used to connect to the Db2 instance from outside the cluster. To connect from within the cluster, use the service name and its internal port instead: c-entrepot-db2u-head-engn-svc:50000.
# oc get svc
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
c-entrepot-db2u ClusterIP 172.30.90.235 <none> 50000/TCP,50001/TCP,25000/TCP,25001/TCP,25002/TCP,25003/TCP,25004/TCP,25005/TCP 2d15h
c-entrepot-db2u-engn-svc NodePort 172.30.68.205 <none> 50000:30956/TCP,50001:30288/TCP 2d15h
c-entrepot-db2u-head-engn-svc NodePort 172.30.254.111 <none> 50000:30920/TCP,50001:30604/TCP 2d15h
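As an illustrative sketch of an external connection (the worker-node host is a placeholder and whhead is an arbitrary node name; node port 30920 is taken from the output above), a Db2 client could catalog and connect as follows:

```
# Catalog the head service's node port on any reachable worker node (host is a placeholder)
db2 catalog tcpip node whhead remote <worker-node-ip> server 30920
# Catalog the database behind that node and connect as the instance user
db2 catalog database BLUDB at node whhead
db2 connect to BLUDB user db2inst1
```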
Supported Storage Options For Database Storage
- Storage provider: OpenShift® Data Foundation 4.10 - 4.12. For more information, see the OpenShift documentation.
  Deployment requirements: split storage (system/backup data and user data).
- Storage provider: NFS V3.
Refer to the Native Cloud Object Storage section for the requirements for configuring Local Storage for the Caching Tier.
Supported Kubernetes providers: Red Hat OpenShift Container Platform.
DATALAKE Tables Support
For information related to using DATALAKE tables, refer to the following document:
https://www.ibm.com/docs/en/db2woc?topic=using-datalake-tables
Supported Object Stores
The following object storage services that are compatible with the Amazon S3 API are supported:
- Ceph (note that Ceph is not supported with SSL enabled for DATALAKE tables)
- AWS S3
- IBM Cloud Object Storage
Supported DATALAKE Table Types
File Format                  | DATALAKE Table | DATALAKE Iceberg Table
Parquet                      | Yes            | Yes
Optimized Row Columnar (ORC) | Yes            | Yes
Text file                    | Yes            | Not supported by Iceberg
Setting up storage alias for DATALAKE table
Before any user can create or access a DATALAKE table, a storage access alias must be created by a database administrator to establish connectivity to the remote storage where the table resides.
About this task
When the storage access alias is created, the remote storage account credentials are safely stored in an encrypted keystore.
In addition to storing the credentials for storage connectivity, the alias can optionally include a location in the bucket where the table will be created.
Usage of the storage alias to create a table is subject to access control and can be granted to a single user, a user group, or a database role.
Before you begin
You must have database administration authority (DBADM) to create a storage alias.
A bucket must be provisioned on one of the supported Object Storage services before the storage alias is created.
The following information must be available to create the storage alias:
- The name of the bucket
- The API endpoint used to access the bucket
- The credentials to access the bucket; the credentials must have both read and write access to allow table creation
Procedure:
Create a storage alias with the following parameters to access the DATALAKE tables:
alias-name
Specifies the new storage alias name.
vendor
Specifies the type of the remote storage. For accessing DATALAKE tables, only 'S3' is supported.
server
Specifies the remote storage endpoint to access the bucket.
user
Specifies the S3 access key of the remote storage account.
password
Specifies the S3 secret key of the remote storage account.
container
Specifies the name of the S3 bucket where the tables will be created or accessed.
object
[Optional] Specifies a parent directory in the bucket for the DATALAKE table. If a path is specified for the object parameter, the DATALAKE table will have to be located in a subdirectory of that path.
grantee-type
Specifies the type of grantee. Use 'U' for a user ID, 'G' for a group name, and 'R' for a database role name.
grantee
The user, group, or role granted usage access of the storage alias. Only users with this access will be allowed to use the storage alias to create a DATALAKE table.
CALL SYSIBMADM.STORAGE_ACCESS_ALIAS.CATALOG('alias-name', 'vendor', 'server', 'user', 'password', 'container', 'object', 'grantee-type', 'grantee')
NOTE: Since Ceph is not supported with SSL enabled, the following command needs to be executed before creating a DATALAKE table when using the Ceph object store:
CALL SYSHADOOP.SET_DATALAKE_CONFIG('CORE', 'fs.s3a.bucket.<bucket_name>.connection.ssl.enabled', 'false')
Creating a DATALAKE table with storage alias
Once a storage alias has been created, a user who has been granted access to the storage alias can use it to create a DATALAKE table. When creating a DATALAKE table, a LOCATION clause must be specified. The syntax of the LOCATION clause is as follows:
DB2REMOTE://alias-name//object/path
alias-name
Must be the name of an alias to which the user that is creating the table has access. The alias contains the information required to access the container where the table data is to reside.
object
Must be the object specified when creating the storage alias. It represents a path on the container. If no object was specified when creating the storage alias, this part can remain empty.
path
An optional additional path. Appended to the object path, it represents the location of the table data in the container.
For more information, see the CREATE DATALAKE TABLE statement.
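As an illustration of the LOCATION clause, assuming a storage alias named myalias that was cataloged with object path sales (the table and column names below are hypothetical), a CREATE DATALAKE TABLE statement might look like this:

```sql
-- Hypothetical table; 'myalias' and the 'sales' object path are assumptions
CREATE DATALAKE TABLE orders (
  order_id INT,
  order_date DATE,
  amount DOUBLE
)
STORED AS PARQUET
LOCATION 'DB2REMOTE://myalias//sales/orders';
```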
For information on Security considerations for DATALAKE tables, see https://www.ibm.com/docs/en/db2woc?topic=tables-remote-storage-connectivity#db2woc_dl_storage_conn__title__5
Accessing watsonx.data on Cloud
For information on accessing watsonx.data, see Accessing watsonx.data
Configuring Db2 for watsonx.data integration
To configure integration between Db2 and watsonx.data, see https://www.ibm.com/docs/en/db2woc?topic=watsonxdata-configuring-db2-integration
Importing tables from watsonx.data
After registering a watsonx.data metastore in Db2, it is possible to import Iceberg tables from watsonx.data into Db2. The table is then accessible from both systems.
To import a table created in a watsonx.data metastore into Db2, you can run the following statement:
CALL EXTERNAL_CATALOG_SYNC('metastore-name', 'schema-name', 'table-name',
'exist-action', 'error-action', NULL)
For example, to import all tables in schema icebergdemo, use:
CALL EXTERNAL_CATALOG_SYNC('watsonxdata', 'icebergdemo', '.*', 'REPLACE', 'STOP', '')
For more information, see Importing tables from watsonx.data
Note: SELECT or INSERT might fail after calling EXTERNAL_CATALOG_SYNC. You might receive the following error:
SQL5105N The statement failed because a Big SQL component encountered an
error. Component receiving the error: "BigSQL IO". Component returning the
error: "UNKNOWN". Log entry identifier: "[BSL-1-1f4690d0b]". Reason: "Table
does not exist: icebergl". SQLSTATE=58040
To resolve this error, restart bigsql. The following is an example of restarting bigsql from the db2u head pod.
$ wvcli system disable
Wolverine HA management state was disabled successfully.
$ bigsql stop
Stopping Standalone Metastore : OK
Stopping Big SQL : OK
Stopping Big SQL Scheduler : OK
$ bigsql start
Global config update : OK
Starting Big SQL Scheduler : OK
Starting Big SQL : OK
Starting Standalone Metastore : OK
$ wvcli system enable
Wolverine HA management state was enabled successfully.
Check the bigsql status and verify that the metastore is in the listening state before connecting to the database and running queries.
$ bigsql status
SERVICE HOSTNAME NODE PID STATUS
Big SQL Worker c-entrepot-db2u-1.c-entrepot-db2u-internal 2 106073 DB2 Running
Big SQL Worker c-entrepot-db2u-1.c-entrepot-db2u-internal 3 106072 DB2 Running
Big SQL Master c-entrepot-db2u-0.c-entrepot-db2u-internal 0 2441832 DB2 Running
Big SQL Worker c-entrepot-db2u-0.c-entrepot-db2u-internal 1 2441834 DB2 Running
Standalone Metastore c-entrepot-db2u-0.c-entrepot-db2u-internal - 2442926 Available (listening)
Big SQL Scheduler c-entrepot-db2u-0.c-entrepot-db2u-internal - 2432240 Available
Creating Iceberg tables in watsonx.data
For information on creating Iceberg tables in watsonx.data, see Creating Iceberg tables in watsonx.data
Disconnecting Db2 from watsonx.data
It is possible to disconnect Db2 from watsonx.data by unregistering the watsonx.data metastore from Db2 and deleting the access storage alias or aliases used by Db2 to connect to the watsonx.data object storage container.
Unregistering the watsonx.data metastore
To unregister the watsonx.data metastore in Db2, run the following command:
CALL UNREGISTER_EXT_METASTORE('metastore-name', ?, ?)
with the following parameters:
metastore-name
The name under which the metastore was registered.
After the metastore has been unregistered, the watsonx.data metastore is not accessible and the following actions fail:
- Running the EXTERNAL_CATALOG_SYNC() stored procedure to import tables or refresh their schema.
- Issuing a SELECT or INSERT statement from Db2 on an Iceberg table created in watsonx.data.
Important: Due to the nature of non-Iceberg DATALAKE tables, SELECT and INSERT statements against them continue to succeed after the metastore is unregistered. However, these statements fail once access to the underlying storage is revoked by deleting the access storage alias used to connect to the watsonx.data object storage container.
Deleting the access storage alias
To completely disconnect watsonx.data from Db2, you must also delete the access storage alias used to connect to the watsonx.data object storage container.
To delete the storage access alias, see Deleting the storage alias.
Native Cloud Object Storage support
IBM Db2 Warehouse 11.5.9 Beta introduces native object storage support to Db2, allowing users to store traditional Db2 column-organized tables in object storage, in Db2's native format, while maintaining the existing SQL support and performance through a tiered storage architecture.
Storage Architecture with Native Cloud Object Support
Support for native cloud object storage introduces a new multi-tier storage architecture for table spaces that allows the native storage of data pages for database objects (for example, column-organized tables) in cloud object storage.
For more information on table spaces, see Table Spaces.
The diagram below shows the multiple components that are part of the storage architecture of a typical IBM® Db2® Warehouse deployment model, in this example with 2 compute nodes and 4 database partitions on each: the first compute node runs database partitions 0, 1, 2, and 3, and the second compute node runs database partitions 4, 5, 6, and 7. The diagram also shows the three levels of storage that make up the architecture supporting cloud object storage:
- Reliable, durable file storage associated with each of the Db2 database partitions. This is an existing element in the IBM Db2 Warehouse architecture that is maintained in the current generation to enable a simple path toward adoption of the object storage medium for existing databases.
- A new layer of local storage on fast SSD/NVMe drives, associated only with the compute node they are attached to and used as a local cache for the objects in the object store. The diagram shows multiple local disks, one per database partition, to illustrate that each database partition manages its own portion of the local cache independently of the others.
- An object storage layer, in which objects are associated with each of the database partitions.
For more information, see Storage Architecture with Native Cloud Object Support
User experience for Native Cloud Object Storage
The Native Cloud Object Storage support provides a simple path for adoption through its direct integration within the existing storage hierarchy of Db2, allowing existing applications to leverage the low-cost object storage for Db2 tables without changes.
To create a table stored in object storage, users follow the existing process of creating a table in Db2. Tables are created within a storage hierarchy: the table is created in a table space, and the table space is associated with a storage group. The main difference introduced by this feature is in the definition of the storage group. It is now associated with a storage access alias instead of a set of storage paths, which designates it as a storage group that is persisted to object storage, or a 'remote' storage group.
Subsequently, any table space associated with that remote storage group will be stored within the storage defined by the storage access alias associated with the storage group. These table spaces are referred to as remote table spaces. Similarly, any database object such as tables, indexes, MQTs, etc., associated with that remote table space will also be stored in the same object storage target and will be able to leverage the performance advantages of the multi-tier cache.
User Experience in IBM® Db2® Warehouse 11.5.9 Beta
IBM® Db2® Warehouse 11.5.9 Beta allows the user to enable this functionality at deployment time. A few steps are required to complete this configuration:
- Set up the two prerequisites: the cloud object storage bucket and the local storage for the caching tier.
- Deploy the instance with Native Cloud Object Storage support enabled through the Db2uInstance Custom Resource (CR).
- Configure the instance and database with the remote storage access alias, a remote storage group, and two remote table spaces: one regular table space OBJSTORESPACE1 and one user temporary table space OBJSTORESPACEUTMP1.
After these steps, the remote table spaces backed by Cloud Object Storage can be used to create both column-organized tables and column-organized Declared Global Temporary Tables in object storage.
In the following sections we describe each of these steps.
Pre-Requisites for Enabling Cloud Object Storage Support
There are two pre-requisites for configuring Native Cloud Object Storage support:
- A cloud object storage bucket
- Local storage for the caching tier
Setting up a Cloud Object Storage Bucket
IBM® Db2® Warehouse 11.5.9 Beta requires access to a Cloud Object Storage service that supports the AWS S3 data access model in order to configure Native Cloud Object Storage support. Supported services include Red Hat Ceph Storage, AWS S3, IBM Cloud Object Storage, and MinIO.
The setup of a cloud object storage bucket is specific to each cloud object storage provider. Once it is set up, you will need your object storage connection information to complete the configuration of Native Cloud Object Storage support:
- The endpoint to the Cloud Object Storage service must be a host name or IP address.
- A Cloud Object Storage bucket must be created; its name must be all lowercase, and the bucket is expected to be empty.
- IBM® Db2® Warehouse 11.5.9 Beta supports either explicit credentials for accessing the Cloud Object Storage service or, in the case of AWS S3 on Amazon Elastic Kubernetes Service (AWS EKS), an IAM role for implicit role-based authentication.
- Note that IBM® Db2® Warehouse 11.5.9 Beta does not allow the configuration of a custom certificate. If your Cloud Object Storage is configured to require a custom certificate, it is not compatible with this release.
- IBM® Db2® Warehouse 11.5.9 Beta does not require that your Cloud Object Storage be configured for high availability.
Setting up Local Storage for the Caching Tier
The Local Storage for the Caching Tier is usually configured with a set of fast NVMe drives locally attached to each of the nodes used during the deployment of IBM® Db2® Warehouse 11.5.9 Beta. These locally attached drives are expected to be associated only with the compute node they are attached to, in order to get the highest performance from them for fast persistence and data caching.
Local Drive Performance Recommendations
The Local Storage for the Caching Tier is a critical component to the performance of the Native Cloud Object Storage support in IBM® Db2® Warehouse 11.5.9 Beta. For this reason, we recommend you follow these guidelines in order to achieve the best performance.
To test if your local drive meets the performance requirements, run the following command:
dd if=/dev/zero of=/path/on/local/disk/testfile bs=1G count=10 oflag=direct,dsync
Note: You must validate the local drive performance on all hosts.
While the previous command is running, run the following command in another terminal on the same system to monitor disk performance:
iostat -xyz 1
From the iostat output, you should observe high disk utilization, close to 100% (which is shown by the %util column in iostat). Your write throughput (shown by wkB/s) should be at least 600 MB/s, and your write latency (shown by w_await) should be under 20 ms.
The following is an example from a disk that meets these performance requirements:
Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s avgrq-sz avgqu-sz await r_await w_await svctm %util
sdc 0.00 0.00 0.00 12555.00 0.00 803520.00 128.00 141.84 11.30 0.00 11.30 0.08 100.00
In this example, disk utilization is at 100%, write throughput is 803 MB/s, and write latency is 11.3 ms.
For comparison, the following is an example from a disk that does not meet the performance requirements:
Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s avgrq-sz avgqu-sz await r_await w_await svctm %util
sdc 0.00 20.00 0.00 2786.00 0.00 680715.00 488.67 150.09 53.85 0.00 53.85 0.36 100.00
In this example we see 100% disk utilization, a write throughput of 680 MB/s and a write latency of 53.85 ms. For this disk, the write latency is considered too high.
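The guidelines above can be applied mechanically. The following is a minimal sketch (not part of Db2) that checks the write-throughput and write-latency thresholds against an iostat -x device line in the classic sysstat column layout shown above:

```shell
# meets_requirements: takes one iostat -x device line, prints PASS or FAIL.
# Column positions follow the classic sysstat layout above:
# field 7 = wkB/s (write throughput), field 12 = w_await (write latency, ms).
meets_requirements() {
  echo "$1" | awk '{
    wkbs = $7; wawait = $12
    # Guidelines: >= 600 MB/s write throughput and < 20 ms write latency
    if (wkbs / 1000 >= 600 && wawait < 20) print "PASS"; else print "FAIL"
  }'
}

# The two device lines from the examples above
fast="sdc 0.00 0.00 0.00 12555.00 0.00 803520.00 128.00 141.84 11.30 0.00 11.30 0.08 100.00"
slow="sdc 0.00 20.00 0.00 2786.00 0.00 680715.00 488.67 150.09 53.85 0.00 53.85 0.36 100.00"

meets_requirements "$fast"   # PASS  (803 MB/s, 11.3 ms)
meets_requirements "$slow"   # FAIL  (write latency 53.85 ms is too high)
```

Run this on each host against the device line captured while the dd test is active.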
Configuring a Local Storage Class for the Caching Tier in OCP
Prerequisites
OCP worker nodes must have local block devices (disks, SSDs, or NVMe drives) attached before performing the following setup.
1. Add the privileged SCC of the OpenShift cluster to the openebs service account:
oc adm policy add-scc-to-user privileged system:serviceaccount:openebs:openebs-maya-operator
2. Deploy the openebs lite operator using the following:
oc apply -f https://openebs.github.io/charts/openebs-operator-lite.yaml
oc apply -f https://openebs.github.io/charts/openebs-lite-sc.yaml
3. Deploy the Storage Class using the following spec. BlockDeviceSelectors needs to target the NVMe node group:
cat << EOF | kubectl apply -f -
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
name: local-device
annotations:
openebs.io/cas-type: local
cas.openebs.io/config: |
- name: StorageType
value: device
- name: FSType
value: ext4
- name: BlockDeviceSelectors
data:
ndm.io/blockdevice-type: "blockdevice"
provisioner: openebs.io/local
reclaimPolicy: Delete
volumeBindingMode: WaitForFirstConsumer
EOF
Enabling support for Native Cloud Object Storage
The Native Cloud Object Storage support is enabled first through the Db2uInstance Custom Resource (CR). The following example shows the changes that must be made in the Db2uInstance CR in order to enable this functionality for the instance:
- aws: For the Cloud Object Storage providers AWS S3 and IBM Cloud Object Storage, cosProvider is set to "aws". This is the default value and can be omitted.
- self-hosted: For other Cloud Object Storage providers, such as Ceph Object Storage using RADOS, ODF MCG, and MinIO (either configured with SSL but without a CA-signed certificate, or not configured to use SSL), the cosProvider field must be set to "self-hosted".
spec:
version: s11.5.9.0
nodes: 2
addOns:
opendataformats:
enabled: true
advOpts:
enableCos: "true"
cosProvider: "self-hosted"
storage:
- name: cachingtier
spec:
accessModes:
- ReadWriteMany
resources:
requests:
storage: 100Gi
storageClassName: local-device
type: create
Completing the Set Up of Native Cloud Object Storage Support
IBM® Db2® Warehouse 11.5.9 Beta provides a single step for configuring all of the instance and database objects required to complete the configuration of Native Cloud Object Storage support. A script is provided that sets up the storage access alias IBMDEFAULTALIAS, creates a remote storage group IBMDEFAULTREMSG1 associated with this storage access alias, and creates two remote table spaces associated with it: one regular table space OBJSTORESPACE1 and one user temporary table space OBJSTORESPACEUTMP1.
To complete the setup of Native Cloud Object Storage support, use the db2-cos-objects.sh script located in the /db2u/scripts/ directory within the db2u head pod. Run this script after opening a terminal in the db2u head pod as user db2inst1.
Please make sure to review the pre-requisites for setting up a Cloud Object Storage bucket before proceeding with this step.
The following example shows the execution of the db2-cos-objects.sh script providing it with the fully qualified Cloud Object Storage service endpoint (including the prefix http/https), the bucket name (all lower case), and corresponding credentials (user name and password).
oc exec -it c-db2wh-mpp-db2u-0 -- /bin/bash
su - db2inst1
cd /db2u/scripts
./db2-cos-objects.sh run-all --server <> --username <> --password <> --bucket <>
The following are the parameters required for the configuration when using explicit authentication:
--server string Object Store server to use. Include http/https prefix.
--username string Username to use for authentication to Object Store server.
--password string Password to use for authentication to Object Store server.
--bucket string Name of bucket in the Object Store server.
-h, --help Display the help page.
Note that implicit authentication is only supported when configuring access to AWS S3 within AWS EKS when an IAM role is configured for role based authentication. In that case, server, username and password are not required.
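Under that assumption (AWS S3 on AWS EKS with an IAM role configured), the invocation reduces to the bucket name alone (the bucket name below is hypothetical):

```
# Implicit IAM-role authentication: no --server/--username/--password needed
./db2-cos-objects.sh run-all --bucket db2whbucket
```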
db2-cos-objects.sh can optionally be used to set up the Db2 Cloud Object Storage objects with custom names (storage access alias, storage group, and table spaces), to override the folder within the cloud object storage bucket, and to reset the Cloud Object Storage service vendor, using the following parameters:
--alias string Alias to use for cataloging storage access (default: "IBMDEFAULTALIAS").
--folder string Folder to use for Object Storage (default: "db2u").
--vendor string Vendor to use for Object Storage (default: "S3").
--storage-group string Name to use for storage group in Db2 (default: "IBMDEFAULTREMSG1").
--tablespace string Name to use for tablespace in Db2 (default: "OBJSTORESPACE1").
--user-tmp-tablespace string Name to use for user temporary tablespace in Db2 (default: "OBJSTORESPACEUTMP1").
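For example, an invocation that overrides the default object names might look like this (the server, credentials, bucket, and all custom names below are hypothetical):

```
./db2-cos-objects.sh run-all \
  --server https://cos.example.com:9000 --username admin --password adminpass \
  --bucket db2whbucket \
  --alias MYCOSALIAS --storage-group MYREMSG1 \
  --tablespace MYOBJSPACE1 --user-tmp-tablespace MYOBJSPACEUTMP1
```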
Here is a sample output if the script is executed successfully:
./db2-cos-objects.sh run-all --server https://9.46.67.81:9000 --username admin --password adminpass --bucket db2whbucket
Sleeping for 15 seconds for the reg var DB2_COS_TABLESPACES_ALLOWED_OPERATIONS=ALL to take effect...
Cataloging storage access with the following command:
db2 CATALOG STORAGE ACCESS ALIAS IBMDEFAULTALIAS VENDOR S3 SERVER https://9.46.67.81:9000 USER admin PASSWORD <omitted> CONTAINER db2whbucket OBJECT db2u DBUSER db2inst1
DB20000I The CATALOG STORAGE ACCESS command completed successfully.
Creating storage group with the following command:
db2 "CREATE STOGROUP IBMDEFAULTREMSG1 ON 'DB2REMOTE://IBMDEFAULTALIAS/'"
DB20000I The SQL command completed successfully.
Creating tablespace with the following command:
db2 CREATE TABLESPACE OBJSTORESPACE1 USING STOGROUP IBMDEFAULTREMSG1
DB20000I The SQL command completed successfully.
Creating temporary tablespace with the following command:
db2 CREATE USER TEMPORARY TABLESPACE OBJSTORESPACEUTMP1 USING STOGROUP IBMDEFAULTREMSG1
DB20000I The SQL command completed successfully.
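To confirm the objects were created, you can query the Db2 catalog from the same db2inst1 session. This is a sketch; the object names assume the script defaults shown above:

```shell
# Verify the storage group and table spaces created by the script.
# Names assume the defaults (IBMDEFAULTREMSG1, OBJSTORESPACE*).
db2 "SELECT SGNAME FROM SYSCAT.STOGROUPS"
db2 "SELECT TBSPACE, SGNAME FROM SYSCAT.TABLESPACES WHERE TBSPACE LIKE 'OBJSTORE%'"
```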
Deploying Native COS on AWS EKS
Prerequisites
Set up an AWS EKS cluster by following this tutorial up to step 6 of the section Deploy a database instance on your Amazon EKS cluster:
Note: In the section Choosing an Amazon EC2 instance type, choose an instance type from the following document to get nodes with NVMe drives, which optimize performance for Native COS:
https://aws.amazon.com/blogs/aws/new-amazon-ec2-r6id-instances/
After the prerequisite setup is completed, follow the steps listed below.
Configuring NVME local device setup for AWS
1. Deploy the OpenEBS lite operator using the following commands:
kubectl apply -f https://openebs.github.io/charts/openebs-operator-lite.yaml
kubectl apply -f https://openebs.github.io/charts/openebs-lite-sc.yaml
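Before moving on, it can help to confirm that the OpenEBS components are running and have discovered the NVMe drives. This is a sketch; the openebs namespace is the default used by these manifests:

```shell
# Confirm the OpenEBS node-disk-manager and provisioner pods are Running.
kubectl get pods -n openebs
# List the block devices discovered on each node; NVMe drives should appear.
kubectl get blockdevices -n openebs
```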
2. Deploy the Storage Class using the following spec. The BlockDeviceSelectors entry must select the NVMe node group:
cat << EOF | kubectl apply -f -
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: local-device
  annotations:
    openebs.io/cas-type: local
    cas.openebs.io/config: |
      - name: StorageType
        value: device
      - name: FSType
        value: ext4
      - name: BlockDeviceSelectors
        data:
          eks.amazonaws.com/nodegroup: "${NODE_GROUP}"
provisioner: openebs.io/local
reclaimPolicy: Delete
volumeBindingMode: WaitForFirstConsumer
EOF
Note: Replace ${NODE_GROUP} with the nodegroup name that was selected when the NVMe node groups were created as part of the prerequisite tutorial.
AWS IAM Implicit Authentication setup for AWS EKS
Native Cloud Object Storage supports both implicit and explicit authentication. To configure implicit authentication using AWS IAM for AWS EKS, follow these steps:
1. Create the S3 bucket using the following CLI command, replacing BUCKET_NAME and AWS_REGION accordingly.
aws s3api create-bucket --bucket ${BUCKET_NAME} --region ${AWS_REGION} --create-bucket-configuration LocationConstraint=${AWS_REGION}
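You can verify that the bucket was created and is reachable. This is a sketch using the same variable:

```shell
# head-bucket returns exit code 0 if the bucket exists and is accessible.
aws s3api head-bucket --bucket ${BUCKET_NAME}
```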
2. Create a policy file locally named bucket-policy.json with the following content. This allows all users in the account to access the bucket and perform the listed actions.
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "s3:ListBucket",
        "s3:GetObject",
        "s3:PutObject",
        "s3:GetObjectAcl",
        "s3:PutObjectAcl",
        "s3:DeleteObject",
        "s3:DeleteObjectVersion"
      ],
      "Resource": [
        "arn:aws:s3:::${BUCKET_NAME}",
        "arn:aws:s3:::${BUCKET_NAME}/*"
      ]
    }
  ]
}
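The ${BUCKET_NAME} placeholder must be resolved before the policy is created. One way to do this, assuming the gettext envsubst utility is available, is sketched below; the bucket name and the minimal policy fragment written here are illustrative only (in practice bucket-policy.json already exists from the step above):

```shell
# Write a minimal policy fragment containing the placeholder (illustration
# only; in practice the full bucket-policy.json already exists).
cat > bucket-policy.json << 'EOF'
{ "Resource": [ "arn:aws:s3:::${BUCKET_NAME}", "arn:aws:s3:::${BUCKET_NAME}/*" ] }
EOF
# Resolve the placeholder. "db2whbucket" is an example bucket name.
export BUCKET_NAME=db2whbucket
envsubst '${BUCKET_NAME}' < bucket-policy.json > bucket-policy.resolved.json
# The resolved file now carries concrete ARNs and can be passed to
# `aws iam create-policy --policy-document file://bucket-policy.resolved.json`.
grep 'arn:aws:s3:::db2whbucket' bucket-policy.resolved.json
```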
Run the following CLI command to create the policy:
aws iam create-policy --policy-name Db2uS3AccessPolicy --policy-document file://bucket-policy.json
3. Create a trust-policy.json file. In the following content, replace YOUR_ACCOUNT_ID, OIDC_PROVIDER_ID, NAMESPACE_OF_DB2U_DEPLOYMENT, and SA_OF_DB2U_DEPLOYMENT.
The SA_OF_DB2U_DEPLOYMENT value has the form:
- account-<NAMESPACE_OF_DB2U_DEPLOYMENT>-<DEPLOYMENT_NAME_FOR_DB2U>
- DEPLOYMENT_NAME_FOR_DB2U comes from the name selected in the next section for deploying db2uinstance.yaml, for example db2wh-mpp.
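The OIDC_PROVIDER_ID can be read from the cluster's OIDC issuer URL. This is a sketch; the cluster name and region are hypothetical and should be replaced with your own:

```shell
# Print the OIDC issuer URL; the trailing path segment is OIDC_PROVIDER_ID.
aws eks describe-cluster --name my-eks-cluster --region us-east-2 \
  --query "cluster.identity.oidc.issuer" --output text
```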
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "Federated": "arn:aws:iam::${YOUR_ACCOUNT_ID}:oidc-provider/oidc.eks.us-east-2.amazonaws.com/id/${OIDC_PROVIDER_ID}"
      },
      "Action": "sts:AssumeRoleWithWebIdentity",
      "Condition": {
        "StringEquals": {
          "oidc.eks.us-east-2.amazonaws.com/id/${OIDC_PROVIDER_ID}:sub": "system:serviceaccount:${NAMESPACE_OF_DB2U_DEPLOYMENT}:${SA_OF_DB2U_DEPLOYMENT}",
          "oidc.eks.us-east-2.amazonaws.com/id/${OIDC_PROVIDER_ID}:aud": "sts.amazonaws.com"
        }
      }
    }
  ]
}
Create the role using the trust policy:
aws iam create-role --role-name Db2uS3AccessRole --assume-role-policy-document file://trust-policy.json
4. Attach the policy to the role
aws iam attach-role-policy --policy-arn arn:aws:iam::${YOUR_ACCOUNT_ID}:policy/Db2uS3AccessPolicy --role-name Db2uS3AccessRole
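To confirm that the role exists and the policy is attached, the following sketch can be used:

```shell
# List the policies attached to the role; Db2uS3AccessPolicy should appear.
aws iam list-attached-role-policies --role-name Db2uS3AccessRole
```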
Deploying db2uinstance.yaml using AWS IAM Implicit Authentication in AWS EKS
The following is an example of the db2uinstance.yaml to create a Db2UInstance CR with Native Cloud Object Storage Enabled and using a storage class for caching tier over locally-attached NVMe drives:
apiVersion: db2u.databases.ibm.com/v1
kind: Db2uInstance
metadata:
  name: db2wh-mpp
spec:
  version: s11.5.9.0
  nodes: 2
  addOns:
    opendataformats:
      enabled: true
  advOpts:
    enableCos: "true"
    cosProvider: "aws"
  podTemplate:
    db2u:
      resource:
        db2u:
          limits:
            cpu: 14
            memory: 56Gi
  environment:
    dbType: db2wh
    databases:
      - name: BLUDB
        partitionConfig:
          total: 4
          volumePerPartition: true
    authentication:
      ldap:
        enabled: true
  license:
    accept: true
  storage:
    - name: meta
      type: create
      spec:
        accessModes:
          - ReadWriteMany
        resources:
          requests:
            storage: 100Gi
        storageClassName: efs-test-sc
    - name: archivelogs
      type: create
      spec:
        accessModes:
          - ReadWriteMany
        resources:
          requests:
            storage: 25Gi
        storageClassName: efs-test-sc
    - name: data
      type: template
      spec:
        accessModes:
          - ReadWriteOnce
        resources:
          requests:
            storage: 300Gi
        storageClassName: ebs-sc
    - name: cachingtier
      type: template
      spec:
        accessModes:
          - ReadWriteOnce
        resources:
          requests:
            storage: 700Gi
        storageClassName: local-device
    - name: tempts
      type: template
      spec:
        accessModes:
          - ReadWriteOnce
        resources:
          requests:
            storage: 50Gi
        storageClassName: ebs-sc
    - name: etcd
      type: template
      spec:
        storageClassName: ebs-sc
        accessModes:
          - ReadWriteOnce
        resources:
          requests:
            storage: 10Gi
    - name: blulocal
      type: template
      spec:
        accessModes:
          - ReadWriteOnce
        resources:
          requests:
            storage: 70Gi
        storageClassName: ebs-sc
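Assuming the spec above was saved as db2uinstance.yaml, it can be applied and the pods watched with the following sketch (the namespace variable is the one chosen for the Db2U deployment):

```shell
# Create the Db2uInstance CR and watch the db2u pods come up.
kubectl apply -f db2uinstance.yaml -n ${NAMESPACE_OF_DB2U_DEPLOYMENT}
kubectl get pods -n ${NAMESPACE_OF_DB2U_DEPLOYMENT} -w
```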
Note: Immediately after applying the YAML above to create the Db2UInstance CR, and before the db2u pods come up, edit the Service Account of the db2u deployment to add the annotation required for implicit authentication.
Use this command to edit the Service Account for the Db2U deployment:
kubectl edit sa ${SA_OF_DB2U_DEPLOYMENT}
and add the following annotation for implicit authentication to work:
annotations:
  eks.amazonaws.com/role-arn: arn:aws:iam::${YOUR_ACCOUNT_ID}:role/Db2uS3AccessRole
This ensures that the AWS EKS admission controller injects the implicit-authentication environment variables into any pods that use this Service Account.
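Instead of editing the Service Account interactively, the annotation can also be added in one step. This is a sketch using the same variables as above:

```shell
# Add the IAM role annotation to the db2u Service Account non-interactively.
kubectl annotate serviceaccount ${SA_OF_DB2U_DEPLOYMENT} \
  -n ${NAMESPACE_OF_DB2U_DEPLOYMENT} \
  eks.amazonaws.com/role-arn=arn:aws:iam::${YOUR_ACCOUNT_ID}:role/Db2uS3AccessRole
```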
Completing the setup when using AWS IAM Implicit Authentication on AWS EKS
As before, run the db2-cos-objects.sh script. For an AWS EKS Db2U deployment using AWS IAM Implicit Authentication, the only required parameter is the bucket name:
kubectl exec -it c-db2wh-mpp-db2u-0 -- /bin/bash
su - db2inst1
cd /db2u/scripts
./db2-cos-objects.sh run-all --bucket ${BUCKET_NAME}
Troubleshooting the setup of Native COS table spaces
- Invalid inputs, such as an incorrect bucket name, cause the db2-cos-objects.sh script to fail with the following error:
Sleeping for 15 seconds for the reg var DB2_COS_TABLESPACES_ALLOWED_OPERATIONS=ALL to take effect...
Cataloging storage access with the following command:
db2 CATALOG STORAGE ACCESS ALIAS whocwp8bucket VENDOR S3 SERVER https://9.30.272.59:9000 USER minioadddmin PASSWORD Nine8ddXML456$ CONTAINER whoceep8bucket OBJECT db2u DBUSER db2inst1
DB20000I The CATALOG STORAGE ACCESS command completed successfully.
List command failed with return code, rc = 0x870F01B6
BUCKET whoceep8bucket is not empty, will not create the tablespace costbsp1 and the whocwp8bucket will be dropped...
Uncataloging storage access with the following command:
db2 uncatalog storage access alias whocwp8bucket
DB20000I The UNCATALOG STORAGE ACCESS command completed successfully.
In this case, the script returns an error before creating the storage group and terminates after dropping the storage access alias. Rerunning the script with the correct parameters should succeed.
- If the object store endpoint is not accessible for any reason, the db2-cos-objects.sh script fails. If the script fails after creating the storage group with the following error, rerun the script with the run-all sub command replaced by remove-all:
db2 CREATE TABLESPACE objstoretbsp USING STOGROUP stogroup
DB21034E The command was processed as an SQL statement because it was not a
valid Command Line Processor command. During SQL processing it returned:
SQL0293N Error accessing a table space container. SQLSTATE=57048
ERROR: Failed to create tablespace
The following example shows how to run the script with the remove-all sub command. Running this command cleans up the environment properly by removing the objects created with the provided alias and storage-group parameter values.
./db2-cos-objects.sh remove-all --bucket <> --username <> --password <> --server <> --alias <> --storage-group <> --tablespace <> --folder <>
The following is sample output when the script is executed successfully with the remove-all sub command:
Sleeping for 15 seconds for the reg var DB2_COS_TABLESPACES_ALLOWED_OPERATIONS=ALL to take effect...
Dropping temporary tablespace with the following command:
db2 "drop tablespace OBJSTORESPACEUTMP1"
DB20000I The SQL command completed successfully.
Dropping tablespace with the following command:
db2 "drop tablespace costbsp1"
DB20000I The SQL command completed successfully.
Dropping storage group with the following command:
db2 "drop stogroup stogroup1"
DB20000I The SQL command completed successfully.
Uncataloging storage access with the following command:
db2 "uncatalog storage access alias entrepotbucket2"
DB20000I The UNCATALOG STORAGE ACCESS command completed successfully.
After the environment is cleaned up and the object store endpoint is accessible again, rerun the script with the run-all sub command:
./db2-cos-objects.sh run-all --bucket <> --username <> --password <> --server <> --alias <> --storage-group <> --tablespace <> --folder <>
Limitations
- Fresh installs only. No update or upgrade.
- Always use the db2u-head-engn-svc service, which points to the catalog node; do not use db2u-engn-svc, which can redirect to any engine pod.
- No backup/restore.
- The default Db2 Warehouse configuration should not be altered.
- Native COS supports one object store table space and one user temporary table space.
- The object store bucket used for Native COS must be empty and cannot be shared with datalake tables.
- Supported operations for datalake tables include SELECT, CTAS, and INSERT.
- If a user creates a Db2 table as the db2inst1 user without specifying a table space, the table is created in the default table space. The USERSPACE1 table space spans all nodes. It is recommended to explicitly use USERSPACE1, or another table space that spans all nodes, when creating Db2 tables.
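For example, a table can be placed explicitly in USERSPACE1 from a db2inst1 session. This is a sketch; the table and column names are hypothetical:

```shell
# Create a table explicitly in USERSPACE1 so it spans all nodes.
# Table and column names are illustrative.
db2 "CREATE TABLE SALES (ID INT, AMOUNT DECIMAL(10,2)) IN USERSPACE1"
```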
For more information on restrictions and limitations for datalake tables, see https://www.ibm.com/docs/en/db2woc?topic=tables-restrictions-limitations
For more information on restrictions and limitations for Native COS table spaces, see https://www.ibm.com/docs/en/db2woc?topic=support-restrictions-limitations
Document Information
Modified date:
31 August 2023
UID
ibm17028563