IBM Support

Db2 Warehouse 11.5.9 BETA

Release Notes


Abstract

Db2 Warehouse 11.5.9 BETA

Content

The Db2 Warehouse 11.5.9 Beta release supports the use of DATALAKE tables, providing users with access to data that is stored in open data formats like PARQUET and ORC.

For more information, see https://www.ibm.com/docs/en/db2woc?topic=using-datalake-tables

The Db2 Warehouse 11.5.9 Beta release also supports Native Cloud Object Storage, allowing users to store traditional Db2 column-organized tables in object storage in Db2's native format, while maintaining existing SQL support and performance through a tiered storage architecture.

For more information, see https://www.ibm.com/docs/en/db2woc?topic=native-cloud-object-storage-support

How to install Db2U 11.5.9 Standalone Beta

Before you Begin

  • Install the Db2 Operator using the IBM Cloud Pak CLI tool.

  • Install the ibm-pak tool.

Procedure:

WARNING:

Installing the 11.5.9 beta on a cluster that already has Db2U Standalone installed overwrites the existing CatalogSource in the openshift-marketplace namespace; the catalog must be installed in openshift-marketplace. Both operators should continue to function as long as they are installed in different namespaces.

WARNING:
It is recommended that you install the 11.5.9 operator in a separate namespace from any existing db2u-operator installation.

Set up the environment variables that are used in the following steps:

export CASE_VERSION="5.4.0+20230815.111832.10188.6946"
export CASE_CACHE_IMAGE_DIGEST="sha256:d1ff1a4ac6d1a135ad2378f9116fba97034865ac60e4f035602dbea576da07fe"

Use ibm-pak to pull the CASE locally:

oc ibm-pak get docker://icr.io/db2u/ibm-db2uoperator-case-cache@${CASE_CACHE_IMAGE_DIGEST}

This creates a directory under ibm-pak's home directory, typically:

${HOME}/.ibm-pak/data/cases/<CASE_NAME>/<CASE_VERSION>

For example:

/root/.ibm-pak/data/cases/ibm-db2uoperator/5.4.0+20230815.111832.10188.6946

Go to this newly created directory; you should see that the CASE bundle and images CSV have been downloaded:

# cd ${HOME}/.ibm-pak/data/cases/ibm-db2uoperator/${CASE_VERSION}
# ls -latr
total 332
-rw-r--r-- 1 root root     78 Aug 24 07:40 caseDependencyMapping.csv
-rw-r--r-- 1 root root     32 Aug 24 07:40 ibm-db2uoperator-5.4.0+20230815.111832.10188.6946-charts.csv
-rw-r--r-- 1 root root   3896 Aug 24 07:40 ibm-db2uoperator-5.4.0+20230815.111832.10188.6946-airgap-metadata.yaml
drwxr-xr-x 2 root root      6 Aug 24 07:40 charts
drwxr-xr-x 2 root root     50 Aug 24 07:40 resourceIndexes
-rw-r--r-- 1 root root 310896 Aug 24 07:40 ibm-db2uoperator-5.4.0+20230815.111832.10188.6946.tgz
-rw-r--r-- 1 root root   4439 Aug 24 07:40 ibm-db2uoperator-5.4.0+20230815.111832.10188.6946-images.csv
drwx------ 4 root root   4096 Aug 24 07:40 .
drwxr-xr-x 4 root root     90 Aug 24 07:40 ..
-rw-r--r-- 1 root root    276 Aug 24 07:40 component-set-config.yaml

Note: If you are installing the Db2 Operator in an air-gapped environment, copy the contents of the ${HOME}/.ibm-pak/data/cases/ibm-db2uoperator/${CASE_VERSION} directory to your offline directory before mirroring the images.

Now we can install the catalog and operator.

1. Install the Db2 catalog:

oc ibm-pak launch \
    --case ibm-db2uoperator-${CASE_VERSION}.tgz \
    --namespace openshift-marketplace \
    --inventory db2uOperatorSetup \
    --action installCatalog \
    --tolerance=1

2. Install the Db2 operator, where ${NS} is the namespace in which you want to install the operator:

oc ibm-pak launch \
     --case ibm-db2uoperator-${CASE_VERSION}.tgz \
     --namespace ${NS} \
     --inventory db2uOperatorWhOnlySetup \
     --action installOperator \
     --tolerance=1

Deploying Db2 Warehouse with the Db2uInstance Custom Resource

The Db2uInstance Custom Resource (CR) provides the interface required to deploy Db2 Warehouse and enable support for DATALAKE tables and Native Cloud Object Storage. For more information, see https://www.ibm.com/docs/en/db2-warehouse?topic=resource-deploying-db2-warehouse-using-db2uinstance-custom

Enabling and disabling support for DATALAKE tables

The addOns.opendataformats.enabled Db2uInstance CR variable is used to enable or disable DATALAKE table support. The default value is true, as shown in the example below. Setting this value to false disables DATALAKE table support.

apiVersion: db2u.databases.ibm.com/v1
kind: Db2uInstance
metadata:
  name: db2wh-mpp
spec:
  version: s11.5.9.0
  nodes: 2
  addOns:
    opendataformats:
      enabled: true
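Conversely, setting the flag to false disables DATALAKE table support:

```yaml
spec:
  version: s11.5.9.0
  nodes: 2
  addOns:
    opendataformats:
      enabled: false
```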

Configuring memory distribution for DATALAKE tables

Memory distribution for DATALAKE tables can be set through the workloadProfile parameter. Two options are available:

  • The default option is intended for a workload where most query access is to native Db2 tables.

  • The balanced option gives more memory to DATALAKE tables and less memory to Db2 tables. This option handles workloads with larger data sets and concurrency requirements on DATALAKE access.

The following example shows the syntax for setting a balanced memory distribution:

spec:
  version: s11.5.9.0
  nodes: 2
  addOns:
    opendataformats:
      enabled: true
      workloadProfile: balanced

Enabling support for Native Cloud Object Storage

The following example shows how to enable support for Native Cloud Object Storage through the Db2uInstance CR. The example includes three new elements:

  1. The advanced option advOpts.enableCos: true is the toggle that enables the functionality.

  2. The advanced option advOpts.cosProvider: "<cos-provider>" indicates the target Cloud Object Storage service.

  3. A new storage section, cachingtier, defines the local storage for the caching tier, which is a mandatory component for enabling this support.

The requirements for the configuration of both the COS Provider and the Caching Tier storage are discussed in more detail in the following sections:

apiVersion: db2u.databases.ibm.com/v1
kind: Db2uInstance
metadata:
  name: db2wh-mpp
spec:
  version: s11.5.9.0
...
  advOpts:
    enableCos: "true"
    cosProvider: "aws"
...
  storage:
...
    - name: cachingtier
      spec:
        accessModes:
          - ReadWriteOnce
        resources:
          requests:
            storage: 700Gi
        storageClassName: local-device
      type: template
...

Configuring Instance memory and CPU consumption

The following example shows the syntax for setting the CPU and memory limits:

spec:
  podTemplate:
    db2u:
      resource:
        db2u:
          limits:
            cpu: 8
            memory: 24Gi

Configuring default database settings

Do not override the following default database settings:

  • dftTableOrg: "COLUMN"

  • dftPageSize: "32768"

  • encrypt: "YES"

  • codeset: "UTF-8"

  • territory: "US"

  • collation: "IDENTITY"  

Note: Oracle compatibility features are not supported for DATALAKE tables.

These default values are set even when they are not specified in the YAML file; if no values are specified in your YAML, the defaults are used. Use the default values for database configuration parameters and registry variables.

The following example shows the syntax for setting the following values:

  • The database name.

  • Any database settings that should not be altered.

  • The database configuration.

  • Enabling LDAP authentication.

spec:
  environment:
    dbType: db2wh
    databases:
      - name: BLUDB
        settings:
          dftTableOrg: "COLUMN"
          dftPageSize: "32768"
        dbConfig:
          LOGPRIMARY: "20"
          LOGSECOND: "30"
    authentication:
      ldap:
        enabled: true

Setting up database partitions

This example creates a Db2 Warehouse MPP instance with 2 nodes and a total of 4 database partitions (multiple logical nodes).

spec:
  version: s11.5.9.0
  nodes: 2
  environment:
    partitionConfig:
      total: 4

Note: spec.environment.partitionConfig.volumePerPartition is set to ‘true’ by default if dbType is ‘db2wh’.
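If needed, the field can be set explicitly in the CR. The fragment below spells out the default (values match the example above):

```yaml
spec:
  version: s11.5.9.0
  nodes: 2
  environment:
    dbType: db2wh
    partitionConfig:
      total: 4
      volumePerPartition: true   # already the default when dbType is db2wh
```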

Example of a complete Db2uInstance CR

The CR below creates a Db2 Warehouse instance with the following configuration:

  • Database name: BLUDB.

  • 14 CPUs.

  • 56 Gi of memory.

  • 2 nodes and a total of 4 database partitions (multiple logical nodes).

  • Open Data Formats enabled

  • Native COS enabled

  • AWS S3 as object storage provider

  • Default memory configuration

  • LDAP enabled

  • 7 storage volumes (meta, archivelogs, data, cachingtier, tempts, etcd, and blulocal).

apiVersion: db2u.databases.ibm.com/v1
kind: Db2uInstance
metadata:
  name: db2wh-mpp
spec:
  version: s11.5.9.0
  nodes: 2
  addOns:
    opendataformats:
      enabled: true
  advOpts:
    enableCos: "true"
    cosProvider: "aws"
  podTemplate:
    db2u:
      resource:
        db2u:
          limits:
            cpu: 14
            memory: 56Gi
  environment:
    dbType: db2wh
    databases:
      - name: BLUDB
    partitionConfig:
      total: 4
      volumePerPartition: true
    authentication:
      ldap:
        enabled: true
  license:
    accept: true
  storage:
    - name: meta
      spec:
        accessModes:
          - ReadWriteMany
        resources:
          requests:
            storage: 10Gi
        storageClassName: ocs-storagecluster-cephfs
      type: create
    - name: archivelogs
      type: create
      spec:
        accessModes:
          - ReadWriteMany
        resources:
          requests:
            storage: 25Gi
        storageClassName: ocs-storagecluster-cephfs
    - name: data
      type: template
      spec:
        accessModes:
          - ReadWriteOnce
        resources:
          requests:
            storage: 300Gi
        storageClassName: ocs-storagecluster-ceph-rbd
    - name: cachingtier
      spec:
        accessModes:
          - ReadWriteOnce
        resources:
          requests:
            storage: 100Gi
        storageClassName: ocs-storagecluster-ceph-rbd
      type: template
    - name: tempts
      spec:
        accessModes:
          - ReadWriteOnce
        resources:
          requests:
            storage: 50Gi
        storageClassName: ocs-storagecluster-ceph-rbd
      type: template
    - name: etcd
      type: template
      spec:
        storageClassName: ocs-storagecluster-ceph-rbd
        accessModes:
          - ReadWriteOnce
        resources:
          requests:
            storage: 10Gi
    - name: blulocal 
      spec:
        accessModes:
          - ReadWriteOnce
        resources:
          requests:
            storage: 70Gi
        storageClassName: ocs-storagecluster-ceph-rbd
      type: template

Connecting DATALAKE tables to your Db2 Warehouse instance

Use the db2u-head-engn-svc service, which always points to the catalog node; do not use db2u-engn-svc, which can redirect to any engine pod.

In the example below, port 30920 can be used to connect to the Db2 instance from outside the cluster. To connect from within the cluster, use the service name and its internal port instead: c-entrepot-db2u-head-engn-svc:50000.

# oc get svc
NAME                            TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)                                                                           AGE
c-entrepot-db2u                 ClusterIP   172.30.90.235    <none>        50000/TCP,50001/TCP,25000/TCP,25001/TCP,25002/TCP,25003/TCP,25004/TCP,25005/TCP   2d15h
c-entrepot-db2u-engn-svc        NodePort    172.30.68.205    <none>        50000:30956/TCP,50001:30288/TCP                                                   2d15h
c-entrepot-db2u-head-engn-svc   NodePort    172.30.254.111   <none>        50000:30920/TCP,50001:30604/TCP                                                   2d15h
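For scripting, the external node port can be extracted from the head service definition. The following sketch parses the sample line above with a here-string; against a live cluster you would pipe the output of `oc get svc c-entrepot-db2u-head-engn-svc --no-headers` instead:

```shell
# Extract the NodePort mapped to engine port 50000 on the head service.
# The variable reproduces the sample output line shown above.
svc_line='c-entrepot-db2u-head-engn-svc   NodePort    172.30.254.111   <none>        50000:30920/TCP,50001:30604/TCP   2d15h'
node_port=$(echo "$svc_line" | grep -o '50000:[0-9]*' | cut -d: -f2)
echo "$node_port"   # prints 30920
```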

Supported Storage Options For Database Storage

The following storage providers are supported, with their deployment requirements:

OpenShift® Data Foundation 4.10 - 4.12 (for more information, see the OpenShift documentation). Requires split storage:

System/backup data:

  • ocs-storagecluster-cephfs storage class

  • ReadWriteMany (RWX)

User data:

  • ocs-storagecluster-ceph-rbd storage class

  • 4K sector size

  • ReadWriteOnce (RWO)

NFS V3: see the NFS storage requirements for Db2.

Refer to the Native Cloud Object Storage section for the requirements for configuring Local Storage for the Caching Tier.

Kubernetes providers

Red Hat OpenShift Container Platform

DATALAKE Tables Support

Refer to the following document for information about using DATALAKE tables:
https://www.ibm.com/docs/en/db2woc?topic=using-datalake-tables

Supported Object Stores

The following Amazon S3 API-compatible object storage services are supported:

  • Ceph: Note that Ceph is not supported with SSL enabled for DATALAKE tables.

  • AWS S3

  • IBM Cloud Object Storage

Supported DATALAKE Table Types

The following file formats are supported:

  • Parquet: supported for both DATALAKE tables and DATALAKE Iceberg tables.

  • Optimized Row Columnar (ORC): supported for both DATALAKE tables and DATALAKE Iceberg tables.

  • Text file: supported for DATALAKE tables; not supported by Iceberg tables.

Setting up a storage alias for DATALAKE tables

Before any user can create or access a DATALAKE table, a storage access alias must be created by a database administrator to establish connectivity to the remote storage where the table resides.

About this task

When the storage access alias is created, the remote storage account credentials are safely stored in an encrypted keystore.

In addition to storing the credentials for storage connectivity, the alias can optionally include a location in the bucket where the table will be created.

The usage of the storage alias to create a table is subject to access control and can be granted to a single user, a user group or a database role.

Before you begin

You must have database administration authority (DBADM) to create a storage alias.

A bucket must be provisioned on one of the supported Object Storage services before the storage alias is created.

The following information must be available to create the storage alias:

  • The name of the bucket

  • The API endpoint used to access the bucket

  • The credentials to access the bucket. The credentials must have read AND write access to allow table creation

Procedure:

Create a storage alias with the following parameters to access the DATALAKE tables:

alias-name

Specifies the new storage alias name.

vendor

Specifies the type of the remote storage. For accessing DATALAKE tables, only ‘S3’ is supported.

server

Specifies the remote storage endpoint to access the bucket.

user

Specifies the S3 access key of the remote storage account.

password

Specifies the S3 secret key of the remote storage account.

container

Specifies the name of the S3 bucket where the tables will be created or accessed.

object

[Optional] Specifies a parent directory in the bucket for the DATALAKE table. If a path is specified for the object parameter, the DATALAKE table will have to be located in a subdirectory of that path.

grantee-type

Specifies the type of grantee. Use ‘U’ for a user ID, ‘G’ for a group name and ‘R’ for a database role name.

grantee

The user, group, or role granted usage access of the storage alias. Only users with this access will be allowed to use the storage alias to create a DATALAKE table.

CALL SYSIBMADM.STORAGE_ACCESS_ALIAS.CATALOG('alias-name', 'vendor', 'server', 'user', 'password', 'container', 'object', 'grantee-type', 'grantee')
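For illustration, the following call uses the documented parameters with hypothetical values (the alias name, endpoint, keys, bucket, path, and user below are placeholders, not real endpoints or credentials):

```sql
-- Hypothetical values throughout: creates alias MYALIAS for bucket
-- 'mydatalakebucket' and grants its use to user DBUSER1.
CALL SYSIBMADM.STORAGE_ACCESS_ALIAS.CATALOG(
  'MYALIAS', 'S3', 's3.us-east-1.amazonaws.com',
  'myAccessKey', 'mySecretKey',
  'mydatalakebucket', 'sales/2023', 'U', 'DBUSER1')
```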

NOTE: Since Ceph is not supported with SSL enabled, the following command needs to be executed before creating a DATALAKE table when using the Ceph object store:

CALL SYSHADOOP.SET_DATALAKE_CONFIG('CORE', 'fs.s3a.bucket.<bucket_name>.connection.ssl.enabled', 'false') 

Creating a DATALAKE table with storage alias

Once a storage alias has been created, a user who has been granted access to the storage alias can use it to create a DATALAKE table. When creating a DATALAKE table, a LOCATION clause must be specified. The syntax of the LOCATION clause must be as follows:

DB2REMOTE://alias-name//object/path

alias-name

Must be the name of an alias to which the user that is creating the table has access. The alias contains the information required to access the container where the table data is to reside.

object

Must be the object specified when creating the storage alias. It represents a path on the container. If no object was specified when creating the storage alias, this part can remain empty.

path

An optional additional path. Appended to the object path, it represents the location of the table data in the container.
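Putting the parts together, the following is a hypothetical example (the alias name, paths, and column definitions are illustrative): assuming an alias MYALIAS whose object was set to sales/2023, a Parquet DATALAKE table could be created as follows:

```sql
-- Hypothetical: alias MYALIAS, object path sales/2023, additional path orders
CREATE DATALAKE TABLE ORDERS (
  ORDER_ID INT,
  AMOUNT   DOUBLE
)
STORED AS PARQUET
LOCATION 'DB2REMOTE://MYALIAS//sales/2023/orders'
```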

For more information, see the CREATE DATALAKE TABLE statement.

For information on Security considerations for DATALAKE tables, see https://www.ibm.com/docs/en/db2woc?topic=tables-remote-storage-connectivity#db2woc_dl_storage_conn__title__5

Accessing watsonx.data on Cloud

For information on accessing watsonx.data, see Accessing watsonx.data

Configuring Db2 for watsonx.data integration

To configure integration between Db2 and watsonx.data, see https://www.ibm.com/docs/en/db2woc?topic=watsonxdata-configuring-db2-integration

Importing tables from watsonx.data

After registering a watsonx.data metastore in Db2, it is possible to import Iceberg tables from watsonx.data into Db2. The table is then accessible from both systems.

To import a table created in a watsonx.data metastore into Db2, you can run the following statement:

CALL EXTERNAL_CATALOG_SYNC('metastore-name', 'schema-name', 'table-name',
'exist-action', 'error-action', NULL)

For example, to import all tables in schema icebergdemo, use:

CALL EXTERNAL_CATALOG_SYNC('watsonxdata', 'icebergdemo', '.*', 'REPLACE', 'STOP', '')

For more information, see Importing tables from watsonx.data

Note: SELECT or INSERT might fail after calling EXTERNAL_CATALOG_SYNC. You might receive the following error:

SQL5105N  The statement failed because a Big SQL component encountered an
error.  Component receiving the error: "BigSQL IO".  Component returning the
error: "UNKNOWN".  Log entry identifier: "[BSL-1-1f4690d0b]".  Reason: "Table
does not exist: icebergl".  SQLSTATE=58040

To resolve this error, restart bigsql. The following example restarts bigsql from the db2u head pod.

$ wvcli system disable
Wolverine HA management state was disabled successfully.
$ bigsql stop
Stopping Standalone Metastore  : OK
Stopping Big SQL               : OK
Stopping Big SQL Scheduler     : OK
$ bigsql start
Global config update           : OK
Starting Big SQL Scheduler     : OK
Starting Big SQL               : OK
Starting Standalone Metastore  : OK
$ wvcli system enable
Wolverine HA management state was enabled successfully.

Check the bigsql status and verify that the metastore is in the listening state before connecting to the database and running queries.

$ bigsql status
SERVICE              HOSTNAME                               NODE      PID STATUS
Big SQL Worker       c-entrepot-db2u-1.c-entrepot-db2u-internal    2   106073 DB2 Running
Big SQL Worker       c-entrepot-db2u-1.c-entrepot-db2u-internal    3   106072 DB2 Running
Big SQL Master       c-entrepot-db2u-0.c-entrepot-db2u-internal    0  2441832 DB2 Running
Big SQL Worker       c-entrepot-db2u-0.c-entrepot-db2u-internal    1  2441834 DB2 Running
Standalone Metastore c-entrepot-db2u-0.c-entrepot-db2u-internal    -  2442926 Available (listening)
Big SQL Scheduler    c-entrepot-db2u-0.c-entrepot-db2u-internal    -  2432240 Available

Creating Iceberg tables in watsonx.data

For more information, see Creating Iceberg tables in watsonx.data.

Disconnecting Db2 from watsonx.data

It is possible to disconnect Db2 from watsonx.data by unregistering the watsonx.data metastore from Db2 and deleting the access storage alias or aliases used by Db2 to connect to the watsonx.data object storage container.

Unregistering the watsonx.data metastore

To unregister the watsonx.data metastore in Db2, run the following command:

CALL UNREGISTER_EXT_METASTORE('metastore-name', ?, ?)

with the following parameter:

metastore-name

The name under which the metastore was registered.
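For example, to unregister the metastore that was registered under the name watsonxdata in the import example earlier:

```sql
CALL UNREGISTER_EXT_METASTORE('watsonxdata', ?, ?)
```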

After it has been unregistered, the watsonx.data metastore is no longer accessible and the following actions fail:

  • Running the EXTERNAL_CATALOG_SYNC() stored procedure to import tables or refresh their schema.

  • Issuing a SELECT or INSERT statement from Db2 on an Iceberg table created in watsonx.data.

Important: Due to the nature of non-Iceberg DATALAKE tables, SELECT and INSERT statements against them continue to succeed after the metastore is unregistered. However, these statements fail once access to the underlying storage is revoked by deleting the access storage alias used to connect to the watsonx.data object storage container.

Deleting the access storage alias

To completely disconnect watsonx.data from Db2, you must also delete the access storage alias used to connect to the watsonx.data object storage container.

To delete the storage access alias, see Deleting the storage alias.

Native Cloud Object Storage support

IBM Db2 Warehouse 11.5.9 Beta introduces native object storage support to Db2, allowing users to store traditional Db2 column-organized tables in object storage in Db2's native format, while maintaining existing SQL support and performance through a tiered storage architecture.

Storage Architecture with Native Cloud Object Support


Support for native cloud object storage introduces a new multi-tier storage architecture for table spaces that allows the native storage of data pages for database objects (for example, column-organized tables) in cloud object storage.

For more information on table spaces, see Table Spaces.

The diagram below shows the multiple components that are part of the storage architecture of a typical IBM® Db2® Warehouse deployment model, in this example with 2 compute nodes and 4 database partitions on each: the first compute node runs database partitions 0, 1, 2, and 3, and the second compute node runs database partitions 4, 5, 6, and 7. Three levels of storage make up this architecture that supports cloud object storage:

  • Reliable, durable file storage associated with each of the Db2 database partitions. This is an existing element of the IBM Db2 Warehouse architecture that is maintained in the current generation to enable a simple path towards adopting the object storage medium for existing databases.

  • A new layer of local storage on fast SSD/NVMe drives that are associated only with the compute node they are attached to, used as a local cache for the objects in object storage. There is one local disk per Db2 database partition, because each Db2 database partition manages its own portion of the local cache independently from the others.

  • An object storage layer, in which objects are associated with each of the database partitions.

For more information, see Storage Architecture with Native Cloud Object Support

User experience for Native Cloud Object Storage

The Native Cloud Object Storage support provides a simple path for adoption through its direct integration within the existing storage hierarchy of Db2, allowing existing applications to leverage the low-cost object storage for Db2 tables without changes.

To create a table stored in object storage, users follow the existing process of creating a table in Db2. Tables are created within a storage hierarchy: a table is created in a table space, and the table space is associated with a storage group. The main difference introduced by this feature is in the definition of the storage group, which is now associated with a storage access alias instead of a set of storage paths. This designates it as a storage group that is persisted to object storage, or a 'remote' storage group.

Subsequently, any table space associated with that remote storage group will be stored within the storage defined by the storage access alias associated with the storage group. These table spaces are referred to as remote table spaces. Similarly, any database object such as tables, indexes, MQTs, etc., associated with that remote table space will also be stored in the same object storage target and will be able to leverage the performance advantages of the multi-tier cache.

User Experience in IBM® Db2® Warehouse 11.5.9 Beta

IBM® Db2® Warehouse 11.5.9 Beta allows the user to enable this functionality at deployment time. A few steps are required to complete this configuration:

  1. Set up the two prerequisites: the cloud object storage bucket and the local storage for the caching tier.

  2. Deploy the instance with Native Cloud Object Storage support enabled through the Db2uInstance Custom Resource (CR).

  3. Configure the instance and database with the remote storage access alias, remote storage group and two remote table spaces - one regular table space OBJSTORESPACE1 and one user temporary table space OBJSTORESPACEUTMP1.

After these steps, the remote table spaces backed by Cloud Object Storage can be used to create both column-organized tables and column-organized Declared Global Temporary Tables in object storage.

In the following sections we describe each of these steps.

Pre-Requisites for Enabling Cloud Object Storage Support

There are two prerequisites for configuring Native Cloud Object Storage support:

  1. A cloud object storage bucket

  2. Local storage for the caching tier

Setting up a Cloud Object Storage Bucket

IBM® Db2® Warehouse 11.5.9 Beta requires access to a Cloud Object Storage service that supports the AWS S3 data access model in order to configure the Native Cloud Object Storage support. Supported services include Red Hat Ceph Storage, AWS S3, IBM Cloud Object Storage, and MinIO.

Setting up a cloud object storage bucket is specific to each cloud object storage provider. Once the bucket is set up, you need the following object storage connection information to complete the configuration of Native Cloud Object Storage support:

  • The endpoint to the Cloud Object Storage service must be a host name or IP address.

  • A Cloud Object Storage bucket must be created; its name must be all lower case, and the bucket is expected to be empty.

  • IBM® Db2® Warehouse 11.5.9 Beta supports both explicit credentials for accessing the Cloud Object Storage service and, in the case of AWS S3 on Amazon Elastic Kubernetes Service (AWS EKS), an IAM role for implicit role-based authentication.

  • Note that IBM® Db2® Warehouse 11.5.9 Beta does not allow the configuration of a custom certificate. If your Cloud Object Storage is configured to require a custom certificate, it is not compatible with this release.

  • IBM® Db2® Warehouse 11.5.9 Beta does not require that your Cloud Object Storage be configured for high availability.
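The bucket naming requirement above (all lower case) can be checked before deployment. The following is a minimal sketch, not part of the product tooling; it accepts the common S3 naming subset of lower-case letters, digits, dots, and hyphens:

```shell
# Return success only if the bucket name is non-empty and consists solely
# of lower-case letters, digits, dots, and hyphens.
valid_bucket() {
  [[ "$1" =~ ^[a-z0-9.-]+$ ]]
}

valid_bucket "my-db2-bucket" && echo "ok"
valid_bucket "MyBucket" || echo "rejected: upper case not allowed"
```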

Setting up Local Storage for the Caching Tier

The Local Storage for the Caching Tier is usually configured with a set of fast, locally attached NVMe drives on each of the nodes to be used during the deployment of IBM® Db2® Warehouse 11.5.9 Beta. These locally attached drives are expected to be associated only with the compute node they are attached to, in order to get the highest performance from them for fast persistence and data caching.

Local Drive Performance Recommendations

The Local Storage for the Caching Tier is a critical component to the performance of the Native Cloud Object Storage support in IBM® Db2® Warehouse 11.5.9 Beta. For this reason, we recommend you follow these guidelines in order to achieve the best performance.

To test if your local drive meets the performance requirements, run the following command:

dd if=/dev/zero of=/path/on/local/disk/testfile bs=1G count=10 oflag=direct,dsync

Note: You must validate the local drive performance on all hosts.

While the previous command is running, run the following command in another terminal on the same system to monitor disk performance:

iostat -xyz 1

From the iostat output, you should observe high disk utilization, close to 100% (which is shown by the %util column in iostat). Your write throughput (shown by wkB/s) should be at least 600 MB/s, and your write latency (shown by w_await) should be under 20 ms.

The following is an example from a disk that meets these performance requirements:

Device:         rrqm/s   wrqm/s     r/s      w/s      rkB/s     wkB/s     avgrq-sz avgqu-sz   await   r_await  w_await  svctm  %util
sdc              0.00     0.00      0.00   12555.00    0.00    803520.00   128.00   141.84    11.30     0.00    11.30    0.08  100.00

In this example, disk utilization is 100%, write throughput is 803 MB/s, and write latency is 11.3 ms.

For comparison, the following is an example from a disk that does not meet the performance requirements:

Device:          rrqm/s   wrqm/s     r/s     w/s      rkB/s     wkB/s      avgrq-sz  avgqu-sz   await   r_await  w_await  svctm  %util
sdc               0.00    20.00      0.00   2786.00    0.00   680715.00     488.67    150.09    53.85     0.00    53.85    0.36  100.00

In this example we see 100% disk utilization, a write throughput of 680 MB/s and a write latency of 53.85 ms. For this disk, the write latency is considered too high.
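The two sample lines above can be checked mechanically. The following sketch tests the wkB/s and w_await thresholds; the field positions (7 for wkB/s, 12 for w_await) assume the classic `iostat -x` column layout shown in the examples:

```shell
# Check an iostat -x device line against the recommended thresholds:
# wkB/s >= 600 MB/s (600000 kB/s) and w_await < 20 ms.
check_disk() {
  awk '{ exit ($7 >= 600000 && $12 < 20) ? 0 : 1 }' <<< "$1"
}

# The two device lines from the examples above.
good='sdc 0.00 0.00 0.00 12555.00 0.00 803520.00 128.00 141.84 11.30 0.00 11.30 0.08 100.00'
slow='sdc 0.00 20.00 0.00 2786.00 0.00 680715.00 488.67 150.09 53.85 0.00 53.85 0.36 100.00'

check_disk "$good" && echo "good: PASS"
check_disk "$slow" || echo "slow: FAIL (write latency too high)"
```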

Configuring a Local Storage Class for the Caching Tier in OCP

Prerequisites

OCP worker nodes must have local block devices (disks, SSDs, or NVMe drives) attached before performing the following setup.

1. Add the privileged SCC of the OpenShift cluster to the openebs service account:

oc adm policy add-scc-to-user privileged system:serviceaccount:openebs:openebs-maya-operator

2. Deploy the OpenEBS lite operator using the following commands:

oc apply -f https://openebs.github.io/charts/openebs-operator-lite.yaml 
oc apply -f https://openebs.github.io/charts/openebs-lite-sc.yaml

3. Deploy the storage class using the following spec. The BlockDeviceSelectors entry needs to select the NVMe device group:

cat << EOF | kubectl apply -f -
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: local-device
  annotations:
    openebs.io/cas-type: local
    cas.openebs.io/config: |
      - name: StorageType
        value: device
      - name: FSType
        value: ext4
      - name: BlockDeviceSelectors
        data:
           ndm.io/blockdevice-type: "blockdevice"
provisioner: openebs.io/local
reclaimPolicy: Delete
volumeBindingMode: WaitForFirstConsumer
EOF

Enabling support for Native Cloud Object Storage

The Native Cloud Object Storage support is enabled first through the Db2uInstance Custom Resource (CR). The following example shows the changes that must be made in the Db2uInstance CR to enable this functionality for the instance:

1. First, the new advanced option enableCos must be set to true to enable the functionality at deployment time.
2. Second, the cloud object storage provider can optionally be configured through the cosProvider advanced option:
  • aws: For the Cloud Object Storage providers AWS S3 and IBM Cloud Object Storage, cosProvider is set to "aws". This is the default value and can be omitted.
  • self-hosted: For other Cloud Object Storage providers such as Ceph Object Storage using RADOS, ODF MCG, and MinIO (either configured to use SSL without a CA-signed certificate, or not configured to use SSL), the cosProvider field must be set to "self-hosted".

spec:
  version: s11.5.9.0
  nodes: 2
  addOns:
    opendataformats:
      enabled: true
  advOpts:
    enableCos: "true"
    cosProvider: "self-hosted"

3. Third, the local disk for the caching tier must be configured as part of the storage configuration under the new "cachingtier" storage section, providing both the size and the storage class that manages the local NVMe drives on the target nodes. In the following example, the caching tier is configured with a size of 100Gi and the sample storage class name local-device. The actual name and configuration of the storage class are specific to the deployment environment.

  storage:
    - name: cachingtier
      spec:
        accessModes:
          - ReadWriteOnce
        resources:
          requests:
            storage: 100Gi
        storageClassName: local-device
      type: template

Completing the Setup of Native Cloud Object Storage Support

IBM® Db2® Warehouse 11.5.9 Beta provides a single step to configure all of the instance and database objects required to complete the configuration of Native Cloud Object Storage support. A script is provided that sets up the storage access alias IBMDEFAULTALIAS, creates a remote storage group IBMDEFAULTREMSG1 associated with this storage access alias, and creates two remote table spaces associated with it: a regular table space OBJSTORESPACE1 and a user temporary table space OBJSTORESPACEUTMP1.

To complete the setup of Native Cloud Object Storage support, use the db2-cos-objects.sh script located in the /db2u/scripts/ directory within the db2u head pod. Run this script after opening a terminal in the db2u head pod and switching to user db2inst1.

Make sure to review the prerequisites for setting up a Cloud Object Storage bucket before proceeding with this step.

The following example shows the execution of the db2-cos-objects.sh script providing it with the fully qualified Cloud Object Storage service endpoint (including the prefix http/https), the bucket name (all lower case), and corresponding credentials (user name and password).

oc exec -it c-db2wh-mpp-db2u-0 -- /bin/bash
su - db2inst1
cd /db2u/scripts
./db2-cos-objects.sh run-all --server <> --username <> --password <> --bucket <>

The following are the parameters required for the configuration when using explicit authentication:

--server              string    Object Store server to use. Include http/https prefix.

--username            string    Username to use for authentication to Object Store server.

--password            string    Password to use for authentication to Object Store server. 

--bucket              string    Name of bucket in the Object Store server.

-h, --help                      Display the help page.

Note that implicit authentication is only supported when configuring access to AWS S3 within AWS EKS, when an IAM role is configured for role-based authentication. In that case, the server, username, and password parameters are not required.

db2-cos-objects.sh can optionally be used to set up the Db2 Cloud Object Storage objects with custom names, including the storage access alias, storage group, and table spaces. It can also override the folder within the cloud object storage bucket and reset the Cloud Object Storage service vendor, using the following parameters:

--alias               string    Alias to use for cataloging storage access (default: "IBMDEFAULTALIAS").

--folder              string    Folder to use for Object Storage (default: "db2u").

--vendor              string    Vendor to use for Object Storage (default: "S3").

--storage-group       string    Name to use for storage group in Db2 (default: "IBMDEFAULTREMSG1").

--tablespace          string    Name to use for tablespace in Db2 (default: "OBJSTORESPACE1").

--user-tmp-tablespace string    Name to use for user temporary tablespace in Db2 (default: "OBJSTORESPACEUTMP1").
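For illustration, an invocation that overrides the default object names could be assembled as follows. All of the connection values and custom names below are hypothetical; the command is echoed rather than executed here because the script only exists inside the db2u head pod:

```shell
# All values below are hypothetical examples; substitute your own.
CMD="./db2-cos-objects.sh run-all \
--server https://cos.example.com:9000 --username admin --password secret \
--bucket mybucket --folder mydb2u --alias MYCOSALIAS \
--storage-group MYREMSG1 --tablespace MYOBJSPACE1 \
--user-tmp-tablespace MYOBJSPACEUTMP1"

# Echoed only; run the assembled command inside the db2u head pod as db2inst1.
echo "${CMD}"
```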

Here is sample output from a successful execution of the script:

./db2-cos-objects.sh run-all --server https://9.46.67.81:9000 --username admin --password adminpass --bucket db2whbucket 

Sleeping for 15 seconds for the reg var DB2_COS_TABLESPACES_ALLOWED_OPERATIONS=ALL to take effect...
Cataloging storage access with the following command:
 db2 CATALOG STORAGE ACCESS ALIAS IBMDEFAULTALIAS VENDOR S3 SERVER https://9.46.67.81:9000 USER admin PASSWORD <omitted> CONTAINER db2whbucket OBJECT db2u DBUSER db2inst1
DB20000I  The CATALOG STORAGE ACCESS command completed successfully.
Creating storage group with the following command:
 db2 "CREATE STOGROUP IBMDEFAULTREMSG1 ON 'DB2REMOTE://IBMDEFAULTALIAS/'"
DB20000I  The SQL command completed successfully.
Creating tablespace with the following command:
 db2 CREATE TABLESPACE OBJSTORESPACE1 USING STOGROUP IBMDEFAULTREMSG1
DB20000I  The SQL command completed successfully.
Creating temporary tablespace with the following command:
 db2 CREATE USER TEMPORARY TABLESPACE OBJSTORESPACEUTMP1 USING STOGROUP IBMDEFAULTREMSG1
DB20000I  The SQL command completed successfully.

Deploying Native COS on AWS EKS

Prerequisites

Set up an AWS EKS cluster by following this tutorial through step 6 of the section Deploy a database instance on your Amazon EKS cluster:

https://www.ibm.com/support/pages/tutorial-adding-database-instance-aws-eks-cluster-using-db2-operator-0

Note: In the section Choosing an Amazon EC2 instance type, choose instance types from the following document to get nodes with NVMe for optimized Native COS performance:
https://aws.amazon.com/blogs/aws/new-amazon-ec2-r6id-instances/

After the prerequisite setup is completed, follow the steps listed below.

Configuring NVMe local device setup for AWS

1. Deploy the openebs lite operator using the following:

kubectl apply -f https://openebs.github.io/charts/openebs-operator-lite.yaml 
kubectl apply -f https://openebs.github.io/charts/openebs-lite-sc.yaml

2. Deploy the Storage Class using the following spec. The BlockDeviceSelectors entry must select the NVMe node group:

cat << EOF | kubectl apply -f -
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: local-device
  annotations:
    openebs.io/cas-type: local
    cas.openebs.io/config: |
      - name: StorageType
        value: device
      - name: FSType
        value: ext4
      - name: BlockDeviceSelectors
        data:
           eks.amazonaws.com/nodegroup: "${NODE_GROUP}"
provisioner: openebs.io/local
reclaimPolicy: Delete
volumeBindingMode: WaitForFirstConsumer
EOF

Note: Replace ${NODE_GROUP} with the name that was selected for the node group when the NVMe node groups were created as part of the prerequisite tutorial.
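As a sanity check, the substitution in the spec above can be rendered to a local file and inspected before applying it. The ng-nvme node group name and the local-device-sc.yaml file name below are placeholders:

```shell
# Placeholder node-group name; use the one from the prerequisite tutorial.
NODE_GROUP="ng-nvme"

# Render the StorageClass manifest to a file so the substituted label can be
# inspected before running 'kubectl apply -f local-device-sc.yaml'.
cat << EOF > local-device-sc.yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: local-device
  annotations:
    openebs.io/cas-type: local
    cas.openebs.io/config: |
      - name: StorageType
        value: device
      - name: FSType
        value: ext4
      - name: BlockDeviceSelectors
        data:
           eks.amazonaws.com/nodegroup: "${NODE_GROUP}"
provisioner: openebs.io/local
reclaimPolicy: Delete
volumeBindingMode: WaitForFirstConsumer
EOF

# Confirm the node-group label was substituted.
grep 'eks.amazonaws.com/nodegroup: "ng-nvme"' local-device-sc.yaml
```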

AWS IAM Implicit Authentication setup for AWS EKS

Both implicit and explicit authentication are supported by Native Cloud Object Storage. To configure implicit authentication using AWS IAM for AWS EKS, follow these steps:

1. Create an S3 bucket using the following CLI command, replacing BUCKET_NAME and AWS_REGION accordingly.

aws s3api create-bucket --bucket ${BUCKET_NAME} --region ${AWS_REGION} --create-bucket-configuration LocationConstraint=${AWS_REGION}

2. Create a policy file named bucket-policy.json locally with the following content. This allows all users in the account to access the bucket and perform the listed actions.

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "s3:ListBucket",
        "s3:GetObject",
        "s3:PutObject",
        "s3:GetObjectAcl",
        "s3:PutObjectAcl",
        "s3:DeleteObject",
        "s3:DeleteObjectVersion"
      ],
      "Resource": [
        "arn:aws:s3:::${BUCKET_NAME}",
        "arn:aws:s3:::${BUCKET_NAME}/*"
      ]
    }
  ]
}

Create the policy with the following CLI command:

aws iam create-policy --policy-name Db2uS3AccessPolicy --policy-document file://bucket-policy.json
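Note that the ${BUCKET_NAME} placeholders in the policy above must be substituted with the real bucket name before the file is used. One way to render the file (the db2whbucket name below is only an example) is a shell heredoc:

```shell
# Example bucket name; use the bucket created in step 1.
BUCKET_NAME="db2whbucket"

# Write bucket-policy.json with the bucket name substituted in.
cat << EOF > bucket-policy.json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "s3:ListBucket",
        "s3:GetObject",
        "s3:PutObject",
        "s3:GetObjectAcl",
        "s3:PutObjectAcl",
        "s3:DeleteObject",
        "s3:DeleteObjectVersion"
      ],
      "Resource": [
        "arn:aws:s3:::${BUCKET_NAME}",
        "arn:aws:s3:::${BUCKET_NAME}/*"
      ]
    }
  ]
}
EOF

# Confirm the bucket ARNs were substituted.
grep "arn:aws:s3:::${BUCKET_NAME}" bucket-policy.json
```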

3. Create a trust-policy.json file. In the following content, replace YOUR_ACCOUNT_ID, OIDC_PROVIDER_ID, NAMESPACE_OF_DB2U_DEPLOYMENT, and SA_OF_DB2U_DEPLOYMENT.

The SA_OF_DB2U_DEPLOYMENT value has the form:

  • account-<NAMESPACE_OF_DB2U_DEPLOYMENT>-<DEPLOYMENT_NAME_FOR_DB2U>
  • DEPLOYMENT_NAME_FOR_DB2U comes from the name selected in the next section for deploying db2uinstance.yaml. For example, db2wh-mpp.
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {
                "Federated": "arn:aws:iam::${YOUR_ACCOUNT_ID}:oidc-provider/oidc.eks.us-east-2.amazonaws.com/id/${OIDC_PROVIDER_ID}"
            },
            "Action": "sts:AssumeRoleWithWebIdentity",
            "Condition": {
                "StringEquals": {
                    "oidc.eks.us-east-2.amazonaws.com/id/${OIDC_PROVIDER_ID}:sub": "system:serviceaccount:${NAMESPACE_OF_DB2U_DEPLOYMENT}:${SA_OF_DB2U_DEPLOYMENT}",
                    "oidc.eks.us-east-2.amazonaws.com/id/${OIDC_PROVIDER_ID}:aud": "sts.amazonaws.com"
                }
            }
        }
    ]
}
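The service account name can be derived mechanically from the pattern described above. A small sketch with example values (the db2u namespace and db2wh-mpp instance name are placeholders):

```shell
# Example values only; substitute your own namespace and Db2uInstance name.
NAMESPACE_OF_DB2U_DEPLOYMENT="db2u"
DEPLOYMENT_NAME_FOR_DB2U="db2wh-mpp"

# The service account name follows the account-<namespace>-<deployment> pattern.
SA_OF_DB2U_DEPLOYMENT="account-${NAMESPACE_OF_DB2U_DEPLOYMENT}-${DEPLOYMENT_NAME_FOR_DB2U}"
echo "${SA_OF_DB2U_DEPLOYMENT}"   # prints account-db2u-db2wh-mpp
```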

Create the role with the following CLI command:

aws iam create-role --role-name Db2uS3AccessRole --assume-role-policy-document file://trust-policy.json

4. Attach the policy to the role:

aws iam attach-role-policy --policy-arn arn:aws:iam::${YOUR_ACCOUNT_ID}:policy/Db2uS3AccessPolicy --role-name Db2uS3AccessRole

Deploying db2uinstance.yaml using AWS IAM Implicit Authentication in AWS EKS

The following is an example of the db2uinstance.yaml to create a Db2UInstance CR with Native Cloud Object Storage Enabled and using a storage class for caching tier over locally-attached NVMe drives:

apiVersion: db2u.databases.ibm.com/v1
kind: Db2uInstance
metadata:
  name: db2wh-mpp
spec:
  version: s11.5.9.0
  nodes: 2
  addOns:
    opendataformats:
      enabled: true
  advOpts:
    enableCos: "true"
    cosProvider: "aws"
  podTemplate:
    db2u:
      resource:
        db2u:
          limits:
            cpu: 14
            memory: 56Gi
  environment:
    dbType: db2wh
    databases:
      - name: BLUDB
    partitionConfig:
      total: 4
      volumePerPartition: true
    authentication:
      ldap:
        enabled: true
  license:
    accept: true
  storage:
    - name: meta
      spec:
        accessModes:
          - ReadWriteMany
        resources:
          requests:
            storage: 100Gi
        storageClassName: efs-test-sc
      type: create
    - name: archivelogs
      type: create
      spec:
        accessModes:
          - ReadWriteMany
        resources:
          requests:
            storage: 25Gi
        storageClassName: efs-test-sc
    - name: data
      type: template
      spec:
        accessModes:
          - ReadWriteOnce
        resources:
          requests:
            storage: 300Gi
        storageClassName: ebs-sc
    - name: cachingtier
      spec:
        accessModes:
          - ReadWriteOnce
        resources:
          requests:
            storage: 700Gi
        storageClassName: local-device
      type: template
    - name: tempts
      spec:
        accessModes:
          - ReadWriteOnce
        resources:
          requests:
            storage: 50Gi
        storageClassName: ebs-sc
      type: template
    - name: etcd
      type: template
      spec:
        storageClassName: ebs-sc
        accessModes:
          - ReadWriteOnce
        resources:
          requests:
            storage: 10Gi
    - name: blulocal 
      spec:
        accessModes:
          - ReadWriteOnce
        resources:
          requests:
            storage: 70Gi
        storageClassName: ebs-sc
      type: template

Note: Immediately after applying the YAML above to create the Db2uInstance CR, and before the db2u pods come up, edit the Service Account of the db2u deployment to add the annotation needed for implicit authentication to work.

Use this command to edit the Service Account for the Db2U Deployment:

kubectl edit sa ${SA_OF_DB2U_DEPLOYMENT}

and add the following annotation for implicit authentication to work:

annotations:
    eks.amazonaws.com/role-arn: arn:aws:iam::${YOUR_ACCOUNT_ID}:role/Db2uS3AccessRole

This ensures that the admission controller of the AWS EKS will inject implicit authentication environment variables into the pods which will use this Service Account.
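As a non-interactive alternative to kubectl edit, the same annotation can be applied with kubectl annotate. The account ID and service account name below are placeholders, and the kubectl line is commented out here so the role ARN construction can be checked first:

```shell
# Placeholder values; substitute your own account ID and service account name.
YOUR_ACCOUNT_ID="123456789012"
SA_OF_DB2U_DEPLOYMENT="account-db2u-db2wh-mpp"

# Construct the role ARN for the Db2uS3AccessRole created earlier.
ROLE_ARN="arn:aws:iam::${YOUR_ACCOUNT_ID}:role/Db2uS3AccessRole"
echo "${ROLE_ARN}"

# Apply against a live cluster (uncomment to run):
# kubectl annotate sa "${SA_OF_DB2U_DEPLOYMENT}" \
#   "eks.amazonaws.com/role-arn=${ROLE_ARN}"
```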

Completing the setup when using AWS IAM Implicit Authentication on AWS EKS

In the same way as before, the following example shows the execution of the db2-cos-objects.sh script. For AWS EKS Db2U deployment when using AWS IAM Implicit Authentication, the only required parameter is the bucket name:

kubectl exec -it c-db2wh-mpp-db2u-0 -- /bin/bash
su - db2inst1
cd /db2u/scripts
./db2-cos-objects.sh run-all --bucket ${BUCKET_NAME}

Troubleshooting the setup of Native COS table spaces

  • Entering invalid input, such as an incorrect bucket name, will cause the db2-cos-objects.sh script to fail with the following error:
Sleeping for 15 seconds for the reg var DB2_COS_TABLESPACES_ALLOWED_OPERATIONS=ALL to take effect...
Cataloging storage access with the following command:
 db2 CATALOG STORAGE ACCESS ALIAS whocwp8bucket VENDOR S3 SERVER https://9.30.272.59:9000 USER minioadddmin PASSWORD Nine8ddXML456$ CONTAINER whoceep8bucket OBJECT db2u DBUSER db2inst1
DB20000I  The CATALOG STORAGE ACCESS command completed successfully.

List command failed with return code, rc = 0x870F01B6
BUCKET whoceep8bucket is not empty, will not create the tablespace costbsp1 and the whocwp8bucket will be dropped...
Uncataloging storage access with the following command:
 db2 uncatalog storage access alias whocwp8bucket
DB20000I  The UNCATALOG STORAGE ACCESS command completed successfully.

In this case, the script returns an error before creating the storage group and terminates after dropping the storage access alias. Rerunning the script with the correct parameters should succeed.

  • If the object store endpoint is not accessible for any reason, the db2-cos-objects.sh script will fail. If the script fails after creating the storage group with the following error, rerun the script with the sub command run-all replaced by remove-all:
db2 CREATE TABLESPACE objstoretbsp USING STOGROUP stogroup
DB21034E  The command was processed as an SQL statement because it was not a
valid Command Line Processor command.  During SQL processing it returned:
SQL0293N  Error accessing a table space container.  SQLSTATE=57048
ERROR: Failed to create tablespace

The following example shows how to run the script with the remove-all sub command. Running this command ensures that the environment is cleaned up properly: the objects identified by the provided alias and storage-group parameter values are removed.

./db2-cos-objects.sh remove-all --bucket <> --username <> --password <> --server <> --alias <> --storage-group <> --tablespace <> --folder <>

The following is sample output if the script executes successfully with the remove-all sub command:

Sleeping for 15 seconds for the reg var DB2_COS_TABLESPACES_ALLOWED_OPERATIONS=ALL to take effect...
Dropping temporary tablespace with the following command:
 db2 "drop tablespace OBJSTORESPACEUTMP1"
DB20000I  The SQL command completed successfully.
Dropping tablespace with the following command:
 db2 "drop tablespace costbsp1"
DB20000I  The SQL command completed successfully.
Dropping storage group with the following command:
 db2 "drop stogroup stogroup1"
DB20000I  The SQL command completed successfully.
Uncataloging storage access with the following command:
 db2 "uncatalog storage access alias entrepotbucket2"
DB20000I  The UNCATALOG STORAGE ACCESS command completed successfully.

After the environment is cleaned up and the object store endpoint is accessible again, rerun the script with the run-all sub command:

./db2-cos-objects.sh run-all --bucket <> --username <> --password <> --server <> --alias <> --storage-group <> --tablespace <> --folder <>

Limitations

  • Fresh installs only. No updates or upgrades are supported.

  • Users must use the db2u-head-engn-svc service, which always points to the catalog node, and must not use db2u-engn-svc, which can redirect to any engine pod.

  • No backup/restore support.

  • The default Db2 Warehouse configuration should not be altered.

  • Native COS supports one object store table space and one user temporary table space.

  • The object store bucket being used for Native COS must be empty and cannot be shared with datalake tables.

  • Supported operations for datalake tables include SELECT, CTAS and INSERT.

  • If a user creates a Db2 table as the db2inst1 user but does not specify a table space, the table is created in the default table space. The USERSPACE1 table space spans all nodes. It is recommended to explicitly use USERSPACE1, or another table space that spans all nodes, when creating Db2 tables.

For more information on restrictions and limitations for datalake tables, see https://www.ibm.com/docs/en/db2woc?topic=tables-restrictions-limitations

For more information on restrictions and limitations for Native COS table spaces, see https://www.ibm.com/docs/en/db2woc?topic=support-restrictions-limitations


Document Information

Modified date:
31 August 2023

UID

ibm17028563