
Tutorial: Adding a database instance to an AWS EKS cluster, using the Db2 Operator


This tutorial walks you through a database deployment, using the Db2 Operator, on the Amazon Web Services (AWS) cloud. There are many configuration options available to you, based on your size and performance requirements. In this tutorial, you'll be provided with options to create single-partition instances of either Db2 or Db2 Warehouse, or a multi-partition Db2 Warehouse instance.

Objective

Completing the tutorial gives you a working database instance on the AWS cloud. Steps include:
  • Setting up an AWS account.
  • Creating an Amazon EKS cluster.
  • Adding nodes to your cluster.
  • Setting up block storage for your cluster.
  • Setting up shared file storage for your cluster.
  • Deploying a database instance to your EKS cluster.

Overview of a Db2/Db2 Warehouse on AWS configuration

Deploying a Db2 or Db2 Warehouse database on AWS can be summarized as follows:

  1. Your system administrator creates an AWS account and chooses a managed service platform for your database instance. You can run your database on the following AWS managed services:
    • Db2 on the Red Hat OpenShift Service on AWS (ROSA).
    • Amazon Elastic Kubernetes Service (EKS).
  2. Using a series of command line tools, your administrator creates an AWS cluster based on your specific requirements. An EC2 instance is selected and a file system and storage class are created.
  3. Your administrator then runs the Db2 Operator and deploys your database instance to your AWS cluster.

Once your database instance is deployed, users can connect to a database in the instance in much the same way as they connect to an on-premises data source.
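For example, once the instance's port is exposed outside the cluster (how you expose it, and the host name and port below, are deployment-specific placeholders), a client can catalog and connect to the database with the standard Db2 command line processor:

  db2 catalog tcpip node awsnode remote db2-host.example.com server 50000
  db2 catalog database bludb at node awsnode
  db2 connect to bludb user db2inst1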

The flowchart below shows how users interact with the database instance when it is hosted on AWS:

[Figure: Db2 and Db2 Warehouse on AWS]

Choosing an Amazon EC2 instance type

Use the information in the following tables to determine the size requirements of your Db2 or Db2 Warehouse instance. Based on your selection, go to the list of Amazon EC2 instance types to find the AWS instance type that's right for your database deployment. In addition to the sizing values shown in the tables, the EC2 instance type list shows other factors to consider, such as cost and region.

For OLTP workloads, choose an instance from the General Purpose or Storage Optimized categories.

Table 1 provides sizing guidelines for small, medium, and large Db2 OLTP systems.
 

Table 1. Sizing guidelines for Db2 OLTP on AWS systems

Small (8 CPUs, 24 GB memory)
  • Suggested AWS instance types: m5.xlarge, m5a.xlarge, m5n.xlarge, m5zn.xlarge, m4.xlarge
  • Description: Entry-level, departmental OLTP workloads; five concurrent connections; 500 GB of data and logs.
Medium (16 CPUs, 128 GB memory)
  • Suggested AWS instance types: r5.8xlarge, r5a.8xlarge, r5b.8xlarge, r5n.8xlarge, r4.8xlarge
  • Description: Mid-range, line-of-business OLTP workloads; 1.4 TB of data and logs.
Large (49 CPUs, 1001 GB memory)
  • Suggested AWS instance types: r6g.16xlarge, r6i.16xlarge, r5.16xlarge, r5a.16xlarge, r5b.16xlarge, r5n.16xlarge
  • Description: High-end enterprise OLTP workloads; 11 TB of data and logs.

For Db2 Warehouse workloads on single-partition or multi-partition environments, choose an instance from the Memory Optimized category.

Table 2 provides sizing guidelines for small, medium, and large single-partition Db2 Warehouse on AWS systems.

Table 2. Sizing guidelines for a single-partition Db2 Warehouse on AWS system

Small (7 CPUs, 98 GB memory)
  • Suggested AWS instance types: r6g.4xlarge, r6i.4xlarge, r5.4xlarge, r5a.4xlarge, r5b.4xlarge, r5n.4xlarge
  • Description: 2 TB uncompressed data; 500 GB storage.
Medium (15 CPUs, 226 GB memory)
  • Suggested AWS instance types: r6g.8xlarge, r6i.8xlarge, r5.8xlarge, r5a.8xlarge, r5b.8xlarge, r5n.8xlarge, r4.8xlarge
  • Description: 4 TB uncompressed data; 1 TB storage.
Large (31 CPUs, 482 GB memory)
  • Suggested AWS instance types: r6g.16xlarge, r6i.16xlarge, r5.16xlarge, r5a.16xlarge, r5b.16xlarge, r5n.16xlarge
  • Description: 8 TB uncompressed data; 2 TB storage.
For more information on single-partition Db2 Warehouse environments, see Single database partition with multiple processors.
 

Table 3 provides sizing guidelines for small, medium, and large multi-partition Db2 Warehouse on AWS systems.

Table 3. Sizing guidelines for a multi-partition Db2 Warehouse on AWS system

Small (39 CPUs, 610 GB memory)
  • Suggested AWS instance types: r6g.4xlarge, r6i.4xlarge, r5.4xlarge, r5a.4xlarge, r5b.4xlarge, r5n.4xlarge
  • Description: 20 TB of uncompressed data. This sizing estimate is for the entire Db2 deployment (all Db2 Warehouse database partitions). Use 4 to 32 vCPUs per Db2 Warehouse database partition, and a memory (in GB) to vCPU ratio of 8:1 to 32:1, with 16:1 or higher being the optimal range.
  • Note: The optimal number of Db2 Warehouse database partitions per OpenShift worker depends on the memory-to-core ratio as well as the ability to scale out or provide failover.
Medium (77 CPUs, 1201 GB memory)
  • Suggested AWS instance types: r6g.8xlarge, r6i.8xlarge, r5.8xlarge, r5a.8xlarge, r5b.8xlarge, r5n.8xlarge, r4.8xlarge
  • Description: 40 TB of uncompressed data.
Large (153 CPUs, 2406 GB memory)
  • Suggested AWS instance types: r6g.16xlarge, r6i.16xlarge, r5.16xlarge, r5a.16xlarge, r5b.16xlarge, r5n.16xlarge
  • Description: 80 TB of uncompressed data.
For more information on multi-partition Db2 environments, see Database partitions with one processor.

Choosing cloud storage

When choosing cloud storage options for your Db2 on AWS configuration, consider the following points:
  • For database storage, log storage, and use of temporary table spaces, use a block storage solution.
  • For metadata storage and backup storage, use a shared file storage solution.

Amazon provides block storage (Amazon Elastic Block Store, or EBS) and shared file storage (Amazon Elastic File System, or EFS) options for your Db2 deployment. The following diagram shows how storage is distributed in a single-partition Db2 formation:

[Figure: Db2 OLTP cluster formation detail]

The following diagram shows how storage is distributed in a Db2 Warehouse formation:

[Figure: Db2 Warehouse formation]
 

Environment

Before you start the configuration, you need to set some variables locally that are used for this tutorial. These variables are:
  • ACCOUNT_ID: Your 12-digit AWS account ID (for example, ACCOUNT_ID="001234567890").
  • CLUSTER: The name you use for your EKS Cluster. Use eks-db2-demo.
  • REGION: The region where your AWS instances are being deployed. Use us-east-2.
  • AZ: The availability zone. Because EBS (data) volumes cannot move across availability zones, you must ensure that your database instance is running in one AZ only. Use us-east-2a.
  • VERSION: The Kubernetes version. Use 1.21.
  • INSTANCE_TYPE: The kind of instance to select for deploying workloads:
    • For single-partition Db2 instances, use m5.xlarge.
    • For single-partition Db2 Warehouse instances, use r5.8xlarge.
    • For multi-partition Db2 Warehouse instances, use r5.16xlarge.
  • MIN_NODES: The minimum number of instances for the EKS cluster:
    • For single-partition Db2 instances, use 1.
    • For single-partition Db2 Warehouse instances, use 1.
    • For multi-partition Db2 Warehouse instances, use 6.
  • MAX_NODES: The maximum number of instances for the EKS cluster:
    • For single-partition Db2 instances, use 1.
    • For single-partition Db2 Warehouse instances, use 1.
    • For multi-partition Db2 Warehouse instances, use 6.
  • NAMESPACE: The namespace where your database instance will be deployed. Use db2u.
  • NODE_GROUP: The Node Group for creating instances. Use db2-dev-nodes.
  • EFS_SG: The Security Group name for EFS. Use EFSDb2SecurityGroup.
  • EBS_IAM_ROLE: The IAM Role name for EBS. Use AmazonEKS_EBS_CSI_DriverRole.
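For example, for a single-partition Db2 deployment, you might set these variables in your shell as follows (the account ID is the placeholder value from above):

  export ACCOUNT_ID="001234567890"      # your 12-digit AWS account ID
  export CLUSTER="eks-db2-demo"
  export REGION="us-east-2"
  export AZ="us-east-2a"
  export VERSION="1.21"
  export INSTANCE_TYPE="m5.xlarge"      # single-partition Db2; see the list above for other deployments
  export MIN_NODES="1"
  export MAX_NODES="1"
  export NAMESPACE="db2u"
  export NODE_GROUP="db2-dev-nodes"
  export EFS_SG="EFSDb2SecurityGroup"
  export EBS_IAM_ROLE="AmazonEKS_EBS_CSI_DriverRole"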
 
You also need to install three command-line tools locally. These are required to complete the tutorial:
  • AWS CLI: An open-source tool for communicating with AWS services directly from your OS command line, or from a remote terminal program. This tool requires some post-install configuration.
  • kubectl: The Amazon flavor of the Kubernetes command-line utility that is used to communicate with the cluster API server.
  • eksctl: A command-line utility for creating and managing Kubernetes clusters on Amazon EKS.
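After installing them, you can confirm that each tool is on your PATH and configure the AWS CLI with your credentials:

  aws --version
  aws configure             # prompts for your access key ID, secret access key, default region, and output format
  kubectl version --client
  eksctl version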

Steps

Create an AWS account

Before you begin to create your AWS cluster, you need to have an AWS account.
 
  1. From a web browser, go to https://portal.aws.amazon.com/billing/signup.
  2. Follow the online instructions.
    NOTE: You will be contacted by phone and required to enter a verification code on your phone keypad.

Create an Amazon EKS cluster

You use the eksctl utility to create an Amazon EKS cluster for your Db2® deployment. You can also use the utility to define the node type properties for your cluster.

Procedure
  1. Run the following command to create an EKS cluster without adding any nodes:
    eksctl create cluster --version=${VERSION} --name=${CLUSTER} --without-nodegroup --region=${REGION} --asg-access --with-oidc
    where
    • version is the Kubernetes version for the cluster. For example, 1.21.
    • name is the name you give to your cluster.
    • without-nodegroup tells the eksctl utility to not create a node group.
    • region identifies the location of your EKS server. For example, us-east-2.
    • asg-access tells the eksctl utility to attach IAM policies that allow access to Auto Scaling groups (for use with the cluster autoscaler).
    • with-oidc tells the eksctl utility to add an external identity provider service for sign-in that supports the OpenID Connect (OIDC) standard.
  2. Get the vpc_id of your cluster:
    vpc_id=$(aws eks describe-cluster --name ${CLUSTER} --query "cluster.resourcesVpcConfig.vpcId" --output text)
  3. Get the subnet ids for one availability zone in the same vpc as the EKS cluster:
    subnet_ids=$(aws ec2 describe-subnets --filters "Name=vpc-id,Values=$vpc_id" "Name=availability-zone,Values=$AZ" --query 'Subnets[*].SubnetId' --output text | sed  -e 's/[[:space:]]/,/')
  4. Add a nodegroup for your worker nodes:
    eksctl create nodegroup --cluster=${CLUSTER} --region=${REGION} --managed --spot --name=${NODE_GROUP} --instance-types=${INSTANCE_TYPE} --nodes-min=${MIN_NODES} --nodes-max=${MAX_NODES} --subnet-ids ${subnet_ids} --asg-access
  5. Validate the added nodes:
    kubectl get nodes
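    You can also confirm the node group itself with eksctl:
    eksctl get nodegroup --cluster ${CLUSTER} --region ${REGION}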
 

Configure your Amazon EKS cluster for block storage using the EBS CSI driver

Block storage is the best option for your Db2 data. To make use of block storage, you create an Amazon Elastic Block Store (EBS) storage class for your EKS cluster. Creating block storage involves the following steps:

  • Creating a role and attaching a policy for setting up EBS.
  • Installing the EKS managed add-on for the EBS CSI.
  • Creating a storage class for your EBS volume.

Before you begin

The EBS CSI Driver for your EKS cluster can be installed either directly or as an EKS managed add-on. The steps in the following procedures explain how to install the EKS managed add-on.
Before you begin configuring your EKS cluster for block storage, ensure that you have the following information:

  • type: The EBS volume type.
  • iopsPerGB: The number of I/O operations per second, per GB.
  • An IAM OIDC provider. You can check whether a URL exists for your IAM OIDC provider by running the following command:
    url=$(aws eks describe-cluster --name ${CLUSTER} --query "cluster.identity.oidc.issuer" --output text | sed 's|.*/||')
    The following command creates an OIDC provider for your cluster if it doesn't exist:
    aws iam list-open-id-connect-providers | grep ${url} || eksctl utils associate-iam-oidc-provider --cluster ${CLUSTER} --approve

Procedure

  1. Create a role and attach a policy for configuring the CSI driver.

    You set up a storage class for EBS by using the appropriate Container Storage Interface (CSI) driver. To use the CSI driver, you first need to create a role for your EKS cluster, and then download and attach an AWS policy to the role. This policy grants the permissions that the CSI driver needs.
    Run the following command:

    eksctl create iamserviceaccount --name ebs-csi-controller-sa --namespace kube-system --cluster ${CLUSTER} --attach-policy-arn arn:aws:iam::aws:policy/service-role/AmazonEBSCSIDriverPolicy --approve --role-only --role-name ${EBS_IAM_ROLE}
    where
    • cluster is the name of your EKS cluster.
    • namespace is the name of the Kubernetes namespace that you want to associate with the role.
    • name is the name you want to assign to the role.
  2. Install the EKS managed add-on for the EBS CSI.
    eksctl create addon --name aws-ebs-csi-driver --cluster ${CLUSTER} --service-account-role-arn arn:aws:iam::${ACCOUNT_ID}:role/${EBS_IAM_ROLE} --force
  3. Create a storage class for your EBS file system.
    Run the following command to create an io2 storage class:
    cat << EOF | kubectl create -f -
    apiVersion: storage.k8s.io/v1
    kind: StorageClass
    metadata:
      name: ebs-sc
    parameters:
      type: io2
      iopsPerGB: "500" # Required for io1 and io2 storage classes. Calculate it from your volume size and maximum allowable IOPS. Details: https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ebs-volume-types.html
    provisioner: ebs.csi.aws.com
    reclaimPolicy: Delete
    volumeBindingMode: WaitForFirstConsumer
    EOF
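    To confirm that the storage class was created:
    kubectl get storageclass ebs-sc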
 

Configure your Amazon EKS cluster for shared file storage

Shared file storage is the best option for your database metadata and backups. To make use of shared file storage, you create an Amazon Elastic File System (EFS) storage class for your EKS cluster. Creating shared file storage involves the following steps:

  • Creating a role and attaching a policy for setting up EFS.
  • Installing the AWS EFS CSI Driver, using a manifest.
  • Creating an AWS EFS File System for your EKS cluster.
  • Creating a storage class for your EFS file system.

Procedure

  1. Create a role and attach a policy for configuring the EFS CSI driver.
    1. Download the IAM policy:
      curl -o iam-policy-example.json https://raw.githubusercontent.com/kubernetes-sigs/aws-efs-csi-driver/v1.3.7/docs/iam-policy-example.json
    2. Install the IAM Policy:
      aws iam create-policy --policy-name AmazonEKS_EFS_CSI_Driver_Policy --policy-document file://iam-policy-example.json
    3. Create the IAM role and attach the IAM policy to the role:
      eksctl create iamserviceaccount --cluster ${CLUSTER} --namespace kube-system --name efs-csi-controller-sa --attach-policy-arn arn:aws:iam::${ACCOUNT_ID}:policy/AmazonEKS_EFS_CSI_Driver_Policy --approve --region ${REGION}
      where
      • cluster is the name of your EKS cluster.
      • namespace is the Kubernetes namespace that you want to associate with the role.
      • name is the name you want to assign to the role.
      • region is where your AWS instance is being deployed.
  2. Install the AWS EFS CSI Driver, using a manifest.
    To run applications on an Amazon EKS cluster that is configured for shared file storage (EFS), you need to create and mount an EFS file system on your cluster. To mount the file system, you need to download and install the Amazon EFS Container Storage Interface (CSI) driver.
    1. Download the manifest:
      kubectl kustomize "github.com/kubernetes-sigs/aws-efs-csi-driver/deploy/kubernetes/overlays/stable/?ref=release-1.3" > public-ecr-driver.yaml 
    2. Edit the public-ecr-driver.yaml file to remove the efs-csi-controller-sa ServiceAccount. This service account was created previously.
    3. Install the CSI driver:
      kubectl apply -f public-ecr-driver.yaml
      This command also creates pods in the kube-system namespace.
    4. Check that the pods are created in the namespace:
      kubectl get pods -n kube-system
  3. Create an AWS EFS File System for your EKS cluster.
    1. Retrieve the virtual private cloud (VPC) ID of your EKS cluster:
      vpc_id=$(aws eks describe-cluster --name ${CLUSTER} --query "cluster.resourcesVpcConfig.vpcId" --output text)
    2. Retrieve the subnetting (CIDR) range for the cluster's VPC:
      cidr_range=$(aws ec2 describe-vpcs --vpc-ids ${vpc_id} --query "Vpcs[].CidrBlock" --output text)
    3. Create a security group using the vpc_id:
      security_group_id=$(aws ec2 create-security-group --group-name ${EFS_SG} --description "My EFS security group" --vpc-id $vpc_id --output text)
    4. Create an inbound rule that allows inbound nfs traffic from the CIDR for your cluster's vpc:
      aws ec2 authorize-security-group-ingress --group-id $security_group_id --protocol tcp --port 2049 --cidr $cidr_range
    5. Create an AWS EFS file system for your EKS cluster in the same region as your EKS cluster:
      file_system_id=$(aws efs create-file-system --region ${REGION} --performance-mode generalPurpose --query 'FileSystemId' --output text)
    6. Create the mount targets:
      1. Determine the subnet IDs in your VPC of your EKS cluster:
        TAG="tag:alpha.eksctl.io/cluster-name"
        eks_subnet_ids=$(aws ec2 describe-subnets --filters "Name=vpc-id,Values=${vpc_id}" "Name=${TAG},Values=${CLUSTER}" --query 'Subnets[*].SubnetId' --output text)
      2. Run the following code to create your mount targets for each of the subnets in your EKS Cluster:
        for subnet in ${eks_subnet_ids}; do
            aws efs create-mount-target --file-system-id ${file_system_id} --security-groups ${security_group_id} --subnet-id ${subnet}
        done
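    Optionally, confirm that each mount target reaches the available lifecycle state:
    aws efs describe-mount-targets --file-system-id ${file_system_id} --output table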
  4. Create a storage class for your EFS file system.
    Ensure that you have the following information before running the command to create your EFS storage class:
    • provisioningMode: The EFS access point mode. Use efs-ap.
    • fileSystemId: The EFS file system ID. For example, fs-08a5b4467d198bf3e.
    • uid: The access point user ID. Use zero (0).
    • gid: The access point group ID. Use zero (0).
    • directoryPerms: The directory permissions for the access point root directory. Use 777.
    Run the following command to create the storage class for your EFS file system. Include the ID of the AWS EFS File System that you created for your EKS cluster:
    cat << EOF | kubectl create -f -
    apiVersion: storage.k8s.io/v1
    kind: StorageClass
    metadata:
      name: efs-test-sc
    parameters:
      directoryPerms: "777"
      fileSystemId: ${file_system_id}
      gid: "0"
      provisioningMode: efs-ap
      uid: "0"
    provisioner: efs.csi.aws.com
    reclaimPolicy: Delete
    volumeBindingMode: Immediate
    EOF
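    As a quick smoke test (the claim name efs-claim-test is illustrative, not part of the tutorial), you can create a PersistentVolumeClaim against the new storage class and check that it binds:
    cat << EOF | kubectl create -f -
    apiVersion: v1
    kind: PersistentVolumeClaim
    metadata:
      name: efs-claim-test
      namespace: default
    spec:
      accessModes:
      - ReadWriteMany
      storageClassName: efs-test-sc
      resources:
        requests:
          storage: 1Gi
    EOF
    kubectl get pvc -n default efs-claim-test
    kubectl delete pvc -n default efs-claim-test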
 

Installing OLM to Deploy Db2u Operator

OLM is a component of the Operator Framework, an open source toolkit to manage Kubernetes native applications, called Operators, in an effective, automated, and scalable way. OLM extends Kubernetes to provide a declarative way to install, manage, and upgrade Operators and their dependencies in a cluster. You must install the OLM to run the Db2 Operator.

  1. Create the OLM namespace:
    kubectl create namespace olm
  2. Install the OLM:
    curl -sL https://github.com/operator-framework/operator-lifecycle-manager/releases/download/v0.20.0/install.sh | bash -s v0.20.0
  3. Check for pods in the olm namespace:
    kubectl get pods -n olm
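    You can also wait for the OLM components to finish rolling out; the deployment names below assume the default OLM installation manifests:
    kubectl rollout status -n olm deployment/olm-operator
    kubectl rollout status -n olm deployment/catalog-operator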

Deploy a database instance on your Amazon EKS cluster

When you have finished creating your Amazon EKS cluster, and configured your cloud storage option, you can deploy one of the following instances to the cluster, using the Db2 Operator:

  • A single-partition instance of Db2.
  • A single-partition instance of Db2 Warehouse.
  • A multi-partition instance of Db2 Warehouse.

When you log in to your EKS cluster you will need to complete the following tasks:

  • Create the namespace for the Db2 operator.
  • Create a CatalogSource object in the olm namespace to install the Db2 operator.
  • Deploy the Db2 operator in the namespace.
For information on how to modify your deployment, see Deploying Db2 using the Db2uCluster custom resource.
 

Procedure

  1. Log in to your Amazon EKS cluster.
  2. Install the ibm-db2uoperator-catalog in the namespace where olm is installed:
    cat << EOF | kubectl create -f -
    apiVersion: operators.coreos.com/v1alpha1
    kind: CatalogSource
    metadata:
      name: ibm-db2uoperator-catalog
      namespace: olm
    spec:
      sourceType: grpc
      image: icr.io/cpopen/ibm-db2uoperator-catalog@sha256:8a5ab72a72f7bd42b7874b687f20ac01cb82b0f05ca24e56d5a21b1413c8ef09
      displayName: IBM Db2 Catalog
      publisher: IBM
      updateStrategy:
        registryPoll:
          interval: 45m
    EOF
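    You can confirm that the catalog source was registered and that its pod is running:
    kubectl get catalogsource -n olm ibm-db2uoperator-catalog
    kubectl get pods -n olm | grep ibm-db2uoperator-catalog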
  3. Create a namespace for installing the Db2 operator:
    kubectl create namespace ${NAMESPACE}
  4. Install an operator group in the namespace:
    cat << EOF | kubectl create -f -
    apiVersion: operators.coreos.com/v1
    kind: OperatorGroup
    metadata:
      name: db2u-operator-group
      namespace: ${NAMESPACE}
    spec:
      targetNamespaces:
      - ${NAMESPACE}
    EOF
  5. Create a subscription in the namespace to deploy the Db2 operator:
    cat << EOF | kubectl create -f -
    apiVersion: operators.coreos.com/v1alpha1
    kind: Subscription
    metadata:
      name: ibm-db2uoperator-catalog-subscription
      namespace: ${NAMESPACE}
      generation: 1
    spec:
      channel: v2.0
      name: db2u-operator
      installPlanApproval: Automatic
      source: ibm-db2uoperator-catalog
      sourceNamespace: olm
      startingCSV: db2u-operator.v2.0.0
    EOF
  6. Check that the db2u-operator pod is deployed:
    kubectl get pods -n ${NAMESPACE} | grep db2u-operator
  7. After the db2u-operator pod is up and running, run the YAML code to deploy your database instance.
    Parameters common to this code include (a minimal example manifest follows the list):
    • metadata.name (for example, db2oltp-test): The name of the Db2uCluster CR.
    • metadata.namespace (for example, db2u): The namespace where the database instance will be deployed.
    • .spec.size (for example, 1): The number of Db2 nodes. For single-partition Db2 and Db2 Warehouse instances, the value is 1; for multi-partition Db2 Warehouse instances, the value can be 2 or greater.
    • .spec.environment.database.name (for example, BLUDB): The database name for the instance.
    • .spec.environment.dbType (for example, db2oltp): Accepted values: db2wh, db2oltp.
    • .spec.environment.ldap.enabled (for example, false): To enable LDAP, set this to true.
    • .spec.license.accept (true): A required value that must be set to true.
    • .spec.podConfig.resource.db2u.limits.cpu (for example, "2"): The CPU limits for the db2u engine pods. Limits and requests are set to the same value (which is the expected value).
    • .spec.podConfig.resource.db2u.limits.memory (for example, 8Gi): The memory limits for the db2u engine pods. Limits and requests are set to the same value (which is the expected value).
    • .spec.version (for example, 11.5.7.0-cn5): The Db2u version that the operator supports. The example shown is the latest release.
    • .spec.storage ([]): An array of storage configurations. This is the required storage configuration for meta and data (or shared).
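    The following is a minimal sketch of such a Db2uCluster manifest for a single-partition Db2 instance, assembled from the parameters above. The apiVersion, the storage entry layout (names meta and data, type create), and the storage sizes are illustrative assumptions; the storage class names ebs-sc and efs-test-sc are the ones created earlier in this tutorial. Check the Db2uCluster custom resource documentation for the exact schema of your operator version:
    cat << EOF | kubectl create -f -
    apiVersion: db2u.databases.ibm.com/v1
    kind: Db2uCluster
    metadata:
      name: db2oltp-test
      namespace: ${NAMESPACE}
    spec:
      size: 1
      license:
        accept: true
      environment:
        dbType: db2oltp
        database:
          name: BLUDB
        ldap:
          enabled: false
      podConfig:
        resource:
          db2u:
            limits:
              cpu: "2"
              memory: 8Gi
      version: 11.5.7.0-cn5
      storage:
      - name: meta          # shared metadata; EFS storage class created earlier
        type: create
        spec:
          storageClassName: efs-test-sc
          accessModes:
          - ReadWriteMany
          resources:
            requests:
              storage: 10Gi     # illustrative size
      - name: data          # database data and logs; EBS storage class created earlier
        type: create
        spec:
          storageClassName: ebs-sc
          accessModes:
          - ReadWriteOnce
          resources:
            requests:
              storage: 100Gi    # illustrative size
    EOF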
  8. Check the status of your Db2uCluster:
    kubectl get db2ucluster -n ${NAMESPACE} ${DB2U_CLUSTER_NAME}
    where DB2U_CLUSTER_NAME is the name value set in the metadata section of the YAML code. For example,
    name: db2oltp-test
    NOTE: You can define an alternate value for name by using the db2ucluster custom resource.
  9. When the STATE value returned is Ready, the instance is deployed successfully. For example:
    NAME               STATE   MAINTENANCESTATE   AGE
    db2-001234567890   Ready   None               6h8m
  10. Log on to the database engine pod as db2inst1:
    kubectl -n ${NAMESPACE} exec -it $(kubectl get pods -n ${NAMESPACE} | grep ${DB2U_CLUSTER_NAME}-db2u-0 | awk '{print $1}') -- su - db2inst1
  11. Connect to the database bludb on your AWS cluster:
    db2 connect to bludb
    NOTE: While we are using the database name bludb for this tutorial, you can change this name by using the db2ucluster custom resource.
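    To verify the connection, you can run a simple query from the same session:
    db2 "values current timestamp"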

[{"Type":"MASTER","Line of Business":{"code":"LOB10","label":"Data and AI"},"Business Unit":{"code":"BU059","label":"IBM Software w\/o TPS"},"Product":{"code":"SSCJDQ","label":"IBM Db2 Warehouse"},"ARM Category":[{"code":"a8m500000008PknAAE","label":"Install\/Migrate\/Upgrade"}],"Platform":[{"code":"PF025","label":"Platform Independent"}],"Version":"All Versions"}]

Document Information

Modified date:
05 August 2022

UID

ibm16600071