Getting the cluster installed and running in a GDPC environment

Follow these procedures to install a geographically dispersed Db2® pureScale® cluster (GDPC) and get it up and running.

Before you begin

Ensure that the three sites are set up with the proper hardware configurations. See Configuring a GDPC environment for details about the hardware configurations used and referenced in this topic.

For a cluster that uses a TCP/IP network for communication between members and CFs, if multiple adapter ports are assigned to each member or CF, ensure that those network interfaces are bonded so that only the bonded interface appears in the NETNAME column of the db2instance -list output. All NETNAME values listed in the output must be in the same IP subnet. A single IP subnet is mandatory for setting up IBM Spectrum Scale in the following procedure.
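The single-subnet requirement can be checked programmatically before you proceed. The following sketch uses placeholder addresses (not taken from any real db2instance -list output) and assumes a /24 netmask; it confirms that every interface address that would appear in the NETNAME column shares the same subnet prefix:

```shell
# Placeholder NETNAME addresses for all members and CFs (assumption:
# a /24 netmask, so the first three octets identify the subnet).
netnames="10.5.1.1 10.5.1.2 10.5.1.4 10.5.1.5"

same_subnet=yes
prefix=""
for ip in $netnames; do
  p=${ip%.*}                    # drop the host octet, e.g. 10.5.1.1 -> 10.5.1
  if [ -z "$prefix" ]; then
    prefix=$p                   # remember the first prefix seen
  elif [ "$p" != "$prefix" ]; then
    same_subnet=no              # a mismatch means the setup requirement fails
  fi
done
echo "subnet check: $same_subnet (prefix $prefix)"
```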

Procedure

  1. Install the Db2 pureScale Feature on two sites.
    Install the Db2 pureScale Feature on two sites (for example, site A and site B) by using the db2setup command. Using the Advanced Configuration menu, designate two hosts as the CFs and (optionally) one of the two as the preferred primary CF. In this example, the hosts are hostA1, hostA2, hostB1, and hostB2.

    On site A, designate hostA1, hostA2, hostB1, and hostB2 as members, where hostB1 is the shared disk member and hostB2 is the tiebreaker member. During installation, the tiebreaker disk must be set up using one of the LUNs. This setting is temporary and can be changed later; in the following example, one option is to use hdiskA2.

    The file system that the db2setup command creates for the shared instance metadata is initially a non-replicated IBM Spectrum Scale file system. This is converted later to a replicated file system across the sites.

  2. Update the majority quorum and SCSI-3 PR settings.
    1. The tiebreaker setting might need to be updated to use Majority Node Set. Query the current tiebreaker device using the following command:
      root@hostA1:/opt/IBM/db2/V11.5/bin> /home/db2inst1/sqllib/bin/db2cluster -cm -list -tiebreaker
    2. If the output from the last step does not specify ‘Majority Node Set’ as the quorum device, it must be updated as follows:
      root@hostA1:/opt/IBM/db2/V11.5/bin> /home/db2inst1/sqllib/bin/db2cluster -cm -set -tiebreaker -majority
      Configuring quorum device for domain 'db2domain_20110224005525' ...
      Configuring quorum device for domain 'db2domain_20110224005525' was successful.
    3. After updating the tiebreaker device, verify the setting and compare it to the expected output:
      root@hostA1:/opt/IBM/db2/V11.5/bin> /home/db2inst1/sqllib/bin/db2cluster -cm -list -tiebreaker
      The current quorum device is of type Majority Node Set.
    4. Check to see if SCSI-3 PR is enabled. In the sample output, pr=yes indicates SCSI-3 PR is enabled:
      root@hostA1:/opt/IBM/db2/V11.5/bin> /usr/lpp/mmfs/bin/mmlsnsd -X
      
      Disk name   NSD volume ID      Device         Devtype   Node name   Remarks
      ---------------------------------------------------------------------------
      gpfs1nsd    091A33584D65F2F6   /dev/hdiskA1   hdisk     hostA1      pr=yes
    5. If your disks do not support SCSI-3 PR or you choose to disable it, run these commands:
      root@hostA1:/opt/IBM/db2/V11.5/bin> su - db2inst1
      db2inst1@hostA1:/home/db2inst1> db2stop force
      02/24/2011 01:24:16 0 0 SQL1064N DB2STOP processing was successful.
      02/24/2011 01:24:19 1 0 SQL1064N DB2STOP processing was successful.
      02/24/2011 01:24:21 3 0 SQL1064N DB2STOP processing was successful.
      02/24/2011 01:24:22 2 0 SQL1064N DB2STOP processing was successful.
      SQL1064N DB2STOP processing was successful.
      db2inst1@hostA1:/home/db2inst1> exit
      root@hostA1:/opt/IBM/db2/V11.5/bin> /home/db2inst1/sqllib/bin/db2cluster -cfs -stop -all
      All specified hosts have been stopped successfully.
    6. Verify that IBM Spectrum Scale is stopped on all hosts:
      root@hostA1:/opt/IBM/db2/V11.5/bin> /usr/lpp/mmfs/bin/mmgetstate -a
      Node number Node name GPFS state
      ------------------------------------------
      1 		hostA1 	down
      2 		hostA2 	down
      3 		hostA3 	down
      4 		hostB1 	down
      5 		hostB2 	down
      6 		hostB3 	down
      Disable SCSI-3 PR by issuing this command:
      root@hostA1:/opt/IBM/db2/V11.5/bin> /usr/lpp/mmfs/bin/mmchconfig usePersistentReserve=no
      Verifying GPFS is stopped on all nodes ...
      mmchconfig: Processing the disks on node hostA1.torolab.ibm.com
      mmchconfig: Processing the disks on node hostA2.torolab.ibm.com
      mmchconfig: Processing the disks on node hostA3.torolab.ibm.com
      mmchconfig: Processing the disks on node hostB1.torolab.ibm.com
      mmchconfig: Processing the disks on node hostB2.torolab.ibm.com
      mmchconfig: Processing the disks on node hostB3.torolab.ibm.com
      mmchconfig: Command successfully completed
      mmchconfig: Propagating the cluster configuration data to all affected nodes. This 
      is an asynchronous process.
    7. Verify that SCSI-3 PR has been disabled (pr=yes is not displayed):
      root@hostA1:/opt/IBM/db2/V11.5/bin> /usr/lpp/mmfs/bin/mmlsnsd -X
      Disk name   NSD volume ID      Device         Devtype   Node name   Remarks
      ---------------------------------------------------------------------------
      gpfs1nsd    091A33584D65F2F6   /dev/hdiskA1   hdisk     hostA1
    8. Verify that usePersistentReserve has been set to no:
      root@hostA1:/opt/IBM/db2/V11.5/bin> /usr/lpp/mmfs/bin/mmlsconfig
      Configuration data for cluster db2cluster_20110224005554.torolab.ibm.com:
      -----------------------------------------------------------
      clusterName db2cluster_20110224005554.torolab.ibm.com
      clusterId 655893150084494058
      autoload no
      minReleaseLevel 5.0.2.0
      dmapiFileHandleSize 32
      maxFilesToCache 15000
      pagepool 2G
      verifyGpfsReady yes
      assertOnStructureError yes
      workerThreads 512
      sharedMemLimit 2047M
      usePersistentReserve no
      failureDetectionTime 48
      leaseRecoveryWait 35
      tiebreakerDisks gpfs1nsd
      [hostA1]
      psspVsd no
      adminMode allToAll
      File systems in cluster db2cluster_20110224005554.torolab.ibm.com:
      ------------------------------------------------------------------
      /dev/db2fs1
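    The verification in the last two substeps can be scripted. The following sketch parses captured mmlsconfig output (a trimmed copy of the listing above) and reports whether SCSI-3 PR has been disabled:

```shell
# Trimmed sample of `mmlsconfig` output, copied from the listing above.
mmlsconfig_out='usePersistentReserve no
failureDetectionTime 48
leaseRecoveryWait 35'

# Extract the value of usePersistentReserve from the captured output.
pr_setting=$(printf '%s\n' "$mmlsconfig_out" | awk '$1 == "usePersistentReserve" { print $2 }')

if [ "$pr_setting" = "no" ]; then
  echo "SCSI-3 PR disabled"
else
  echo "SCSI-3 PR still enabled; rerun mmchconfig usePersistentReserve=no" >&2
fi
```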
  3. Increase HostFailureDetectionTime to accommodate communication between sites.
    HostFailureDetectionTime is set to a higher value than on a non-GDPC Db2 pureScale cluster. The higher value allows for the increased communication lag between sites, which is not present in a single-site Db2 pureScale cluster. If unexpected host down events are still triggered because of large inter-site distances, higher values can be used; however, this increases the time required to detect hardware failures or machine reboots, and therefore the overall failure recovery time.
    root@hostA1:/opt/IBM/db2/V11.5/bin> /home/db2inst1/sqllib/bin/db2cluster -cm -list -hostfailuredetectiontime
    The host failure detection time is 4 seconds.
    Change the value to 16 seconds and verify:
    root@hostA1:/opt/IBM/db2/V11.5/bin> /home/db2inst1/sqllib/bin/db2cluster -cm -set -option hostfailuredetectiontime -value 16 -force
    The host failure detection time has been set to 16 seconds.
    
    root@hostA1:/opt/IBM/db2/V11.5/bin> /home/db2inst1/sqllib/bin/db2cluster -cm -list -hostfailuredetectiontime
    The host failure detection time is 16 seconds.
  4. Add tiebreaker host into cluster to provide cluster quorum.
    The tiebreaker host provides cluster quorum, ensuring that during normal operation, the cluster contains an odd number of hosts. In case of a network outage between sites, only the site which can communicate with the tiebreaker host gains cluster quorum. In the following example, the tiebreaker host is Host T on site C.
    1. Install Db2 software on the tiebreaker host:
      Note: Install Db2 in the same location on all nodes for the operation to be completed successfully.
      root@T:/<path containing db2_install>> ./db2_install -p SERVER -f PURESCALE -b /opt/ibm/db2/V11.5

      DBI1324W  Support of the db2_install command is deprecated.
      Total number of tasks to be performed: 53
      Total estimated time for all tasks to be performed: 1671 second(s)
      
      Task #1 start
      ...
      
      Task #53 end
      
      The execution completed successfully.
      
      For more information see the Db2 installation log at
      "/tmp/db2_install.log.28622"
    2. Set up SSH for the db2sshid user on the tiebreaker host T. This user should be the same db2sshid user that was set during the installation on site A and site B. To check which user was used, run the following command on hostA1:
      root@hostA1>/var/db2/db2ssh/db2locssh display_config
      
      version = 1
      time_delta = 20 second(s)
      debug_level = 2
      db2sshid = db2inst1
      gdkit_path = /opt/IBM/db2/V11.5/lib64/gskit/
      fips_mode = on
      Note: If the db2sshid user that was used on hostA1 is not available on host T, you must create the db2sshid user on host T. For more information, see Creating required users for a Db2® pureScale Feature installation.
    3. Set up db2ssh on the tiebreaker host T. The following commands must be run as root:
      • Create the configuration file:
        /var/db2/db2ssh/db2locssh reset_config
      • Set the GSKit path:
        /var/db2/db2ssh/db2locssh set_gskit_path /opt/IBM/db2/V11.5/lib64/gskit/
      • Set db2sshid (db2sshid is determined from the previous step):
        /var/db2/db2ssh/db2locssh set_db2sshid db2inst1
      • Verify the setting:
        root@T>/var/db2/db2ssh/db2locssh display_config
        
        version = 1
        time_delta = 20 second(s)
        debug_level = 2
        db2sshid = db2inst1
        gdkit_path = /opt/IBM/db2/V11.5/lib64/gskit/
        fips_mode = on
      • Generate a private/public key pair:
        /var/db2/db2ssh/db2locssh generate_keys
    4. Perform key exchanges with every host in the cluster. Once the key exchange is completed, the /var/db2/db2ssh directory looks as shown:
      hostA1:
      root@hostA1.priv
      root@hostA1.pub
      root@hostA2.pub
      root@hostA3.pub
      root@hostB1.pub
      root@hostB2.pub
      root@hostB3.pub
      root@T.pub
      
      hostB1:
      root@hostB1.priv
      root@hostB1.pub
      root@hostB2.pub
      root@hostB3.pub
      root@hostA1.pub
      root@hostA2.pub
      root@hostA3.pub
      root@T.pub
      
      T:
      root@T.priv
      root@T.pub
      root@hostA1.pub
      root@hostA2.pub
      root@hostA3.pub
      root@hostB1.pub
      root@hostB2.pub
      root@hostB3.pub
    5. Ensure that OpenSSH is available on your system.
    6. Create a $HOME/.ssh directory for the SSH user (db2sshid):
      mkdir $HOME/.ssh
      chmod 700 $HOME/.ssh
    7. Generate a public key/private key pair:
      cd $HOME/.ssh
      ssh-keygen -t dsa
    8. As the SSH user (db2sshid), create a file called authorized_keys under $HOME/.ssh, where $HOME is the home directory of db2sshid. Append the contents of each host's public key (id_dsa.pub) to the authorized_keys file.
    9. Copy the authorized_keys file to the $HOME/.ssh directory on each host, where $HOME is the home directory of db2sshid.
    10. Run the chmod 644 authorized_keys command to change the permissions of the authorized_keys file on all hosts.
    11. Log in to each host as the SSH user (db2sshid) and SSH to all the hosts to confirm that you can communicate across all the hosts without a password prompt.
      • On hostA as an SSH user (db2sshid):

        ssh <hostA>

        ssh <hostB>

      • On hostB as an SSH user (db2sshid):

        ssh <hostA>

        ssh <hostB>
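      Steps 6 through 10 can be sketched as a script. The following local simulation uses temporary files in place of the per-host public keys; on a real cluster you would gather each host's id_dsa.pub over scp before merging:

```shell
# Build an authorized_keys file from collected public keys, with the
# permissions required in steps 6 and 10. Runs entirely in a temporary
# directory; the two .pub files below are stand-ins for real host keys.
workdir=$(mktemp -d)
mkdir -m 700 "$workdir/.ssh"          # step 6: 700 on the .ssh directory

# Stand-ins for the id_dsa.pub files collected from each host.
printf 'ssh-dss AAAA...key1 db2inst1@hostA1\n' > "$workdir/hostA1.pub"
printf 'ssh-dss AAAA...key2 db2inst1@hostB1\n' > "$workdir/hostB1.pub"

# Step 8: append every public key into one authorized_keys file.
cat "$workdir"/*.pub >> "$workdir/.ssh/authorized_keys"
# Step 10: set the required 644 permissions.
chmod 644 "$workdir/.ssh/authorized_keys"

key_count=$(wc -l < "$workdir/.ssh/authorized_keys")
echo "authorized_keys holds $key_count keys"
```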

    12. Set up the host key file. The following commands must be executed from the tiebreaker host to every other host, as well as from every other host to the tiebreaker host. When asked to save the host key file fingerprint, answer Yes:
      root@T>/var/db2/db2ssh/db2locssh root@hostA1 hostname
      hostA1
      root@T>/var/db2/db2ssh/db2locssh root@hostB1 hostname
      hostB1
      root@T>/var/db2/db2ssh/db2locssh root@hostT hostname
      hostT
      
      root@hostA1>/var/db2/db2ssh/db2locssh root@T hostname
      T
      root@hostB1>/var/db2/db2ssh/db2locssh root@T hostname
      T
    13. Change the IBM Spectrum Scale quorum type for the cluster to majority node set and verify:
      root@hostA1:/opt/IBM/db2/V11.5/bin> /home/db2inst1/sqllib/bin/db2cluster -cfs -set -tiebreaker -majority
      root@hostA1:/opt/IBM/db2/V11.5/bin> /home/db2inst1/sqllib/bin/db2cluster -cfs -list -tiebreaker
      The current quorum device is of type Majority Node Set.
    14. Add the tiebreaker host to RSCT and IBM Spectrum Scale cluster:
      db2cluster -add -host T -no_san_access
    15. On the tiebreaker host add the IBM Spectrum Scale license:
      root@T:/opt/IBM/db2/V11.5/bin> ./db2cluster -cfs -add -license
      
      The license for the shared file system cluster has been successfully added.
    16. Verify the license warning message is gone:
      root@hostA1:/opt/IBM/db2/V11.5/bin> /usr/lpp/mmfs/bin/mmlsnode
      
      GPFS nodeset Node list
      ------------- --------------------------------------------------
      db2cluster_20110224005554 hostA1 hostA2 hostA3 hostB1 hostB2 hostB3 T
    17. Verify that the following parameters are set to the expected values by using the mmlsconfig command:
      • verifyGpfsReady is set to yes
      • maxFilesToCache is set to 15000
      • unmountOnDiskFail is set to yes for the tiebreaker host.
      An example mmlsconfig command output:
      Note: [T] is the tiebreaker host. Configuration options listed beneath [T] apply only to the tiebreaker host.
      root@hostA1:/> /usr/lpp/mmfs/bin/mmlsconfig
      Configuration data for cluster db2cluster_20180925122717.torolab.ibm.com:
      -------------------------------------------------------------------------
      clusterName db2cluster_20180925122717.torolab.ibm.com
      clusterId 18045566345657606634
      autoload no
      dmapiFileHandleSize 32
      ccrEnabled yes
      sharedMemLimit 2047M
      failureDetectionTime 48
      leaseRecoveryWait 35
      totalPingTimeout 45
      subnets 10.7.2.0
      usePersistentReserve yes
      verifyGpfsReady yes
      cipherList AUTHONLY
      workerThreads 512
      maxFilesToCache 15000
      maxblocksize 1M
      minReleaseLevel 5.0.2.0
      pagepool 2G
      [T]
      unmountOnDiskFail yes
      [common]
      adminMode allToAll
      
      File systems in cluster db2cluster_20180925122717.torolab.ibm.com:
      ------------------------------------------------------------------
      /dev/datafs
      /dev/db2fs1
      /dev/logfs
      If any of the values need to be changed, use the mmchconfig command as root to update them. For example, to change all three configuration parameters, run the following commands.
      Note: T is the tiebreaker host’s hostname.
      root@hostA1> /usr/lpp/mmfs/bin/mmchconfig unmountOnDiskFail=no 
      mmchconfig: Command successfully completed
      root@hostA1> /usr/lpp/mmfs/bin/mmchconfig unmountOnDiskFail=yes -N T
      mmchconfig: Command successfully completed
      root@hostA1> /usr/lpp/mmfs/bin/mmchconfig maxFilesToCache=15000
      mmchconfig: Command successfully completed
      root@hostA1> /usr/lpp/mmfs/bin/mmchconfig verifyGpfsReady=yes
      mmchconfig: Command successfully completed
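      The parameter check in this step can be automated. The following sketch extracts the three settings from captured mmlsconfig output (a trimmed copy of the listing above, where the [T] stanza applies only to the tiebreaker host):

```shell
# Trimmed sample of `mmlsconfig` output, copied from the listing above.
mmlsconfig_out='verifyGpfsReady yes
maxFilesToCache 15000
[T]
unmountOnDiskFail yes'

# Look up one parameter's value in the captured output.
get() { printf '%s\n' "$mmlsconfig_out" | awk -v k="$1" '$1 == k { print $2 }'; }

ok=yes
[ "$(get verifyGpfsReady)" = "yes" ]   || ok=no
[ "$(get maxFilesToCache)" = "15000" ] || ok=no
[ "$(get unmountOnDiskFail)" = "yes" ] || ok=no
echo "parameter check: $ok"
```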
    18. To enable IBM Spectrum Scale to respond to failures faster, update the minMissedPingTimeout and maxMissedPingTimeout parameters:
      root@hostA1:/opt/IBM/db2/V11.5/bin> /home/db2inst1/sqllib/bin/db2cluster -cfs -set -option minMissedPingTimeout -value 70
      The shared file system cluster parameter 'minMissedPingTimeout' has been successfully updated to '70'.
      
      root@hostA1:/opt/IBM/db2/V11.5/bin> /home/db2inst1/sqllib/bin/db2cluster -cfs -set -option maxMissedPingTimeout -value 80
      The shared file system cluster parameter 'maxMissedPingTimeout' has been successfully updated to '80'.
    19. To enable IBM Spectrum Scale to read data from the replica with a faster response time, update the readReplicaPolicy parameter:
      root@hostA1:/opt/IBM/db2/V11.5/bin> /home/db2inst1/sqllib/bin/db2cluster -cfs -set -option readReplicaPolicy -value fastest
      The shared file system cluster parameter 'readReplicaPolicy' has been successfully updated to 'fastest'.
    20. On an AIX® InfiniBand network, AIX RoCE network with IP support, Linux® RoCE network, or TCP/IP network on AIX and Linux, update the IBM Spectrum Scale cluster to use the private network to communicate between sites A and B. This enables the clustering software to detect network issues between the sites, and trigger failover accordingly. First, check the subnet for the network:
      root@hostA1:/opt/IBM/db2/V11.5/bin> ping hostA1-ib0
      PING hostA1-ib0.torolab.ibm.com (10.5.1.1): 56 data bytes
      64 bytes from 10.5.1.1: icmp_seq=0 ttl=255 time=0 ms
      In this example, subnet 10.5.1.0 includes all the IP addresses from 10.5.1.0 through 10.5.1.255:
      root@hostA1:/opt/IBM/db2/V11.5/bin> /usr/lpp/mmfs/bin/mmchconfig subnets=10.5.1.0
      mmchconfig: Command successfully completed
      mmchconfig: Propagating the cluster configuration data to all
      affected nodes. This is an asynchronous process.
      
      root@hostA1:/opt/IBM/db2/V11.5/bin> /usr/lpp/mmfs/bin/mmlsconfig
      Configuration data for cluster db2cluster_20110224005554.torolab.ibm.com:
      --------------------------------------------------------
      clusterName db2cluster_20110224005554.torolab.ibm.com
      clusterId 655893150084494058
      autoload no
      minReleaseLevel 5.0.2.0
      dmapiFileHandleSize 32
      maxFilesToCache 15000
      pagepool 2G
      verifyGpfsReady yes
      assertOnStructureError yes
      workerThreads 512
      sharedMemLimit 2047M
      usePersistentReserve no
      failureDetectionTime 48
      leaseRecoveryWait 35
      [T]
      unmountOnDiskFail yes
      [common]
      subnets 10.5.1.0
      [hostA1]
      psspVsd no
      adminMode allToAll
      
      File systems in cluster db2cluster_20110224005554.torolab.ibm.com:
      ------------------------------------------------------------------
      /dev/db2fs1
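      The subnet value passed to mmchconfig can be derived from any member's private-network address. A minimal sketch, assuming the /24 netmask used in this example:

```shell
# Derive the /24 subnet value for `mmchconfig subnets=` from an
# interface address (assumption: 255.255.255.0 netmask, as in the
# 10.5.1.0 example above).
ip=10.5.1.1
subnet="${ip%.*}.0"        # replace the host octet with .0
echo "mmchconfig subnets=$subnet"
```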
    21. Update the RSCT communication groups to disable Loose Source Routing (LSR). When LSR is disabled, RSCT will use daemon routing, which is a more reliable communication method in the event of isolated network failures. First list all the communication groups used by RSCT, and then update each separately:
      root@hostA1:/> lscomg
      Name Sensitivity Period Priority Broadcast SourceRouting NIMPathName NIMParameters Grace MediaType UseForNodeMembership
      CG1  4           1.6    1        Yes       Yes                                     60     1 (IP)    1
      CG2  4           1.6    1        Yes       Yes                                     60     1 (IP)    1
      root@hostA1:/> chcomg -x r CG1
      root@hostA1:/> chcomg -x r CG2
      root@hostA1:/> lscomg
      Name Sensitivity Period Priority Broadcast SourceRouting NIMPathName NIMParameters Grace MediaType UseForNodeMembership
      CG1  4           1.6    1        Yes       No                                      60     1 (IP)    1
      CG2  4           1.6    1        Yes       No                                      60     1 (IP)    1

      Note that if the db2cluster -cm -delete -domain and db2cluster -cm -create -domain commands are run at any time to re-create the TSA domain, LSR must be disabled again.
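      The lscomg/chcomg sequence above can be scripted for clusters with many communication groups. This dry-run sketch parses a captured lscomg listing (a trimmed copy of the output above) and prints the chcomg command for each group; on a live cluster you would run the printed commands as root:

```shell
# Trimmed sample of `lscomg` output, copied from the listing above.
lscomg_out='Name Sensitivity Period Priority Broadcast SourceRouting
CG1  4           1.6    1        Yes       Yes
CG2  4           1.6    1        Yes       Yes'

cmds=""
# Skip the header row and take the group name from the first column.
for cg in $(printf '%s\n' "$lscomg_out" | awk 'NR > 1 { print $1 }'); do
  cmds="$cmds chcomg -x r $cg;"   # collect for inspection
  echo "chcomg -x r $cg"          # run these as root on a live cluster
done
```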

      For better resilience during Ethernet failures, update /etc/hosts on all hosts in the cluster to contain a mapping from each host name to its IP address (note that, as described earlier in this step, host T's /etc/hosts file differs from the one below because its -ib0 hostnames map to the standard Ethernet hostname). This setting prevents some Db2 Cluster Services monitor commands from hanging if one of the DNS servers at a site fails:
      root:/> cat /etc/hosts
      10.5.1.1	hostA1-ib0.torolab.ibm.com	hostA1-ib0
      10.5.1.2 	hostA2-ib0.torolab.ibm.com	hostA2-ib0
      10.5.1.3	hostA3-ib0.torolab.ibm.com	hostA3-ib0
      10.5.1.4	hostB1-ib0.torolab.ibm.com	hostB1-ib0
      10.5.1.5	hostB2-ib0.torolab.ibm.com	hostB2-ib0
      10.5.1.6	hostB3-ib0.torolab.ibm.com	hostB3-ib0
      9.26.82.1	hostA1.torolab.ibm.com	hostA1 
      9.26.82.2	hostA2.torolab.ibm.com	hostA2 
      9.26.82.3	hostA3.torolab.ibm.com	hostA3
      9.26.82.4	hostB1.torolab.ibm.com	hostB1
      9.26.82.5	hostB2.torolab.ibm.com	hostB2 
      9.26.82.6	hostB3.torolab.ibm.com	hostB3
      9.23.1.12   T
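      A quick consistency check on the edited /etc/hosts files can catch copy-paste mistakes. This sketch scans a trimmed hosts-file fragment (based on the listing above) for hostnames that are mapped more than once:

```shell
# Trimmed /etc/hosts fragment based on the example above.
hosts='10.5.1.1 hostA1-ib0.torolab.ibm.com hostA1-ib0
9.26.82.1 hostA1.torolab.ibm.com hostA1
9.23.1.12 T'

# Count how many lines mention each hostname; any count > 1 is a
# duplicate mapping that could confuse name resolution.
dupes=$(printf '%s\n' "$hosts" | awk '{ for (i = 2; i <= NF; i++) seen[$i]++ }
  END { for (h in seen) if (seen[h] > 1) print h }')

if [ -z "$dupes" ]; then
  echo "hosts file consistent"
else
  echo "duplicate entries: $dupes" >&2
fi
```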

What to do next

After the cluster has been installed and is running, set up IBM Spectrum Scale replication.