ZooKeeper-automated failover

As of Db2® Big SQL 5.0.4, high availability (HA) is no longer managed by Tivoli® System Automation (TSA). HA is now managed through Python scripts that interact with Apache ZooKeeper, a hierarchical key-value store that several other Hadoop services, such as HDFS and YARN, also use for HA.

If you are running Db2 Big SQL with two head nodes, HA ensures that one head node is always the primary, and the other head node (if available) is the standby that takes over when needed.

The scripts that interact with ZooKeeper comprise the Db2 Big SQL HA failover controller (FC). The FC periodically checks the status of the primary head node and checks (through ZooKeeper) the status of the standby head node. If the primary head node is not running, the FC triggers an automatic failover and updates ZooKeeper to reflect that the standby is now the primary head node. If you trigger a failover manually, ZooKeeper is also updated to identify the new primary and standby head nodes.

If the primary head node encounters any problems, the FC attempts to resolve them. If the primary head node is shut down, or if the problems cannot be resolved automatically, the standby head node takes over as the primary head node.
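
The check, repair, and failover sequence described above can be sketched in a few lines of Python. This is a simplified illustration, not the actual FC scripts: the names Cluster, check_once, is_running, and try_repair are hypothetical, and an in-memory object stands in for the role information that the real FC keeps in ZooKeeper.

```python
# Illustrative sketch only: this is NOT the Db2 Big SQL failover controller.
# Cluster, check_once, is_running, and try_repair are hypothetical names, and
# the Cluster object stands in for the role records kept in ZooKeeper.
from dataclasses import dataclass
from typing import Callable


@dataclass
class Cluster:
    primary: str                    # host currently recorded as primary
    standby: str                    # host currently recorded as standby
    standby_available: bool = True  # whether the standby can take over


def check_once(cluster: Cluster,
               is_running: Callable[[str], bool],
               try_repair: Callable[[str], bool]) -> str:
    """Run one health-check cycle and return the action that was taken."""
    if is_running(cluster.primary):
        return "healthy"
    # The primary has a problem: first try to resolve it in place.
    if try_repair(cluster.primary):
        return "repaired"
    # Repair failed or the node is down: fail over only if a standby exists.
    if not cluster.standby_available:
        return "no-standby"
    # Swap the recorded roles so that the standby becomes the new primary.
    cluster.primary, cluster.standby = cluster.standby, cluster.primary
    # The old primary is down, so no healthy standby remains for now.
    cluster.standby_available = False
    return "failover"


cluster = Cluster(primary="head1", standby="head2")
action = check_once(cluster, is_running=lambda host: False,
                    try_repair=lambda host: False)
print(action, cluster.primary)  # failover head2
```

In this sketch the health probe and the repair step are passed in as functions, which keeps the decision logic separate from the system calls that a real controller would make.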

To enable the FC, run the following command:
/usr/ibmpacks/IBM-Big_SQL/current/bigsql-cli/bigsql-admin -failoverController Enable -startFC
To start the FC, run the following command:
/usr/ibmpacks/IBM-Big_SQL/current/bigsql-cli/bigsql-admin -startFC
To fully start Db2 Big SQL and the FC, run the following commands:
/usr/ibmpacks/IBM-Big_SQL/current/bigsql-cli/bigsql-admin -start

/usr/ibmpacks/IBM-Big_SQL/current/bigsql-cli/bigsql-admin -startFC
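
If you script the full start sequence, the order matters: Db2 Big SQL is started first, then the FC. The following Python sketch runs the two commands above in that order and stops at the first failure. It assumes that bigsql-admin returns a non-zero exit status when a step fails, which is not confirmed here; start_commands and run_all are hypothetical helper names.

```python
# Sketch: run the two documented start commands in order.
# Assumption (not confirmed by the documentation): bigsql-admin exits with a
# non-zero status when a step fails. Helper names are hypothetical.
import subprocess
from typing import Callable, List, Sequence

BIGSQL_ADMIN = "/usr/ibmpacks/IBM-Big_SQL/current/bigsql-cli/bigsql-admin"


def start_commands(admin: str = BIGSQL_ADMIN) -> List[List[str]]:
    """Return the documented commands: start Db2 Big SQL first, then the FC."""
    return [[admin, "-start"], [admin, "-startFC"]]


def run_all(commands: Sequence[Sequence[str]],
            runner: Callable[[Sequence[str]], int] = subprocess.call) -> bool:
    """Run each command in order; stop at the first non-zero exit status."""
    for cmd in commands:
        if runner(cmd) != 0:
            return False
    return True
```

On a head node, run_all(start_commands()) would issue both commands and return False if either one reported a failure. The runner parameter exists so that the sequence can be exercised without a Db2 Big SQL installation.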
To disable the FC, run the following command:
/usr/ibmpacks/IBM-Big_SQL/current/bigsql-cli/bigsql-admin -failoverController Disable
For more information, see Db2 Big SQL cluster administration utility.

TSA is disabled as part of the upgrade process. When you enable HA after upgrading, FC functionality becomes active.

The following required Python modules are included in the bigsql-dist package and are installed automatically:
  • Kazoo, a Python API for ZooKeeper
  • PyYAML, an interface for reading and writing the YAML configuration files that are used by the FC
  • Subprocess32, an interface for system calls (for example, db2 and db2pd commands)
  • Fasteners and monotonic, used for a lock mechanism by the script that runs db2pd commands
The bigsql command line interface accepts the -failoverController option with the start, stop, and status actions. For example:

bigsql start -failoverController
bigsql status -failoverController
bigsql stop -failoverController
These options can be used only on the head nodes.

For troubleshooting, the HA failover controller writes a log file to the Db2 Big SQL logging directory. This directory is specified in Ambari; the default is /var/ibm/bigsql/logs/.
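
When troubleshooting, it can help to see which files in the logging directory changed most recently. The following Python sketch lists them, newest first. The name of the FC log file is not given here, so the helper (newest_logs, a hypothetical name) does not assume one; pass a different directory if the Ambari setting was changed from the default.

```python
# Sketch: list the most recently modified files in the logging directory.
# The default path is the documented default; the FC log file name is not
# stated in the text, so this helper does not assume one.
from pathlib import Path
from typing import List

DEFAULT_LOG_DIR = "/var/ibm/bigsql/logs/"


def newest_logs(log_dir: str = DEFAULT_LOG_DIR, limit: int = 5) -> List[Path]:
    """Return up to `limit` regular files from log_dir, newest first."""
    directory = Path(log_dir)
    if not directory.is_dir():
        return []
    files = [p for p in directory.iterdir() if p.is_file()]
    files.sort(key=lambda p: p.stat().st_mtime, reverse=True)
    return files[:limit]
```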