Validating a GDPC through testing

A geographically dispersed Db2® pureScale® cluster (GDPC) keeps the cluster running even if one site fails completely. You can run several tests to verify that the GDPC is working properly.

Running these tests before a disaster strikes helps confirm that the GDPC is configured correctly and gives you a hands-on understanding of the expected behavior in each failure case:
  1. Soft member failure – kill -9 of the db2sysc process.
  2. Soft cluster cache facility (CF) failure, primary or secondary – kill -9 of the ca-server process.
  3. Hard member failure – reboot of the member LPAR.
  4. Hard CF failure, primary or secondary – reboot of the CF LPAR.
  5. Storage shutdown at one site.
  6. Shutdown of the tiebreaker site.
  7. Pull the cable for the private network that carries member-to-CF communication (over IB, RoCE, or TCP/IP) at one site.
  8. Shut down the IB or RoCE switch at one site.
  9. Pull the Ethernet cable at one site.
  10. If you are using IB, shut down the Longbow at one site.
  11. Site failure – shut down the LPARs, switches, Longbow, and storage at one site to simulate a total site failure.
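
The soft failure tests (1 and 2 in the list above) can be scripted. The following is a minimal sketch, not an IBM-supplied tool: the function names `find_pids` and `soft_fail` and the `DRY_RUN` variable are hypothetical, and it assumes that `pgrep` is available on the host and that you run it as a user permitted to signal the Db2 processes. Only `kill -9`, `db2sysc`, and `ca-server` come from the test list itself.

```shell
#!/bin/sh
# Sketch for tests 1 and 2 (soft member and soft CF failure).
# find_pids, soft_fail, and DRY_RUN are hypothetical names used for illustration.

# Print the PIDs of processes whose name exactly matches $1 (nothing if none,
# or if pgrep is not installed).
find_pids() {
  pgrep -x "$1" 2>/dev/null || true
}

# Simulate a soft failure by sending SIGKILL to every process named $1.
# With DRY_RUN=1 (the default) the kill commands are only printed.
soft_fail() {
  target="$1"
  pids=$(find_pids "$target")
  if [ -z "$pids" ]; then
    echo "no $target process found"
    return 1
  fi
  for pid in $pids; do
    if [ "${DRY_RUN:-1}" = "1" ]; then
      echo "would run: kill -9 $pid"
    else
      kill -9 "$pid"
    fi
  done
}

# Test 1: soft member failure (run on a member host).
#   DRY_RUN=0 soft_fail db2sysc
# Test 2: soft CF failure (run on the primary or secondary CF host).
#   DRY_RUN=0 soft_fail ca-server
```

Leaving `DRY_RUN` at its default lets you confirm which processes would be killed before running the test for real. After each test, checking member and CF states (for example with `db2instance -list`) before moving on to the next one helps keep the results of separate tests from overlapping.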

There are variations of site failure. Which variation you are testing depends on whether the failed site hosts the IBM® Spectrum Scale cluster manager, the file system manager, the RSCT group leader, the TSA master, or the primary or secondary CF.