IBM Support

Restrictions of automated adapter liveliness test

Question & Answer


Question

Db2 pureScale deployment has been simplified: pingable IP interfaces no longer need to be configured on high-speed interconnect switches, because the adapter port liveliness test is now automated. However, this automated adapter liveliness detection does not work on certain configurations.

When an adapter goes down, RSCT detects the failure and logs it in the system log as follows:

Sep  7 05:28:42 host1 daemon:notice cthats[7602322]: (Recorded using libct_ffdc.a cv 2):::Error ID: 6zV5DL.urMYP/X1M/8J.1h....................:::Reference ID: :::Template ID: 173c787f:::Details File:  :::Location: rsct,nim_control.C,1.39.1.43,5929             :::TS_LOC_DOWN_ST Possible malfunction on local adapter Adapter interface name hca0 Adapter offset 2 Adapter IP address 10.1.1.101

In turn, Db2 is notified and logs this event in db2diag.log:

2018-09-07-05.28.42.412244+540 E2325A594            LEVEL: Event
PID     : 14483680             TID : 542            PROC : ca-wdog 128 [db2inst1]
INSTANCE: db2inst1             NODE : 128
HOSTNAME: host1
EDUID   : 542                  EDUNAME: ca-wdog 128 [db2inst1]
FUNCTION: DB2 UDB, high avail services, rocmHCAMonitorCallback, probe:911
MESSAGE : ADM7537I  The status of the following adapter changed.  Adapter name:
          "hca0".  New status: "offline".  Number of adapters that are
          currently online: "1".  Host name: "host1".
CHANGE  : Communication adapter port

If an adapter is down but these entries are missing from both the system log and db2diag.log, then Db2 was unable to detect the adapter failure automatically. Depending on which adapter is affected, the expected symptoms include:

  1. The resource on the host with the down adapter being marked Failed Offline
  2. FODC_Panic of the member
  3. CF in ERROR state
  4. Total Cluster Outage
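
The check for the two log entries described above can be sketched in Python. This is a minimal illustration, not an IBM-provided tool: the function names are hypothetical, and the search strings (TS_LOC_DOWN_ST, ADM7537I, and the surrounding wording) are taken only from the sample entries in this document, so other releases may word them differently.

```python
import re

# Markers copied from the sample log entries in this document.
RSCT_DOWN_MARKER = "TS_LOC_DOWN_ST"
DB2_STATUS_MARKER = "ADM7537I"


def rsct_reported_down(syslog_text, adapter):
    """True if a system log line carries TS_LOC_DOWN_ST for this adapter."""
    name = re.compile(r"Adapter interface name\s+" + re.escape(adapter) + r"\b")
    return any(
        RSCT_DOWN_MARKER in line and name.search(line)
        for line in syslog_text.splitlines()
    )


def db2_reported_offline(db2diag_text, adapter):
    """True if db2diag.log holds an ADM7537I 'offline' message for this adapter."""
    # The message wraps across lines in db2diag.log, so collapse whitespace first.
    flat = " ".join(db2diag_text.split())
    pattern = re.compile(
        DB2_STATUS_MARKER
        + r'[^"]*Adapter name: "'
        + re.escape(adapter)
        + r'"\. New status: "offline"'
    )
    return pattern.search(flat) is not None


def missing_evidence(adapter, syslog_text, db2diag_text):
    """List the logs that lack the expected entry for a known-down adapter."""
    missing = []
    if not rsct_reported_down(syslog_text, adapter):
        missing.append("system log")
    if not db2_reported_offline(db2diag_text, adapter):
        missing.append("db2diag.log")
    return missing
```

If missing_evidence("hca0", ...) returns both log names for an adapter that is known to be down, the automated detection did not fire and the environment may be one of those listed under Answer.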

Cause

Because of limitations in some operating systems and platforms, the adapter state is not reported in a way that allows automated detection.

Answer

In the following environments, pingable IP interfaces must be configured so that Reliable Scalable Cluster Technology (RSCT) can monitor the network:

  • All Db2 pureScale supported environments with LHEA virtualisation
  • Intel and Power Linux environments with SEA virtualisation
  • Intel and Power Linux environments with SRIOV virtualisation

In these environments, refer to the IBM Db2 Knowledge Center for instructions on setting up the netmon.cf file properly.
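
As a rough illustration of what such a file contains, the Python sketch below builds netmon.cf entries. It assumes the `!REQD <interface> <target IP>` entry form used by RSCT; the helper name, the interface name ib0, and the address 10.1.1.1 are hypothetical placeholders. Confirm the exact syntax and the file location for your release in the Knowledge Center before using any generated line.

```python
import ipaddress


def netmon_lines(targets):
    """Build netmon.cf entries from {interface: [pingable IP, ...]}.

    Assumption: each entry has the form '!REQD <interface> <target IP>'.
    """
    lines = []
    for interface, ips in targets.items():
        for ip in ips:
            # Raises ValueError if the address is malformed.
            ipaddress.ip_address(ip)
            lines.append(f"!REQD {interface} {ip}")
    return lines
```

For example, netmon_lines({"ib0": ["10.1.1.1"]}) yields the single entry `!REQD ib0 10.1.1.1`, where 10.1.1.1 stands in for a pingable IP interface reachable from that adapter.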

Applies to: Db2 for Linux, UNIX and Windows V11.1 (FP4+) and V11.5 (all versions) on AIX and Linux.

Product Synonym

Db2 pureScale

Document Information

Modified date:
20 June 2019

UID

ibm10733765