It seems that the logic for the Failure Detection Time (FDC) of TSA (RSCT) has changed recently, so I would like to confirm my understanding.
As far as I know, in earlier releases of TSA the FDC could be calculated as "HeartBeatPeriod * Sensitivity * 2".
We could also check it against the "trip interval" value in the output of the "lssrc -ls cthats" command.
However, the "trip interval" value has changed in the latest release, as shown below.
>HB Interval = 1.000 secs. Sensitivity = 4 missed beats
>Missed HBs: Total: 0 Current group: 0
>Packets sent : 576112 ICMP 0 Errors: 0 No mbuf: 0
>Packets received: 749092 ICMP 0 Dropped: 0
>NIM's PID: 2527
> 2 locally connected Clients with PIDs:
> rmcd( 1945) hagsd( 2545)
> Dead Man Switch Enabled:
> reset interval = 1 seconds
> trip interval = 16 seconds
> Watchdog module in use: vmwatchdog
> Client Heartbeating Enabled. Period: 8 secs. Timeout: 16 secs.
It seems that the Client Heartbeating Period is equal to the FDC,
and the Timeout seems to be twice the Period.
Does this mean that detecting a heartbeat failure now takes twice as long as in the former release?
Is the value of the Client Heartbeating Period always the same as the FDC?
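To make the arithmetic behind the question concrete, here is a minimal Python sketch that reads the two relevant lines from the output quoted above and applies the formula "HeartBeatPeriod * Sensitivity * 2". The function names and the regular expressions are illustrative assumptions, not part of TSA or RSCT; the sample text is copied from the quoted output.

```python
import re

# Two lines copied from the "lssrc -ls cthats" output quoted in the question.
SAMPLE = """\
HB Interval = 1.000 secs. Sensitivity = 4 missed beats
Client Heartbeating Enabled. Period: 8 secs. Timeout: 16 secs.
"""


def parse_heartbeat_settings(text):
    """Extract heartbeat interval, sensitivity, and client period/timeout.

    Hypothetical helper: the patterns only match the line layout shown
    in the sample above.
    """
    hb = re.search(
        r"HB Interval = ([\d.]+) secs\. Sensitivity = (\d+) missed beats", text
    )
    client = re.search(r"Period: (\d+) secs\. Timeout: (\d+) secs\.", text)
    return {
        "interval": float(hb.group(1)),
        "sensitivity": int(hb.group(2)),
        "client_period": int(client.group(1)),
        "client_timeout": int(client.group(2)),
    }


def legacy_fdc(interval, sensitivity):
    """Formula given in the question: HeartBeatPeriod * Sensitivity * 2."""
    return interval * sensitivity * 2


settings = parse_heartbeat_settings(SAMPLE)
fdc = legacy_fdc(settings["interval"], settings["sensitivity"])

print(fdc)                                    # 8.0
print(fdc == settings["client_period"])       # True
print(settings["client_timeout"] == 2 * fdc)  # True
```

With the values shown (1.000 secs, 4 missed beats) the formula gives 8 seconds, which matches the reported Period, while the reported Timeout and "trip interval" are both 16 seconds.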
Failure Detection Time(FDC) of TSA(RSCT)
Updated on 2012-04-27T03:47:59Z by amanabe
sedgewick_de 1000004PXP36 Posts
Re: Failure Detection Time(FDC) of TSA(RSCT)
2012-04-26T09:08:38Z
This is the accepted answer.
Manabe-san,
it seems I missed this post; my apologies for that.
TSA/RSCT has been equipped with an additional grace period. If the number of consecutive missed heartbeats exceeds the sensitivity, TSA/RSCT sends an ICMP ping to its neighbor and waits for a response for the time specified as the grace period. If a response is received, the node is not considered dead, only too slow in responding to heartbeat packets.
This grace period handling is what extends the time before a node is considered dead. The default grace period is -1, which means that HATS, the TSA/RSCT component involved, computes the period from the heartbeat sensitivity and frequency. Any positive number is taken as a number of seconds, and the value 0 deactivates the grace ICMP ping. In that last case, the former formula for computing the FDC still holds.
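The three grace period cases described above can be sketched in Python as follows. This is only an illustration under stated assumptions: the function name is hypothetical, and treating the grace period as simply added on top of the former failure detection time is an assumption, not something confirmed in this thread. The formula HATS uses for the default value -1 is not given here, so the sketch returns None for that case rather than guessing.

```python
def worst_case_detection_time(base_fdc, grace_period):
    """Estimate the time before a node is declared dead.

    base_fdc     -- former value, HeartBeatPeriod * Sensitivity * 2 (seconds)
    grace_period -- -1 (computed by HATS), 0 (grace ping disabled),
                    or a positive number of seconds

    Assumption: a positive grace period is added to base_fdc.
    """
    if grace_period == 0:
        # Grace ICMP ping deactivated: the former calculation applies unchanged.
        return base_fdc
    if grace_period > 0:
        # Explicit grace period in seconds (additive model is an assumption).
        return base_fdc + grace_period
    if grace_period == -1:
        # HATS derives the value from sensitivity and frequency; the exact
        # formula is not stated in this thread, so no number is returned.
        return None
    raise ValueError("grace period must be -1, 0, or a positive number")


print(worst_case_detection_time(8, 0))   # 8
print(worst_case_detection_time(8, 5))   # 13
print(worst_case_detection_time(8, -1))  # None
```

The values 8 and 5 are chosen purely for illustration; only the case with grace period 0 is stated in the answer to reproduce the former result exactly.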