A fix is available
APAR status
Closed as program error.
Error description
IXC633I GROUP SYSGRS MEMBER xxx JOB GRS ASID 0007 628
 CONFIRMED IMPAIRED AT xx/xx/xxxx xx:xx:xx.xxxxxx ID: 1.2
IXC636I GROUP SYSGRS MEMBER H019 JOB GRS ASID 0007
 IMPAIRED, IMPACTING CRITICAL FUNCTION GLOBAL ENQ PROCESSING
*IXC635E SYSTEM xxxx HAS IMPAIRED XCF GROUP MEMBERS
IXC615I GROUP SYSGRS MEMBER xxxx JOB GRS ASID 0007
 SFM TERMINATING SYSTEM TO RELIEVE IMPAIRMENT CONDITION

The above messages are seen when GRS's status exit informs XCF that it is impaired. It makes this determination after examining the ISGQDR and ISGWDR tasks in the GRS address space. In some cases, these tasks may not be running because of another system in the SYSPLEX and not because of THIS system. This erroneous assumption causes XCF to partition this system out of the SYSPLEX, leaving the system that is really causing the problem in place, so that SYSPLEX SYMPATHY continues.

In the case noted in the field, another system was capped at a very low value by mistake. This low capping slowed signal processing, particularly for the SYSGRS group. However, the system was still able to update its status, so SFM did not take action against this low-capped system. On another system, GRS needed to process the LIST DRAIN. This meant that the LISTLOCK was obtained and then signals were sent to all the other systems, including the low-capped system. When that system failed to respond in a timely manner, the system initiating the LIST DRAIN appeared to be impaired and action was taken against it. This system was really just waiting for a response from the low-weighted system, but this "wait" caused GRS to inform XCF that it appeared to be impaired, and XCF therefore removed this system from the plex.

In the case where the system is waiting for the LIST DRAIN, GRS should NOT consider itself impaired. It is still possible that this system is the problem; however, GRS cannot make that determination definitively, and so it should not make any determination at all.
The problem here is that a perfectly healthy system was removed in an attempt to relieve the impairment. That action provided no relief at all and was actually more detrimental, since a perfectly healthy system, which was the victim of another system, was removed.

VERIFICATION STEPS:
-------------------
1) If a SADMP is taken of the system that issued the IXC633I message for SYSGRS, check whether the following two bits are on:
     SST_HoldForListDrain
     SST_ListLockOwned
   The SST is located via CVT+1B0?+10?+204? (cvt --> gvt --> gvtx --> sst)

2) Check whether there are signals outstanding to other systems via "IP XESDATA CONNECTION STR(ISGLOCK) DETAIL". This output contains a timestamp for "pending" signals to other systems. Compare it with the time the IXC633I was issued to see whether a response is slow in arriving.

3) Even after this system was removed via SFM, problems persist and the root cause is determined to be another system in the SYSPLEX.

If the above 3 match, then this is your problem. This APAR will prevent GRS from making the assumption that it is impaired. GRS still cannot determine which system is causing the issue; however, making no decision is better than making the wrong decision. This will prevent a perfectly healthy system from being removed from the SYSPLEX.
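The three verification steps above can be expressed as a small checklist. The following Python sketch is illustrative only and is not an IBM tool: the `DumpObservations` record, the `matches_apar` function, and the 30-second `slow_threshold` are hypothetical names and values chosen for this example. The analyst is assumed to have read the two SST bits and the pending-signal timestamp out of the dump by hand.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class DumpObservations:
    """Values an analyst has read manually from the SADMP (hypothetical record)."""
    hold_for_list_drain: bool                  # SST_HoldForListDrain was on
    list_lock_owned: bool                      # SST_ListLockOwned was on
    oldest_pending_signal: Optional[datetime]  # "pending" timestamp from XESDATA output
    ixc633i_time: datetime                     # time the IXC633I message was issued
    problems_persist_after_removal: bool       # step 3: removal gave no relief


def matches_apar(obs: DumpObservations,
                 slow_threshold: timedelta = timedelta(seconds=30)) -> bool:
    """Return True only if all three verification steps match.

    slow_threshold is an assumed value; the APAR does not define what
    counts as a slow response.
    """
    # Step 1: both SST bits must be on.
    flags_on = obs.hold_for_list_drain and obs.list_lock_owned
    # Step 2: a signal was already pending well before IXC633I was issued.
    slow_response = (
        obs.oldest_pending_signal is not None
        and obs.ixc633i_time - obs.oldest_pending_signal >= slow_threshold
    )
    # Step 3: problems continued after SFM removed the system.
    return flags_on and slow_response and obs.problems_persist_after_removal
```

Requiring all three conditions mirrors the wording "if the above 3 match"; a dump that satisfies only one or two of them should not be attributed to this APAR on that basis alone.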
Local fix
Problem summary
****************************************************************
* USERS AFFECTED: All users of HBB7780 and above in            *
*                 GRS Star Mode                                *
****************************************************************
* PROBLEM DESCRIPTION: ISGXSTAX did not recognize the          *
*                      SST_HoldForListDrain flag as an         *
*                      indication that the delay is due to     *
*                      the list lock as opposed to the system  *
*                      being sick. As a result of this, it     *
*                      was possible for a healthy system       *
*                      to declare itself impaired when it      *
*                      was waiting for another (sick)          *
*                      system.                                 *
****************************************************************
* RECOMMENDATION:                                              *
****************************************************************
See Problem Description
Problem conclusion
ISGXSTAX now takes SST_HoldForListDrain into account when ISGWDR appears stalled. If SST_HoldForListDrain is on, then ISGXSTAX does not declare itself impaired.

KEYWORDS: GRSSTAR/K
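The change in behavior can be pictured as a single extra condition in the impairment decision. This Python sketch is a simplified model and not the actual ISGXSTAX logic: the function name `declares_impaired` and its parameters are hypothetical, and the real exit examines more state than these two booleans.

```python
def declares_impaired(task_stalled: bool,
                      hold_for_list_drain: bool,
                      fix_applied: bool = True) -> bool:
    """Simplified model of the status exit's impairment decision.

    task_stalled        -- the monitored task (e.g. ISGWDR) appears stalled
    hold_for_list_drain -- SST_HoldForListDrain is on
    fix_applied         -- model the behavior with (True) or without (False) the fix
    """
    if not task_stalled:
        # Nothing looks wrong, so there is no impairment to report.
        return False
    if fix_applied and hold_for_list_drain:
        # The stall is explained by waiting on the list drain, which may be
        # caused by another system, so no determination is made.
        return False
    return True


# Stalled while holding for list drain: impaired before the fix, not after.
before_fix = declares_impaired(True, True, fix_applied=False)
after_fix = declares_impaired(True, True, fix_applied=True)
```

A stall without SST_HoldForListDrain set is still reported as impaired in this model, which matches the statement that the fix only suppresses the declaration when the flag is on.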
Temporary fix
Comments
APAR Information
APAR number
OA47993
Reported component name
CROSS SYS.EXT.S
Reported component ID
5752SCIXL
Reported release
790
Status
CLOSED PER
PE
NoPE
HIPER
NoHIPER
Special Attention
NoSpecatt
Submitted date
2015-06-02
Closed date
2015-08-12
Last modified date
2015-09-01
APAR is sysrouted FROM one or more of the following:
APAR is sysrouted TO one or more of the following:
UA78606 UA78607 UA78608
Modules/Macros
ISGXSTAX
Fix information
Fixed component name
GRS
Fixed component ID
5752SCSDS
Applicable component levels
R7A0 PSY UA78606
UP15/08/26 P F508
R780 PSY UA78607
UP15/08/26 P F508
R790 PSY UA78608
UP15/08/26 P F508
Fix is available
Select the PTF appropriate for your component level. You will be required to sign in. Distribution on physical media is not available in all countries.
Document Information
Modified date:
01 September 2015