APAR status
Closed as program error.
Error description
When a PowerHA/GLVM cluster is inactive, all Remote Physical Volume (RPV) client and server devices are in the Defined state on all nodes. When the cluster is started and the first node to join acquires an RG with GMVG_REP_RESOURCE resources, it starts a cl_sync_vgs process after the GLVM VG is varied on, even though a syncvg process cannot succeed while the RPV server devices on the remote node are still in the Defined state. Once a remote cluster node starts and acquires the RG into the 'ONLINE SECONDARY' state (the RPV server devices are made Available), the primary node is supposed to start another cl_sync_vgs process. In large environments, however, the first cl_sync_vgs process can still be running at that point (several syncvg processes have already failed), so no new cl_sync_vgs process is started:
+[RG]:glvm_join_cleanup[285] /usr/bin/ps -eo pid,args
+[RG]:glvm_join_cleanup[285] grep -w cl_sync_vgs
+[RG]:glvm_join_cleanup[285] grep -vw grep
+[RG]:glvm_join_cleanup[285] grep -w datavg
+[RG]:glvm_join_cleanup[285] 1> /dev/null 2>& 1
+[RG]:glvm_join_cleanup[286] ((0!=0))
If that happens, the LVs of the GLVM VG remain unsynchronized (stale).
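The trace above suggests a check of roughly the following shape. This is a minimal sketch reconstructed from the traced commands only, not the actual glvm_join_cleanup source; the function names sync_in_progress and decide_sync are hypothetical, and the process listing is read from standard input so the logic can be tried without a cluster.

```shell
# Hypothetical sketch: reproduces the grep pipeline seen in the trace.
# Reads a "pid args" listing on stdin (as printed by: ps -eo pid,args).
# Returns 0 when a cl_sync_vgs process for the given VG is listed.
sync_in_progress() {
    vg=$1
    grep -w cl_sync_vgs | grep -vw grep | grep -w "$vg" >/dev/null 2>&1
}

# Prints "skip" when a sync process already exists for the VG, "start" otherwise.
# In the trace the pipeline returned 0, so ((0!=0)) was false and no new
# cl_sync_vgs was started, although the running one could not succeed.
decide_sync() {
    if sync_in_progress "$1"; then
        echo "skip"
    else
        echo "start"
    fi
}

# Possible use on a node (assumption, not taken from the product scripts):
#   /usr/bin/ps -eo pid,args | decide_sync datavg
```

The sketch shows why a long-running first cl_sync_vgs masks the second one: the check only asks whether a process exists, not whether it can still complete.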
Local fix
Manually synchronize any stale LVs.
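One way to find and resynchronize stale LVs is sketched below. It assumes the usual column layout of lsvg -l output (LV STATE in the sixth column) and uses datavg, the VG name from the trace, only as an example; the helper name list_stale_lvs is hypothetical. Run the sync only after the RPV server devices on the remote node are Available.

```shell
# Hypothetical helper: reads "lsvg -l <vg>" output on stdin and prints the
# names of LVs whose LV STATE column contains "stale".
# Assumes line 1 is the "<vg>:" title and line 2 is the column header.
list_stale_lvs() {
    awk 'NR > 2 && $6 ~ /stale/ { print $1 }'
}

# Possible use on the primary node (not executed here):
#   lsvg -l datavg | list_stale_lvs      # show stale LVs, if any
#   syncvg -v datavg                     # resynchronize the whole VG
#   syncvg -l <lvname>                   # or resynchronize a single LV
```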
Problem summary
In larger environments with a GLVM configuration, cl_sync_vgs is triggered at cluster start as soon as the primary node is online, without checking whether the RPV servers on the remote nodes are still in the Defined state. As a result, cl_sync_vgs fails and the LVs remain unsynchronized.
Problem conclusion
As a fix, cl_sync_vgs now checks that the state of the RG is ACQUIRING SECONDARY or ONLINE SECONDARY before proceeding with the sync, which avoids the failure scenario.
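The gating logic described above can be pictured as follows. This is only an illustrative sketch, not the shipped cl_sync_vgs code: the function name rg_state_allows_sync is hypothetical, and how the RG state is obtained is not shown in this APAR, so the state is passed in as a plain string.

```shell
# Hypothetical sketch of the state gate: returns 0 only for the two RG
# states named in the fix description, 1 for anything else.
rg_state_allows_sync() {
    case "$1" in
        "ACQUIRING SECONDARY"|"ONLINE SECONDARY") return 0 ;;
        *) return 1 ;;
    esac
}

# Possible use (state lookup left out because the APAR does not describe it):
#   if rg_state_allows_sync "$rg_state"; then
#       : # proceed with the sync
#   fi
```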
Temporary fix
Comments
APAR Information
APAR number
IJ45674
Reported component name
POWERHA SYSMIR
Reported component ID
5765H3900
Reported release
727
Status
CLOSED PER
PE
NoPE
HIPER
NoHIPER
Special Attention
NoSpecatt / Xsystem
Submitted date
2023-03-02
Closed date
2023-03-02
Last modified date
2023-03-02
APAR is sysrouted FROM one or more of the following:
APAR is sysrouted TO one or more of the following:
Fix information
Fixed component name
POWERHA SYSMIR
Fixed component ID
5765H3900
Applicable component levels
Document Information
Modified date:
02 March 2023