IBM Support

3 Ways to resolve CWRLS0030W

Technical Blog Post


Background:
High Availability (HA) Manager is a framework that allows singleton services to make themselves highly available. Transaction Log Recovery is one of the HA Manager users.

Every WebSphere Application Server process is a member of a HA Manager DCS CoreGroup. During server (cluster member) startup, transaction manager and other server components will also get started. Transaction Manager should have exclusive ownership of its transaction recovery log file before it can initialize successfully. HA Manager assigns the ownership to the Transaction Manager only when all the servers in the coregroup are discovered and connected.

When a server hang, an out of memory (OOM) condition, a network issue, or another unexpected problem prevents the servers in the coregroup from connecting to each other, HA Manager does not assign ownership of the transaction recovery log file until the problem is resolved. In the meantime, the message "CWRLS0030W: Waiting for HAManager to activate recovery processing" is written to the SystemOut.log file.

Solution 1:
Look for a DCSV8030I message just before or after the CWRLS0030W message. Restart the servers listed in the ConnectedSetAdditional and ConnectedSetMissing lists. These lists contain the servers that have a problem connecting to the failing server. Please review the document "CWRLS0030W message continuously logged and WebSphere Application Server fails to open for e-business" for more information.
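The search described above can be sketched in a few lines of Python. This is only an illustration: the function names and the 20-line window are assumptions, and the script matches on the message IDs alone without relying on the rest of the log line format.

```python
def find_nearby_dcsv8030i(lines, window=20):
    """For every line containing CWRLS0030W, list the indexes of
    DCSV8030I lines that appear within `window` lines before or after it.

    Returns a list of (cwrls_index, [dcsv_indexes]) tuples (0-based).
    The window size is an arbitrary assumption; widen it if needed.
    """
    dcsv_indexes = [i for i, line in enumerate(lines) if "DCSV8030I" in line]
    results = []
    for i, line in enumerate(lines):
        if "CWRLS0030W" in line:
            nearby = [j for j in dcsv_indexes if abs(j - i) <= window]
            results.append((i, nearby))
    return results


def scan_file(path, window=20):
    """Convenience wrapper: read a SystemOut.log file and scan it."""
    with open(path, errors="replace") as handle:
        return find_nearby_dcsv8030i(handle.read().splitlines(), window)
```

Calling scan_file on the SystemOut.log of the failing server returns the positions of each CWRLS0030W message and of any DCSV8030I messages close to it. An empty inner list means that no DCSV8030I message was found nearby.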

Note: If you don't see a DCSV8030I message before or after the CWRLS0030W message, look for an inconsistent view in the DCSV8050I message (AV, CN and CD not the same; see the blog entry "Top 10 things to know about High Availability Manager (HAManager) in WebSphere Application Server" for more info).

a) Stable view (AV, CN and CD are the same)

This may be a legitimate transaction problem in which the transaction logs (tranlogs) are not being released. It can happen in a clustered environment where more than one JVM is part of the cluster. If peer recovery is enabled and one JVM hangs during peer recovery processing, all other JVMs in that cluster must wait until the hanging JVM releases its lock on the tranlog. It is therefore worth checking whether another hanging JVM led to the issue. As a workaround, disable peer recovery processing in the cluster and set the JVM custom property com.ibm.ws.recoverylog.disableNonHARegistration to true on all JVMs. This is only a workaround: the real solution is to find the root cause of the JVM hang. Instructions to disable peer recovery can be found at:

https://www.ibm.com/support/knowledgecenter/SS7K4U_9.0.5/com.ibm.websphere.zseries.doc/ae/tjta_cfgpeer.html


b) Unstable view (AV, CN and CD are not the same)

Make sure HA Manager is enabled on all servers in the coregroup. We have seen this issue when HA Manager is enabled on just the problematic server and disabled on all other servers in the coregroup.
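Deciding between case a) and case b) comes down to comparing the three counters in the DCSV8050I message. The following Python sketch shows one way to do that. It assumes that the counters appear in the log line in the form "AV=3, CN=3, CD=3"; verify this against your own SystemOut.log before relying on it. The function name is hypothetical.

```python
import re

# Assumed layout of the counters inside a DCSV8050I line, e.g.
# "... view size is 3 (AV=3, CN=3, CD=3, RS=0)". Adjust if your logs differ.
VIEW_COUNTERS = re.compile(r"AV=(\d+),\s*CN=(\d+),\s*CD=(\d+)")


def view_is_stable(log_line):
    """Return True when AV, CN and CD are all equal (stable view),
    False when they differ (unstable view), and None when the line
    does not contain the counters at all."""
    match = VIEW_COUNTERS.search(log_line)
    if match is None:
        return None
    av, cn, cd = (int(value) for value in match.groups())
    return av == cn == cd
```

A result of True points to case a), a result of False points to case b).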

Solution 2:
This solution applies only when another server in the coregroup has an OutOfMemory (OOM) issue. The OOM server can't discover or communicate with the other servers running in the coregroup. If the server fails to start with CWRLS0030W due to an OOM condition on another server, set the coregroup custom property IBM_CS_OOM_ACTION to Isolate.

To create the custom property:

  • Click Servers => Core Groups => Core group settings => CORE_GROUP_NAME => Custom properties => New
  • Enter IBM_CS_OOM_ACTION in the Name field and Isolate in the Value field
  • Save and synchronize the changes, then restart the servers

About the property:
Use this custom property to explicitly enable exception handlers that are specific for OutOfMemoryExceptions that occur when sending or receiving network messages. When this property is set to Isolate, if an OutOfMemoryException occurs when a network message is being sent or received, these exception handlers stop High Availability Manager communications to the Out of Memory process.

For more information, please review APAR: PM27892 Server fails to start with CWRLS0030W due to Out Of Memory condition in another process in cell

Solution 3:
The com.ibm.ws.recoverylog.disableNonHARegistration custom property stops the recovery log service from registering with HA Manager when the "High availability transaction log" feature is not enabled. This means that file locking is not used and the recovery log service no longer depends on HA Manager. This custom property is available only in versions 7.0.0.31, 8.0.0.8, 8.5.5.2 and above.
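The version requirement can be expressed as a small check. The Python sketch below is one possible reading of "and above": each listed fix pack is treated as the minimum for its own release stream (7.0, 8.0, 8.5), and any stream later than 8.5 is assumed to include the property. That interpretation and the function name are assumptions, not something stated in the APAR.

```python
# Minimum fix pack per release stream, taken from the versions listed above.
MINIMUM_FIX_LEVELS = {
    (7, 0): (7, 0, 0, 31),
    (8, 0): (8, 0, 0, 8),
    (8, 5): (8, 5, 5, 2),
}


def supports_disable_non_ha_registration(version):
    """Return True if a four-part version string such as "8.5.5.2" is at or
    above the minimum level for its release stream.

    Assumption: streams newer than 8.5 (for example 9.0) include the
    property; streams older than 7.0 do not.
    """
    parts = tuple(int(part) for part in version.split("."))
    if len(parts) != 4:
        raise ValueError("expected a version like 8.5.5.2, got %r" % version)
    stream = parts[:2]
    if stream in MINIMUM_FIX_LEVELS:
        return parts >= MINIMUM_FIX_LEVELS[stream]
    return stream > (8, 5)
```

For example, supports_disable_non_ha_registration("8.5.5.1") returns False, while the same call with "8.5.5.2" returns True.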

To create the custom property:

  • Click Servers => Server Types => WebSphere application servers => server_name, then under Server Infrastructure click Java and process management => Process definition => Java virtual machine => Custom properties => New
  • Enter com.ibm.ws.recoverylog.disableNonHARegistration in the Name field and true in the Value field
  • Save and synchronize the changes, then restart the servers

For more information, please review APAR: PM95664 Unexpected results of transaction recovery service in relation to HA MGR



UID

ibm11081071