Problems and solutions for preventing WebSphere eXtreme Scale containers from hanging during shutdown

When multiple WebSphere eXtreme Scale container members are shut down concurrently, some containers may become overloaded with data, causing one or more container members to run out of memory and terminate ungracefully. If this occurs, the remaining WebSphere eXtreme Scale container members may hang during shutdown.

Symptoms

When multiple WebSphere eXtreme Scale container members are shut down concurrently, some containers may become overloaded with data, causing one or more container members to run out of memory and terminate ungracefully. If this occurs, the remaining WebSphere eXtreme Scale container members may hang during shutdown.

To prevent this from occurring, follow the procedure below when shutting down the WebSphere eXtreme Scale cluster if either of the following conditions was true when the B2BAC cluster was last in use:
  • Memory utilization in the eXtreme Scale grid was high (60% or more).
  • AS4 Microservices is shutting down in an interval in which either the low or high memory watermark thresholds had been reached. See Configuring the data grid cache memory watermarks for more information about memory watermarks.

Resolving the problem

To gracefully shut down an AS4 Microservices cluster in which the WebSphere eXtreme Scale catalog members have high memory utilization:

  1. Shut down each information member.
  2. Shut down each operational member.
  3. On each AS4 Microservices node running Linux with WXSCatalog enabled, navigate to the following directory in a command line program: cd <b2bacinstall>/Members/WXSCatalog/bin, in which <b2bacinstall> is the AS4 Microservices install directory.
    Important: The directory above is correct for Linux. When running AS4 Microservices on Windows, navigate to the following directory in a command line program: cd <b2bacinstall>\Members\WXSCatalog\bin, in which <b2bacinstall> is the AS4 Microservices install directory.
  4. Determine the username and password of the WXS griduser.
    Note: To do this, open the /Members/resources/SystemConfigurationXSLoader.properties file (for Linux) or the \Members\resources\SystemConfigurationXSLoader.properties file (for Windows) in a text editor or viewer. A line with the format gridUser= defines the username, and a line with the format gridPassword= contains the encoded version of the grid user's password.
  5. On the catalog member, execute the following teardown command to ensure graceful termination of catalogs and containers: <xscmd> -c teardown -user <UserName> -pwd <DecodedPassword> -sl <CONTAINER_1,CONTAINER_2,...,CONTAINER_N>, where:
    • <xscmd> is ./xscmd.sh on Linux and ./xscmd.bat on Windows
    • <UserName> is the user name obtained in step 4
    • <DecodedPassword> is the decoded version of the password obtained in step 4
    • <CONTAINER_1,CONTAINER_2,...,CONTAINER_N> is a comma-delimited list of the container names in the cluster.
    Note: To obtain the names of your containers, view the member.xml file located at <b2bacinstall>/Members/WXSContainer/usr/servers/WXSContainer/member.xml (for Linux) or <b2bacinstall>\Members\WXSContainer\usr\servers\WXSContainer\member.xml (for Windows) on each B2BAC node with a WXSContainer member installed. For example, the following member.xml defines a container named CONTAINER_3:
    <?xml version="1.0" encoding="UTF-8"?>
    <server description="MEGMember configuration">
          <com.company.b2b.system.config.member member.id="0154bff0cf07_MBR"
                                                member.name="CONTAINER_3"  member.hostName="b2bac-host"
                                                member.rootPath="/b2bacInstall/Members/bin/../WXSContainer/usr/servers/WXSCon
                                                member.memberType="CONTAINER"  member.version="1.0.0.4"/>
    </server>
  6. Follow the command-line prompts from the xscmd command, an example of which is excerpted below:
    ...SNIP... 
    CWXSI0068I: Executing command: teardown 
    ...SNIP... 
    ***The     following servers are being stopped: 
    CONTAINER_1 
    CONTAINER_2 
    ...SNIP... 
    Do you want to tear down the listed servers? (Y/N) 
    To confirm the teardown, type Y and press ENTER or RETURN. A confirmation such as the following example is displayed:
    Server Result Message 
    ------ ------ ------- 
    CONTAINER_1 succeeded
    CONTAINER_2 succeeded 
    ...SNIP... 
    CWXSI0040I: The teardown command completed successfully.
    ...SNIP... 
  7. Verify that the grids have been shut down by navigating to the following directory in a command line program: cd <b2bacinstall>/Members/bin, where <b2bacinstall> is the directory in which AS4 Microservices was installed.
    Next, run the <execute> grid status all command, where <execute> is ./execute on Linux and ./execute.bat on Windows. Then, verify that the output of the command is as follows:
    Executing status command...  
         Grid Name            Grid Status 
         IdentityGrid         UNKNOWN  
         MegBase              UNKNOWN  
         MegComms             UNKNOWN  
         MegInfrastructure    UNKNOWN 
    Command completed successfully. 
  8. Shut down each container member using the normal mechanism.
  9. Shut down each catalog member.
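
The lookups in steps 4, 5, and 7 can be scripted on Linux. The following is a minimal sketch, not part of the product: the function names get_property, container_list, and all_grids_unknown are hypothetical, and the sketch assumes the file and output formats shown in the examples above. It does not decode the grid password; the value it reads for gridPassword is still the encoded version.

```shell
#!/bin/sh
# Hypothetical helpers for steps 4, 5, and 7. These names are illustrative only
# and are not provided by AS4 Microservices or WebSphere eXtreme Scale.

# Step 4: print the value of KEY (for example, gridUser) from a properties file.
# Usage: get_property KEY FILE
get_property() {
    # Strip carriage returns, then print what follows "KEY=" on the first match.
    tr -d '\r' < "$2" | sed -n "s/^$1=//p" | head -n 1
}

# Step 5: print the member.name value from each member.xml file given,
# joined with commas in the form used for the -sl option.
# Usage: container_list FILE...
container_list() {
    sed -n 's/.*member\.name="\([^"]*\)".*/\1/p' "$@" | paste -s -d , -
}

# Step 7: read the output of "grid status all" on standard input. Succeed only
# if at least one grid row is present and every grid row reports UNKNOWN.
# Usage: ./execute grid status all | all_grids_unknown
all_grids_unknown() {
    awk '
        # Skip the banner, the column header, the completion line, and blanks.
        /Executing status command/ || /Grid Name/ || /Command completed/ || NF == 0 { next }
        # Every remaining line is treated as a grid row: name, then status.
        { rows++; if ($NF != "UNKNOWN") bad++ }
        END { if (rows > 0 && bad == 0) exit 0; exit 1 }
    '
}
```

Because member.xml exists on each node with a WXSContainer member installed, container_list only produces the full list if it is given a copy of the file from every such node. The value printed by container_list can be passed to the -sl option of the teardown command in step 5, and all_grids_unknown returns a non-zero exit status if any grid still reports a status other than UNKNOWN.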