[UNIX, Linux, Windows, IBM i]

Applications not balancing correctly

Many symptoms relating to application balancing can be diagnosed using the DISPLAY APSTATUS command in various ways.

DIS APSTATUS(X) TYPE(APPL)

Symptom

The expected application is not listed.

Solution
  • Verify the APPLTAG field is set correctly, either in code, or when the application is started.
  • Check the other applications listed in the DIS APSTATUS(*) output for unexpected names, which can result from the name being formed incorrectly or falling back to a default.
  • Try running the command DIS APSTATUS(X) TYPE(LOCAL) where(MOVABLE eq NO) on each queue manager in the uniform cluster, to look for application instances that cannot be distributed around the uniform cluster.
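
The per-queue-manager check in the last item can be scripted. The following is a minimal sketch, not a supported tool. It assumes that runmqsc is on the PATH, that every queue manager is local to the host where the script runs, and that the user is authorized to administer them. The queue manager names and the application name in the usage note are placeholders, and the helper names (immovable_command, run_mqsc, check_each_qmgr) are hypothetical.

```python
import subprocess
from typing import Callable, Dict, Iterable


def immovable_command(app_name: str) -> str:
    """Build the MQSC command that lists immovable instances of one application."""
    return f"DIS APSTATUS({app_name}) TYPE(LOCAL) WHERE(MOVABLE EQ NO)"


def run_mqsc(qmgr: str, command: str) -> str:
    """Pipe one MQSC command into runmqsc for a local queue manager.

    Assumption: runmqsc is installed and reads MQSC commands from stdin.
    """
    completed = subprocess.run(
        ["runmqsc", qmgr],
        input=command + "\n",
        capture_output=True,
        text=True,
        check=False,  # inspect the output even if runmqsc reports an error
    )
    return completed.stdout


def check_each_qmgr(
    qmgrs: Iterable[str],
    app_name: str,
    runner: Callable[[str, str], str] = run_mqsc,
) -> Dict[str, str]:
    """Run the immovable-instance query on every queue manager.

    Returns the raw command output keyed by queue manager name.
    """
    command = immovable_command(app_name)
    return {qmgr: runner(qmgr, command) for qmgr in qmgrs}
```

For example, check_each_qmgr(["QM1", "QM2", "QM3"], "MYAPP") returns the raw output for each queue manager. The runner parameter exists only so that the command execution can be swapped out, for instance when the queue managers have to be reached in client mode, which this sketch does not cover.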

Symptom

The expected total number of applications is not listed.

Solution
  • Verify you are actually launching the expected number of instances to connect to the uniform cluster.
  • Verify that the uniform cluster is communicating correctly and all queue managers are reporting application counts in DIS APSTATUS(X) TYPE(QMGR).

Symptom

The expected total number of applications is listed, but some applications are flagged as not movable.

Solution

On each queue manager in the uniform cluster, use DIS APSTATUS(X) TYPE(LOCAL) where(MOVABLE eq NO) and investigate the IMMREASN field.

Symptom

The balanced state is UNKNOWN.

Solution

This is a temporary state, and will resolve itself shortly. Retry the command in a while.

Symptom

The balanced state is NOTAPPLIC.

Solution
  • If this queue manager is not in a uniform cluster, the balance state is always NOTAPPLIC as nothing can be rebalanced.
  • In a uniform cluster, this means there has never been an application with this name connecting as movable. Information on this application is not distributed around the cluster.

    Use DIS APSTATUS(X) TYPE(LOCAL) where(MOVABLE eq NO) and investigate the IMMREASN field.

Symptom

The balanced state is NO.

Solution
  • Monitor this output over a period of time. If applications constantly connect and disconnect, this might be the appropriate answer, because the instances are not given a chance to rebalance.
  • Use DIS APSTATUS(X) TYPE(QMGR) to investigate the number of instances on each queue manager. The output indicates which queue managers have a surplus or a deficit of instances; continue the investigation on those queue managers.
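
To make "surplus" and "deficit" concrete, the following sketch compares per-queue-manager instance counts against an even share. The counts are made-up values standing in for the numbers reported by DIS APSTATUS(X) TYPE(QMGR), and the even-share calculation is an assumption for illustration only, not a description of how the queue manager itself decides what to rebalance.

```python
from typing import Dict


def deviation_from_even_share(counts: Dict[str, int]) -> Dict[str, float]:
    """Return each queue manager's instance count minus the even share.

    A positive value suggests a surplus, a negative value a deficit.
    """
    if not counts:
        return {}
    share = sum(counts.values()) / len(counts)
    return {qmgr: count - share for qmgr, count in counts.items()}


# Hypothetical instance counts for one application on three queue managers.
counts = {"QM1": 7, "QM2": 2, "QM3": 3}

for qmgr, delta in deviation_from_even_share(counts).items():
    label = "surplus" if delta > 0 else "deficit" if delta < 0 else "even"
    print(f"{qmgr}: {delta:+.1f} ({label})")
```

With these sample counts the even share is 4 instances, so QM1 shows a surplus of 3 while QM2 and QM3 show deficits of 2 and 1.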

DIS APSTATUS(X) TYPE(QMGR)

Symptom

Not all queue managers in the uniform cluster are listed.

Solution
  • Verify the BALSTATE is not NOTAPPLIC, as that prevents information from being distributed around the uniform cluster.

    Use DIS APSTATUS(X) TYPE(LOCAL) to look at the IMMREASN field.

  • Verify any missing queue managers are running.
  • Verify the state of clustering, and that channels are running between this queue manager and the missing queue manager.

Symptom

A queue manager is listed as ACTIVE(NO).

Solution
  • Verify any inactive queue managers are running.
  • Verify the state of clustering, and that channels are running between this queue manager and the inactive queue manager.

Symptom

A queue manager has some immovable instances of an application.

Solution

On that queue manager in the uniform cluster, use DIS APSTATUS(X) TYPE(LOCAL) where(MOVABLE eq NO) and investigate the IMMREASN field.
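
When a queue manager reports many immovable instances, it can help to tally the IMMREASN values before reading each entry. The following sketch counts them in captured command output. The sample text is hand-written for illustration and is not real runmqsc output, and the helper name tally_immreasn is hypothetical. It assumes that each instance's reason appears in the output in the form IMMREASN(value).

```python
import re
from collections import Counter


def tally_immreasn(display_output: str) -> Counter:
    """Count how often each IMMREASN value appears in captured display output."""
    return Counter(re.findall(r"IMMREASN\(\s*(\w+)\s*\)", display_output))


# Hand-written sample, shaped like attribute(value) pairs; not real output.
sample = """
   APPLTAG(MYAPP)   IMMREASN(NOTCLIENT)
   APPLTAG(MYAPP)   IMMREASN(NOTRECONN)
   APPLTAG(MYAPP)   IMMREASN(NOTCLIENT)
"""

for reason, count in tally_immreasn(sample).most_common():
    print(f"{reason}: {count}")
```

The most frequent reason is usually the best place to start the conversation with the application owner.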

Symptom

The BALSTATE is unexpected.

Solution
  • Monitor this over time, as the BALSTATE is the state when the queue manager last attempted to rebalance applications, which only happens periodically.
  • Are applications continually connecting and disconnecting? If so, this might prevent the application ever being rebalanced into a stable state.
  • If BALSTATE stays unbalanced, look at the error logs on the queue managers that are BALSTATE(HIGH) and BALSTATE(LOW), which should indicate whether they are requesting application instances, and how many were permitted to be moved.
  • Use DIS APSTATUS(X) TYPE(LOCAL) where(IMMCOUNT gt 1) to see whether there are instances that are failing to move when requested.

DIS APSTATUS(X) TYPE(LOCAL)

This display command can be used to diagnose many issues which might cause an application not to rebalance as expected. Firstly, check the IMMDATE and IMMTIME fields to see if the application is only temporarily marked as immovable.

Other reasons for applications failing to rebalance are indicated by the IMMREASN field. The following table shows each cause (IMMREASN) and the action needed. Note that in most cases these causes need to be reviewed with the application developer or owner concerned.

IMMREASN    Action

NOTCLIENT
    The application is using server bindings and therefore cannot be moved to another queue manager. In most cases, applications can be modified to use a client connection. However, depending on the language and library versions in use, this might require rebuilding the application.

NOTRECONN
    The application connection is not marked as 'reconnectable'. This might be a deliberate decision in the application code, because its design requires that all messages flow to and from a single queue manager, or it might indicate a configuration error or oversight (for example, very old client libraries do not support reconnection).

    Note that RECONNECT_QMGR is not sufficient for application balancing to work, because it permits reconnection only to the 'same' queue manager instance. To see the connection options in use by an application instance, issue DIS CONN(*) TYPE(CONN) WHERE(CONNTAG eq 'xxx') CONNOPTS, where xxx is the CONNTAG from the DIS APSTATUS output.

APPNAMECHG
    The application is making multiple connections on the same TCP connection, but with differing application names. This means that application instances cannot be reliably separated, so rebalancing is prevented. If this issue occurs, the application code is probably explicitly overriding the application name in the MQCONNX call.

MOVING
    This should be a temporary status only, because it indicates that the application instance has already been identified for rebalancing.

[MQ 9.3.0 Jun 2022] INTRANS
    The application is currently in a transaction, so rebalancing is avoiding interruption (rollback). If the application developer or deployer is not concerned about excessive rollbacks for this application, and would prefer to prioritize maintaining a consistent balance of application connections, this constraint can be ignored either in application code or in configuration settings; see BalanceOptions for more information.

    Alternatively, use the Timeout field to change how long the queue manager permits transactions to continue before it considers interrupting them anyway.

[MQ 9.3.0 Jun 2022] REPLY
    This application has been marked as type 'request reply' and is waiting for a response to a previously dispatched request message. If you do not want to wait for responses, marking the application as type 'SIMPLE' prevents this wait.

    Alternatively, you can configure the extent of the waiting period using either the message expiry of the application request messages, or the Timeout. Note that it is often best to configure both appropriately, so that Timeout does not unexpectedly truncate wait times for responses.
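
For the NOTRECONN case, the CONNOPTS value returned by DIS CONN can be checked mechanically. The following sketch extracts CONNOPTS from captured output and reports the reconnection scope. The sample text is hand-written and is not real command output. The sketch assumes that the reconnect options appear in CONNOPTS under the MQI constant names MQCNO_RECONNECT and MQCNO_RECONNECT_Q_MGR, and that no option name is split across a line wrap. The helper names are hypothetical.

```python
import re
from typing import Set


def connopts(display_output: str) -> Set[str]:
    """Extract the comma-separated CONNOPTS values from captured DIS CONN output."""
    match = re.search(r"CONNOPTS\(([^)]*)\)", display_output)
    if match is None:
        return set()
    return {opt.strip() for opt in match.group(1).split(",") if opt.strip()}


def reconnect_scope(options: Set[str]) -> str:
    """Classify a connection by the reconnect option it was opened with."""
    if "MQCNO_RECONNECT" in options:
        return "any queue manager"
    if "MQCNO_RECONNECT_Q_MGR" in options:
        # Not sufficient for application balancing: same queue manager only.
        return "same queue manager only"
    return "not reconnectable"


# Hand-written sample; the option list is illustrative, not real output.
sample = """
   TYPE(CONN)
   CONNOPTS(MQCNO_HANDLE_SHARE_BLOCK,MQCNO_SHARED_BINDING,MQCNO_RECONNECT_Q_MGR)
"""

print(reconnect_scope(connopts(sample)))
```

Only the first result, "any queue manager", describes a connection that application balancing can move. The other two results point back to the application code or client configuration.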