Operation of automatic client reroute for connections to the Db2 for z/OS server from an application other than a Java application

The automatic client reroute (ACR) feature provides failover support when an application loses connectivity to a member of a Db2 for z/OS data sharing group. The ACR feature enables an application to recover from a connection failure by reconnecting through an available member of the Db2 for z/OS data sharing group.

When you enable the Sysplex workload balancing feature, the ACR feature is enabled by default. Client support for the ACR feature is available in the IBM® data server clients with a Db2 Connect license.

Non-Java application connected to a data sharing group with Sysplex enabled

The following example demonstrates ACR operation when an application other than a Java™ application connects to a Db2 for z/OS data sharing group with the Sysplex workload balancing feature enabled:
  1. The data server returns the following items as part of the response to a commit request from the client:
    • An indicator that specifies whether transports can be reused. Transports can be reused if there are no session resources remaining, such as held cursors.
    • The SET statements that the client can use to replay the connection state during transport reuse. The SET statements are also known as special registers.
    • The session global variables, if a connection is made to Db2 for z/OS Version 11 in new function mode (NFM).
  2. If the first SQL statement in a transaction fails due to a connection loss, the following behavior occurs:
    • No error is reported to the application.
    • The failing SQL statement is run again on the next available member in the returned server list.
    • The SET statements that are associated with the connection are replayed to restore the connection state.
  3. The following example applies to only Db2 for z/OS Version 11 in NFM. If the detectReadonlyTxn keyword is set to true and the connection fails in a transaction, the following behavior occurs:
    • If the completed statements in the transaction before connection failure are read-only and complete result sets from the transaction up to the point of connection failure are available to the client, the following behavior occurs:
      • No error is reported to the application.
      • The SET statements that are associated with the connection are replayed to restore the connection state. The special registers and session global variables are reissued.
      • The failed SQL statement is run again in a new connection that is made to the next available member, which is based on the returned server list. The failed SQL statement does not have to be read-only, but completed statements that are issued before the failure must be read-only. Also, the transaction cannot use the repeatable read (RR) or read stability (RS) isolation level.
    • If a transaction up to the point of connection failure is not read-only, the following behavior occurs:
      • The transaction is rolled back.
      • The application is reconnected to the Db2 data sharing group.
      • The SET statements (special registers) that are associated with the connection are reissued to restore the connection state.
      • The SQL30108N error is returned to the application to notify it of the rollback and successful reconnection. The application must include code to handle the message and retry the failed transaction.
  4. If an SQL statement that is not the first SQL statement in a transaction fails and transports can be reused, the following behavior occurs:
    • The transaction is rolled back.
    • The application is reconnected to the data server.
    • The SET statements (special registers) that are associated with the connection are replayed to restore the connection state.
    • The SQL30108N error is returned to the application to notify it of the rollback and successful reconnection. The application must include code to retry the failed transaction.
    The transports can be reused if no session resources remain open.
  5. If an SQL statement that is not the first SQL statement in a transaction fails and transports cannot be reused, the following behavior occurs:
    • The special registers and global variables are reset to the values that were in effect at the last commit point.
    • The SQL30081N error is returned to the application to notify it that reconnection was unsuccessful. The application must reconnect to the data server, reestablish the connection state, and retry the failed transaction.
  6. If all the data, including the end of file (EOF) character, is returned in the first query block or in a subsequent fetch request, the CLI driver can perform seamless failover when you issue a COMMIT or ROLLBACK statement after the member becomes unreachable. If you declare any session resources in a transaction, seamless ACR connections are changed into non-seamless ACR connections. The session resources include the following ones:
    • Open cursors with cursor-hold behavior
    • Open cursors with locators
    • Declared global temporary tables
    • Accelerated queries
    You can configure the FetchBufferSize keyword to ensure that the size of the result set that the CLI driver prefetches is sufficient to include the EOF character in the first query block. See the Related reference section for further details on the FetchBufferSize parameter.
  7. When using the Sysplex distributed IP address, the connection is always established to the Sysplex distributed IP. After a successful connection, the server list will be returned to the driver. If a member is not available from the server list during the transaction, a retry attempt will be made to the Sysplex distributed IP to get the next available member. If another member is not available, the driver will throw connection error -30081. With the Sysplex distributed IP address, there is no need for other keywords (except enableWLB) to fine-tune Sysplex workload balancing from the client side.

Behavior of a non-seamless ACR feature

The behavior of the non-seamless ACR feature changes when you connect to a Db2 for z/OS data sharing group for which you configure an alternate group and you enable the Sysplex workload balancing feature:
  • The CLI driver does not associate a transport with the connection until the application issues an SQL statement or a SET statement. When the application issues an SQL statement or a SET statement, the CLI driver allocates a transport and sets special registers to the values that were in effect at the time of the last commit point.
  • The SQL30108N error with reason code 2 is returned to the application if the CLI driver fails to reconnect to members of the primary group and must switch to the alternate group. The error is returned a second time with reason code 4 if you specify the alternate group in the db2dsdriver.cfg file by using the alternategroup keyword and the enableAlternateGroupSeamlessAcr keyword is set to FALSE. The SQL30108N error with reason code 2 is returned when the existing connection to a member in the current group fails. The SQL30108N error with reason code 4 is returned when all the connection attempts to all members in the existing primary group fail. The application can then resubmit the SET statement or the SQL statement again for the second time if reconnection to the alternate group is warranted. The CLI driver tracks the failed member on the same connection handle when the SQL30108N error is returned to avoid resubmitting the statement to the failed member.
    Attention: The SQL30108N error is not returned twice in the following scenarios:
    • When you use the Db2 Connect server as a gateway
    • When you enable the ACR feature without enabling the Sysplex workload balancing feature
When connecting to a Db2 for z/OS data sharing group, you should not disable the seamless ACR feature and the Sysplex workload balancing feature unless directed by IBM Support.

Non-seamless ACR operation

The following example demonstrates non-seamless ACR operation:
  1. As part of the response to a commit request from the client, the data server returns the SET statements (special registers) that the client replays when the transport is associated at the next SQL statement submission or the SET statement submission based on the WLB routing decisions.
  2. If the SQL statement in a transaction fails, the ACR connection error (SQL30108N, with reason code 2) is returned to the application, but the CLI driver does not attempt to allocate a new transport. Any session resources that were created in the failed connection are dropped. The session resources include the following ones:
    • Open cursors with cursor-hold behavior
    • Open cursors with locators
    • Declared global temporary tables
    • Accelerated queries
  3. If the application submits the SET statement or the SQL statement, the CLI driver attempts to obtain a new transport to connect to the next available member in the same group.
  4. If the enableAlternateGroupSeamlessACR keyword is set to FALSE and the CLI driver cannot reconnect to any of the members in the primary group, a second ACR connection error (SQL30108N with reason code 4) is returned to the application.
  5. If the application submits the SET statement or the SQL statement again for the second time, the CLI driver attempts to obtain a new transport to connect to the next available member in the alternate group.
  6. When the CLI driver successfully reconnects to a new member, the SET statements (special registers) that were returned from the last commit point are replayed, followed by the SET statement or SQL statement that was submitted by the application. You must re-create any required session resources that existed in a previous failed connection. If the reconnection is not successful, a communication error (SQL30081N) is returned to the application.

In the non-seamless ACR environment, the receive timeout event triggers the ACR connection error (SQL30108N). The receive timeout event occurs when the ReceiveTimeout keyword value is reached

If you set the QueryTimeout keyword in the non-seamless ACR environment, the following behaviors occur:
  • If the connection failure occurs before the query timeout event, the ACR connection error (SQL30108N, with reason code 2 or 4 and failure code 1, 2, or 3) is returned to the application.
  • If the Interrupt keyword is set to the value of 2 and the query timeout event occurs, the ACR connection error (SQL30108N, with failure code 4 and error code 1) is returned to the application.

When the Interrupt keyword is set to the default value of 2 and the SQLCancel() API is explicitly called from the application while an SQL statement is being executed, the ACR connection error (SQL30108N, with failure code 4 and error code 2) is returned to the application.