Recovery manager messages (CSQR...)

CSQR001I: RESTART INITIATED
Explanation: This message delimits the beginning of the restart process within startup. The phases of restart are about to begin. These phases are necessary to restore the operational environment to that which existed at the time of the previous termination and to perform any recovery actions that might be necessary to return IBM® MQ-managed resources to a consistent state.

CSQR002I: RESTART COMPLETED
Explanation: This message delimits the completion of the restart process within startup.
System action: Startup continues.

CSQR003I: RESTART - PRIOR CHECKPOINT RBA=rba
Explanation: The message indicates the first phase of the restart process is in progress and identifies the log positioning RBA of the checkpoint from which the restart process will obtain its initial recovery information.
System action: Restart processing continues.

CSQR004I

RESTART - UR COUNTS - IN COMMIT=nnnn, INDOUBT=nnnn, INFLIGHT=nnnn, IN BACKOUT=nnnn

Explanation

This message indicates the completion of the first phase of the restart process. The counts give the number of units of recovery whose execution state at the previous queue manager termination requires some recovery action during this restart to ensure MQ resource consistency. The counts might indicate how long the remaining two phases of restart (forward and backward recovery) will take.

The IN COMMIT count specifies the number that had started, but not completed, phase-2 of the commit process. These must undergo forward recovery to complete the commit process.

The INDOUBT count specifies the number that were interrupted between phase-1 and phase-2 of the commit process. These must undergo forward recovery to ensure that resources modified by them are unavailable until their INDOUBT status is resolved.

The INFLIGHT count specifies the number that neither completed phase-1 of the commit process nor began the process of backing out. These must undergo backward recovery to restore resources modified by them to their previous consistent state.

The IN BACKOUT count specifies the number that were in the process of backing out. These must undergo backward recovery to restore resources modified by them to their previous consistent state.

System action

Restart processing continues.
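
The mapping from the four counts to the two remaining restart phases can be sketched in a few lines of Python. This is a minimal illustration only: the sample message text and its counts are hypothetical, and the function name summarize_ur_counts is not part of IBM MQ.

```python
import re

# Restart phase that handles each UR state, as described in the explanation above.
PHASE_BY_STATE = {
    "IN COMMIT": "forward",
    "INDOUBT": "forward",
    "INFLIGHT": "backward",
    "IN BACKOUT": "backward",
}

# Matches each "STATE=nnnn" pair in the message text.
COUNT_PATTERN = re.compile(r"(IN COMMIT|INDOUBT|INFLIGHT|IN BACKOUT)=\s*(\d+)")


def summarize_ur_counts(message: str) -> dict:
    """Return how many URs await forward and backward recovery."""
    totals = {"forward": 0, "backward": 0}
    for state, count in COUNT_PATTERN.findall(message):
        totals[PHASE_BY_STATE[state]] += int(count)
    return totals


# Hypothetical message text; real counts come from the queue manager job log.
sample = ("CSQR004I RESTART - UR COUNTS - IN COMMIT=2, INDOUBT=1, "
          "INFLIGHT=3, IN BACKOUT=0")
print(summarize_ur_counts(sample))  # {'forward': 3, 'backward': 3}
```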

CSQR005I: RESTART - FORWARD RECOVERY COMPLETE - IN COMMIT=nnnn, INDOUBT=nnnn
Explanation: The message indicates the completion of the forward recovery restart phase. The counts indicate the number of units of recovery with recovery actions that could not be completed during the phase. Typically, those in an IN COMMIT state remain because the recovery actions of some subcomponents have not been completed. Those units of recovery in an INDOUBT state will remain until connection is made with the subsystem that acts as their commit coordinator.
System action: Restart processing continues.

CSQR006I: RESTART - BACKWARD RECOVERY COMPLETE - INFLIGHT=nnnn, IN BACKOUT=nnnn
Explanation: The message indicates the completion of the backward recovery restart phase. The counts indicate the number of units of recovery with recovery actions that could not be completed during the phase. Typically, those in either state remain because the recovery actions of some subcomponents have not been completed.
System action: Restart processing continues.

CSQR007I

UR STATUS

Explanation

This message precedes a table showing the status of units of recovery (URs) after each restart phase. The message and the table accompany the CSQR004I, CSQR005I, or CSQR006I message issued at the end of each phase. At the end of the first phase, the table shows the status of any URs that require processing. At the end of the second (forward recovery) and third (backout) phases, it shows the status of only those URs that needed processing but were not processed. The table helps to identify the URs that were active when the queue manager stopped, and to determine the log scope required to restart.

The format of the table is:


  T  CON-ID     THREAD-XREF     S   URID     TIME

The columns contain the following information:

T

Connection type. The values can be:

B: Batch: From an application using a batch connection
R: RRS: From an RRS-coordinated application using a batch connection
C: CICS®: From CICS
I: IMS: From IMS
S: System: From an internal function of the queue manager or from the channel initiator.

CON-ID

Connection identifier for related URs. Batch connections are not related to any other connection. Subsystem connections with the same identifier indicate URs that originated from the same subsystem.

THREAD-XREF

The recovery thread cross-reference identifier associated with the thread; see Connecting from the IMS control region for more information.

S

Restart status of the UR. When the queue manager stopped, the UR was in one of these situations:

B: INBACKOUT: the UR was in the must-complete phase of backout, and is yet to be completed
C: INCOMMIT: the UR was in the must-complete phase of commit, and is yet to be completed
D: INDOUBT: the UR had completed the first phase of commit, but IBM MQ had not received the second phase instruction (the UR must be remembered so that it can be resolved when the owning subsystem reattaches)
F: INFLIGHT: the UR had not completed the first phase of commit, and will be backed out.

URID

UR identifier, the log RBA of the beginning of this unit of recovery. It is the earliest RBA required to process the UR during restart.

TIME

The time the UR was created, in the format yyyymmdd hhmmss. It is approximately the time of the first IBM MQ API call of the application or the first IBM MQ API call following a commit point.
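
As an illustration of how the columns fit together, the following Python sketch decodes one table row. It assumes that every column is present and separated by white space, that the URID is printed in hexadecimal, and that TIME appears exactly as yyyymmdd hhmmss. The sample row and the name parse_ur_row are hypothetical, so adjust the parsing if your job log lays rows out differently (for example, with a blank THREAD-XREF).

```python
from datetime import datetime

# Code tables taken from the column descriptions above.
CONNECTION_TYPES = {"B": "Batch", "R": "RRS", "C": "CICS", "I": "IMS", "S": "System"}
RESTART_STATES = {"B": "INBACKOUT", "C": "INCOMMIT", "D": "INDOUBT", "F": "INFLIGHT"}


def parse_ur_row(row: str) -> dict:
    """Decode one UR status row into named fields."""
    fields = row.split()
    if len(fields) != 7:  # T, CON-ID, THREAD-XREF, S, URID, date, time
        raise ValueError(f"expected 7 fields, found {len(fields)}: {row!r}")
    conn_type, con_id, thread_xref, state, urid, date_part, time_part = fields
    return {
        "connection_type": CONNECTION_TYPES[conn_type],
        "con_id": con_id,
        "thread_xref": thread_xref,
        "state": RESTART_STATES[state],
        # Assumption: the URID is a log RBA printed as hexadecimal digits.
        "urid_rba": int(urid, 16),
        "created": datetime.strptime(f"{date_part} {time_part}", "%Y%m%d %H%M%S"),
    }


# Hypothetical row, not taken from a real queue manager.
sample_row = "S CSQ1CHIN 0000000000000000 F 00000014F4D8 20240115 093042"
parsed = parse_ur_row(sample_row)
print(parsed["connection_type"], parsed["state"], parsed["created"])
```

Because the URID is the earliest RBA needed to process the UR, taking the minimum urid_rba across all rows gives a rough lower bound on the log scope required for restart.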

CSQR009E: NO STORAGE FOR UR STATUS TABLE, SIZE REQUESTED=xxxx, REASON CODE=yyyyyyyy
Explanation: There was not enough storage available during the creation of the recoverable UR (unit of recovery) display table.
System action: Restart continues but the status table is not displayed.
System programmer response: Increase the region size of the xxxxMSTR region before restarting the queue manager.

CSQR010E: ERROR IN UR STATUS TABLE SORT/TRANSLATE, ERROR LOCATION CODE=xxxx
Explanation: An internal error has occurred.
System action: Restart continues but the status table is not displayed.
System programmer response: Note the error code in the message and contact your IBM support center.

CSQR011E: ERROR IN UR STATUS TABLE DISPLAY, ERROR LOCATION CODE=xxxx
Explanation: An internal error has occurred.
System action: Restart continues but the status table is not displayed.
System programmer response: Note the error code in the message and contact your IBM support center.

CSQR015E: CONDITIONAL RESTART CHECKPOINT RBA rba NOT FOUND
Explanation: The checkpoint RBA in the conditional restart control record, which is deduced from the end RBA or LRSN value that was specified, is not available. This is probably because the log data sets available for use at restart do not include that end RBA or LRSN.
System action: Restart ends abnormally with reason code X'00D99001' and the queue manager terminates.
System programmer response: Run the change log inventory utility (CSQJU003) specifying an ENDRBA or ENDLRSN value on the CRESTART control statement that is in the log data sets that are to be used for restarting the queue manager.

CSQR020I: OLD UOW FOUND
Explanation: During restart, a unit of work was found that predates the oldest active log. Information about the unit of work is displayed in a table in the same format as in message CSQR007I.

Old units of work can lead to extended restart times, because restart processing needs to read archive logs to process the unit of work correctly. IBM MQ offers the opportunity to avoid this delay by allowing old units of work to be force committed.
Note: Force committing a unit of work can break the transactional integrity of updates between IBM MQ and the other resource managers involved in the original unit of work described in this message.
System action: Message CSQR021D is issued and the operator's reply is awaited.

CSQR021D: REPLY Y TO COMMIT OR N TO CONTINUE
Explanation: An old unit of work was found, as indicated in the preceding CSQR020I message.
System action: The queue manager waits for the operator's reply.

CSQR022I: OLD UOW COMMITTED, URID=urid
Explanation: This message is sent if the operator answers 'Y' to message CSQR021D.
System action: The indicated unit of work is committed.

CSQR023I

OLD UOW UNCHANGED, URID=urid

Explanation

This message is sent if the operator answers 'N' to message CSQR021D.

CSQR023I is also sent when an old unit of work that is already in the 'in-backout' state is identified. Units of work in the 'in-backout' state are ineligible for force commit processing, because force committing them can lead to a queue becoming unusable. For such units of work, message CSQR021D is not issued, and no choice is possible.

System action

The indicated unit of work is left for handling by the normal restart recovery process.
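
The handling of an old unit of work described in messages CSQR020I through CSQR023I, together with the reply check reported by CSQR029I, can be summarized as a small decision function. This is a sketch of the documented behavior only, not queue manager code. The name old_uow_outcome is hypothetical, and folding the reply to uppercase is an assumption.

```python
def old_uow_outcome(restart_state: str, operator_reply: str = "") -> str:
    """Return the ID of the message that reports the outcome for an old UOW.

    restart_state uses the S column codes of the CSQR007I table
    (B, C, D, or F).
    """
    if restart_state == "B":
        # In-backout units of work are ineligible for force commit:
        # CSQR021D is not issued and the UOW is left unchanged.
        return "CSQR023I"
    reply = operator_reply.strip().upper()  # assumption: case is not significant
    if reply == "Y":
        return "CSQR022I"  # old UOW committed
    if reply == "N":
        return "CSQR023I"  # old UOW unchanged
    return "CSQR029I"      # invalid response; CSQR021D is repeated


print(old_uow_outcome("F", "Y"))  # CSQR022I
print(old_uow_outcome("B"))       # CSQR023I
```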

CSQR026I: Long-running UOW shunted to RBA=rba, URID=urid connection name=name
Explanation: During checkpoint processing, an uncommitted unit of recovery was encountered that has been active for at least 3 checkpoints. The associated log records have been rewritten ('shunted') to a later point in the log, at RBA rba. The unit of recovery identifier urid together with the connection name name identify the associated thread.
System action: Processing continues.
System programmer response: Uncommitted units of recovery can lead to difficulties later, so consult with the application programmer to determine if there is a problem that is preventing the unit of recovery from being committed, and to ensure that the application commits work frequently enough.
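
To find which applications to follow up with, it can help to count CSQR026I occurrences by connection name. The sketch below assumes the job log has been saved as text lines and that the message text matches the format shown above, with RBA and URID values printed in hexadecimal. The sample lines, connection names, and the function name shunts_by_connection are hypothetical.

```python
import re
from collections import Counter

# Extracts the RBA, URID, and connection name from a CSQR026I line.
SHUNT_PATTERN = re.compile(
    r"CSQR026I.*?RBA=(?P<rba>[0-9A-Fa-f]+),\s*URID=(?P<urid>[0-9A-Fa-f]+)\s+"
    r"connection name=(?P<name>\S+)"
)


def shunts_by_connection(log_lines) -> Counter:
    """Count shunted long-running units of work per connection name."""
    counts = Counter()
    for line in log_lines:
        match = SHUNT_PATTERN.search(line)
        if match:
            counts[match.group("name")] += 1
    return counts


# Hypothetical job log excerpt.
sample_log = [
    "CSQR026I +CSQ1 Long-running UOW shunted to RBA=0000A1B2C3D4, "
    "URID=00000014F4D8 connection name=PAYROLL1",
    "CSQR030I +CSQ1 Forward recovery log range from RBA=00001000 to RBA=00005000",
    "CSQR026I +CSQ1 Long-running UOW shunted to RBA=0000A1B2D000, "
    "URID=00000014F4D8 connection name=PAYROLL1",
    "CSQR026I +CSQ1 Long-running UOW shunted to RBA=0000A1B2E000, "
    "URID=000000150000 connection name=BATCHJOB",
]
print(shunts_by_connection(sample_log).most_common())
```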

CSQR027I

Long-running UOW shunting failed, URID=urid connection name=name

Explanation

During checkpoint processing, an uncommitted unit of recovery was encountered that has been active for at least 3 checkpoints. However, the associated log records could not be rewritten ('shunted') to a later point in the log. The unit of recovery identifier urid together with the connection name name identify the associated thread.

System action

The unit of recovery is not shunted, and will not participate in any future log shunting.

System programmer response

The most likely cause is insufficient active log data sets being available, in which case you should add more log data sets for the queue manager to use. Use the DISPLAY LOG command or the print log map utility (CSQJU004) to determine how many log data sets there are and what their status is.

Uncommitted units of recovery can lead to difficulties later, so consult with the application programmer to determine if there is a problem that is preventing the unit of recovery from being committed, and to ensure that the application commits work frequently enough.

CSQR029I: INVALID RESPONSE - NOT Y OR N
Explanation: The operator did not respond correctly to the reply message CSQR021D. Either 'Y' or 'N' must be entered.
System action: The original message is repeated.

CSQR030I: Forward recovery log range from RBA=from-rba to RBA=to-rba
Explanation: This indicates the log range that must be read to perform forward recovery during restart.
System action: Restart processing continues.

CSQR031I

Reading log forwards, RBA=rba

Explanation

This is issued periodically during restart recovery processing to show the progress of the forward recovery phase and the current status rebuild phase. For the forward recovery phase the log range that needs to be read is shown in the preceding CSQR030I message.

For the current status rebuild phase, the starting log RBA is shown in the preceding CSQR003I message and the end log RBA is shown in the preceding CSQJ099I message. The RBA in this message is the current position in the recovery log during whichever of these two phases is in progress.

System action

Restart processing continues.

CSQR032I: Backward recovery log range from RBA=from-rba to RBA=to-rba
Explanation: This indicates the log range that must be read to perform backward recovery during restart.
System action: Restart processing continues.

CSQR033I: Reading log backwards, RBA=rba
Explanation: This is issued periodically during restart recovery processing to show the progress of the backward recovery phase. The log range that needs to be read is shown in the preceding CSQR032I message.
System action: Restart processing continues.
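
The range messages (CSQR030I, CSQR032I) and the periodic progress messages (CSQR031I, CSQR033I) can be combined to estimate how far a recovery phase has progressed. The following sketch assumes that RBA values are printed in hexadecimal and that the current RBA lies between the two ends of the range. The function name recovery_progress and the sample values are hypothetical.

```python
def recovery_progress(from_rba: str, to_rba: str, current_rba: str) -> float:
    """Estimate the fraction (0.0 to 1.0) of a log range already read.

    Works in either direction: for backward recovery, pass the range in
    the order it is read, so from_rba may be higher than to_rba.
    """
    start, end, current = (int(v, 16) for v in (from_rba, to_rba, current_rba))
    if start == end:
        return 1.0  # empty range: nothing to read
    fraction = (current - start) / (end - start)
    return min(1.0, max(0.0, fraction))  # clamp values outside the range


# Forward recovery: range from CSQR030I, current position from CSQR031I.
print(recovery_progress("1000", "5000", "3000"))  # 0.5
# Backward recovery: range from CSQR032I, current position from CSQR033I.
print(recovery_progress("5000", "1000", "4000"))  # 0.25
```

The estimate is linear in log distance, so it is only a rough guide to elapsed time: reading archive logs is typically slower than reading active logs.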

CSQR034I

Backward migration detected

Explanation

During queue manager restart, it was detected that one or more of the connected page sets have been used at a higher version of the queue manager code.

System action

The queue manager will automatically perform special processing during restart to alter any messages stored on those page sets so they can be read by the current version of the queue manager. This special processing is dependent on there being no unresolved units of work found at the end of restart, so you might be prompted by way of further messages during restart to force commit these.

Restart processing continues.