[z/OS]

Recovery manager messages (CSQR...)

CSQR001I
RESTART INITIATED
Explanation

This message delimits the beginning of the restart process within startup. The phases of restart are about to begin. These phases are necessary to restore the operational environment to that which existed at the time of the previous termination and to perform any recovery actions that might be necessary to return IBM® MQ-managed resources to a consistent state.

CSQR002I
RESTART COMPLETED
Explanation

This message delimits the completion of the restart process within startup.

System action

Startup continues.

CSQR003I
RESTART - PRIOR CHECKPOINT RBA=rba
Explanation

The message indicates the first phase of the restart process is in progress and identifies the log positioning RBA of the checkpoint from which the restart process will obtain its initial recovery information.

System action

Restart processing continues.

CSQR004I
RESTART - UR COUNTS - IN COMMIT=nnnn, INDOUBT=nnnn, INFLIGHT=nnnn, IN BACKOUT=nnnn
Explanation

This message indicates the completion of the first phase of the restart process. The counts give the number of units of recovery whose execution state at the previous queue manager termination means that some recovery action must be performed during this restart to ensure IBM MQ resource consistency. The counts might also indicate how long the remaining two phases of restart (forward and backward recovery) will take.

The IN COMMIT count specifies the number that had started, but not completed, phase-2 of the commit process. These must undergo forward recovery to complete the commit process.

The INDOUBT count specifies the number that were interrupted between phase-1 and phase-2 of the commit process. These must undergo forward recovery to ensure that resources modified by them are unavailable until their INDOUBT status is resolved.

The INFLIGHT count specifies the number that neither completed phase-1 of the commit process nor began the process of backing out. These must undergo backward recovery to restore resources modified by them to their previous consistent state.

The IN BACKOUT count specifies the number that were in the process of backing out. These must undergo backward recovery to restore resources modified by them to their previous consistent state.

System action

Restart processing continues.
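
The relationship between the four counts and the two recovery directions can be sketched in a few lines of Python. This is only an illustration: the function name, the regular expression, and the sample message are assumptions made for this sketch, and only the message layout comes from the CSQR004I text above.

```python
import re

# Layout taken from the CSQR004I message text; all names here are illustrative.
CSQR004I_PATTERN = re.compile(
    r"IN COMMIT=\s*(\d+),\s*INDOUBT=\s*(\d+),"
    r"\s*INFLIGHT=\s*(\d+),\s*IN BACKOUT=\s*(\d+)"
)

def summarize_ur_counts(message):
    """Group the CSQR004I counts by the recovery direction each state needs."""
    match = CSQR004I_PATTERN.search(message)
    if match is None:
        raise ValueError("not a CSQR004I UR COUNTS message")
    in_commit, indoubt, inflight, in_backout = (int(g) for g in match.groups())
    return {
        # IN COMMIT and INDOUBT units of recovery undergo forward recovery.
        "forward": in_commit + indoubt,
        # INFLIGHT and IN BACKOUT units of recovery undergo backward recovery.
        "backward": inflight + in_backout,
    }

# Made-up sample message, not taken from a real job log.
sample = ("CSQR004I RESTART - UR COUNTS - IN COMMIT=0001, INDOUBT=0002, "
          "INFLIGHT=0003, IN BACKOUT=0000")
print(summarize_ur_counts(sample))
```

A large "forward" or "backward" total suggests that the corresponding later phase has more work to do.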

CSQR005I
RESTART - FORWARD RECOVERY COMPLETE - IN COMMIT=nnnn, INDOUBT=nnnn
Explanation

The message indicates the completion of the forward recovery restart phase. The counts indicate the number of units of recovery with recovery actions that could not be completed during the phase. Typically, those in an IN COMMIT state remain because the recovery actions of some subcomponents have not been completed. Those units of recovery in an INDOUBT state will remain until connection is made with the subsystem that acts as their commit coordinator.

System action

Restart processing continues.

CSQR006I
RESTART - BACKWARD RECOVERY COMPLETE - INFLIGHT=nnnn, IN BACKOUT=nnnn
Explanation

The message indicates the completion of the backward recovery restart phase. The counts indicate the number of units of recovery with recovery actions that could not be completed during the phase. Typically, those in either state remain because the recovery actions of some subcomponents have not been completed.

System action

Restart processing continues.
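
A monitoring script might flag any units of recovery that remain after the forward or backward recovery phase. The following Python sketch assumes the message layouts of CSQR005I and CSQR006I shown above; the function name and sample messages are hypothetical.

```python
import re

# Matches "STATE=count" pairs; optional blanks after "=" are tolerated.
COUNT_PATTERN = re.compile(r"(IN COMMIT|INDOUBT|INFLIGHT|IN BACKOUT)=\s*(\d+)")

def remaining_urs(message):
    """Return only the states whose count is still nonzero after a phase."""
    return {
        state: int(count)
        for state, count in COUNT_PATTERN.findall(message)
        if int(count) > 0
    }

# Made-up sample messages for illustration.
forward_done = "CSQR005I RESTART - FORWARD RECOVERY COMPLETE - IN COMMIT=0000, INDOUBT=0002"
backward_done = "CSQR006I RESTART - BACKWARD RECOVERY COMPLETE - INFLIGHT=0000, IN BACKOUT=0000"

print(remaining_urs(forward_done))
print(remaining_urs(backward_done))
```

An empty result means that no units of recovery were left over from that phase.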

CSQR007I
UR STATUS
Explanation

This message precedes a table showing the status of units of recovery (URs) after each restart phase. The message and the table will accompany the CSQR004I, CSQR005I, or CSQR006I message after each nested phase. At the end of the first phase, it shows the status of any URs that require processing. At the end of the second (forward recovery) and third (backout) phases, it shows the status of only those URs which needed processing but were not processed. The table helps to identify the URs that were active when the queue manager stopped, and to determine the log scope required to restart.

The format of the table is:

  T  CON-ID     THREAD-XREF     S   URID     TIME 

The columns contain the following information:

T
    Connection type. The values can be:
    B
        Batch: From an application using a batch connection
    R
        RRS: From an RRS-coordinated application using a batch connection
    C
        CICS®: From CICS
    I
        IMS: From IMS
    S
        System: From an internal function of the queue manager or from the channel initiator.
CON-ID
    Connection identifier for related URs. Batch connections are not related to any other connection. Subsystem connections with the same identifier indicate URs that originated from the same subsystem.
THREAD-XREF
    The recovery thread cross-reference identifier associated with the thread; see Connecting from the IMS control region for more information.
S
    Restart status of the UR. When the queue manager stopped, the UR was in one of these situations:
    B
        INBACKOUT: the UR was in the must-complete phase of backout, and is yet to be completed
    C
        INCOMMIT: the UR was in the must-complete phase of commit, and is yet to be completed
    D
        INDOUBT: the UR had completed the first phase of commit, but IBM MQ had not received the second phase instruction (the UR must be remembered so that it can be resolved when the owning subsystem reattaches)
    F
        INFLIGHT: the UR had not completed the first phase of commit, and will be backed out.
URID
    UR identifier, the log RBA of the beginning of this unit of recovery. It is the earliest RBA required to process the UR during restart.
TIME
    The time the UR was created, in the format yyyymmdd hhmmss. It is approximately the time of the first IBM MQ API call of the application or the first IBM MQ API call following a commit point.

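
The column codes described above can be decoded as in the following Python sketch. It assumes the six column values of one table row have already been separated, because the exact column widths are not given here. It also assumes the URID is printed as hexadecimal digits. The function name and the sample values are invented for illustration.

```python
from datetime import datetime

# Code tables copied from the column descriptions for CSQR007I.
CONNECTION_TYPES = {"B": "Batch", "R": "RRS", "C": "CICS", "I": "IMS", "S": "System"}
RESTART_STATUSES = {"B": "INBACKOUT", "C": "INCOMMIT", "D": "INDOUBT", "F": "INFLIGHT"}

def decode_ur_row(conn_type, con_id, thread_xref, status, urid, time_text):
    """Translate one UR status table row into descriptive values."""
    return {
        "connection_type": CONNECTION_TYPES[conn_type],
        "con_id": con_id,
        "thread_xref": thread_xref,
        "status": RESTART_STATUSES[status],
        # Assumption: the URID (a log RBA) is shown as hexadecimal digits.
        "urid": int(urid, 16),
        # TIME is documented as yyyymmdd hhmmss.
        "created": datetime.strptime(time_text, "%Y%m%d %H%M%S"),
    }

# Made-up row values, not taken from a real restart.
row = decode_ur_row("C", "CICSA", "0001", "D", "0000000A1B2C", "20240131 142530")
print(row["connection_type"], row["status"], row["created"].isoformat())
```

Sorting decoded rows by "urid" shows which unit of recovery reaches furthest back in the log.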
CSQR009E
NO STORAGE FOR UR STATUS TABLE, SIZE REQUESTED=xxxx, REASON CODE=yyyyyyyy
Explanation

There was not enough storage available during the creation of the recoverable UR (unit of recovery) display table.

System action

Restart continues but the status table is not displayed.

System programmer response

Increase the region size of the xxxxMSTR region before restarting the queue manager.

CSQR010E
ERROR IN UR STATUS TABLE SORT/TRANSLATE, ERROR LOCATION CODE=xxxx
Explanation

An internal error has occurred.

System action

Restart continues but the status table is not displayed.

System programmer response

Note the error code in the message and contact your IBM support center.

CSQR011E
ERROR IN UR STATUS TABLE DISPLAY, ERROR LOCATION CODE=xxxx
Explanation

An internal error has occurred.

System action

Restart continues but the status table is not displayed.

System programmer response

Note the error code in the message and contact your IBM support center.

CSQR015E
CONDITIONAL RESTART CHECKPOINT RBA rba NOT FOUND
Explanation

The checkpoint RBA in the conditional restart control record, which is deduced from the end RBA or LRSN value that was specified, is not available. This is probably because the log data sets available for use at restart do not include that end RBA or LRSN.

System action

Restart ends abnormally with reason code X'00D99001' and the queue manager terminates.

System programmer response

Run the change log inventory utility (CSQJU003), specifying on the CRESTART control statement an ENDRBA or ENDLRSN value that lies within the log data sets that are to be used for restarting the queue manager.
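
Before choosing an ENDRBA value, it can help to confirm that it falls inside one of the available log ranges. The Python sketch below illustrates that check under stated assumptions: the RBA values are hexadecimal strings, and the ranges are hypothetical values that you would normally read from the print log map utility (CSQJU004) output.

```python
def rba_is_covered(end_rba, log_ranges):
    """Return True if end_rba lies inside any (start_rba, end_rba) log range."""
    target = int(end_rba, 16)
    return any(
        int(start, 16) <= target <= int(end, 16)
        for start, end in log_ranges
    )

# Hypothetical log data set ranges, for illustration only.
available = [
    ("000000000000", "00000000FFFF"),
    ("000000010000", "00000001FFFF"),
]

print(rba_is_covered("000000012345", available))
print(rba_is_covered("000000020000", available))
```

A value that is not covered by any range is the situation that CSQR015E reports.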

CSQR020I
OLD UOW FOUND
Explanation

During restart, a unit of work was found that predates the oldest active log. Information about the unit of work is displayed in a table in the same format as in message CSQR007I.

Old units of work can lead to extended restart times, because restart processing needs to read archive logs to process the unit of work correctly. IBM MQ offers the opportunity to avoid this delay by allowing old units of work to be force committed.

Note: Force committing a unit of work can break the transactional integrity of updates between IBM MQ and the other resource managers involved in the original unit of work described in this message.

System action

Message CSQR021D is issued and the operator's reply is awaited.

CSQR021D
REPLY Y TO COMMIT OR N TO CONTINUE
Explanation

An old unit of work was found, as indicated in the preceding CSQR020I message.

System action

The queue manager waits for the operator's reply.

CSQR022I
OLD UOW COMMITTED, URID=urid
Explanation

This message is sent if the operator answers 'Y' to message CSQR021D.

System action

The indicated unit of work is committed.

CSQR023I
OLD UOW UNCHANGED, URID=urid
Explanation

This message is sent if the operator answers 'N' to message CSQR021D.

CSQR023I is also sent when an old unit of work that is already in the 'in-backout' state is identified. Units of work in the 'in-backout' state are ineligible for force commit processing, because force committing them can make a queue unusable. For such units of work, message CSQR021D is not issued, and no choice is possible.

System action

The indicated unit of work is left for handling by the normal restart recovery process.
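
The interaction between messages CSQR021D, CSQR022I, CSQR023I, and CSQR029I can be summarised as a small decision routine. The Python sketch below is a model of the documented behaviour only, not the queue manager's implementation. The function name is hypothetical, the status letter follows the S column of the CSQR007I table, and replies are compared exactly against 'Y' and 'N'.

```python
def resolve_old_uow(status, replies):
    """Return (outcome message identifier, number of CSQR021D prompts issued)."""
    prompts = 0
    if status == "B":
        # In-backout units of work are never offered for force commit.
        return ("CSQR023I", prompts)
    for reply in replies:
        prompts += 1                      # CSQR021D is issued
        if reply == "Y":
            return ("CSQR022I", prompts)  # old UOW committed
        if reply == "N":
            return ("CSQR023I", prompts)  # old UOW unchanged
        # Any other reply: CSQR029I is issued and the prompt is repeated.
    return (None, prompts)                # still waiting for a valid reply

print(resolve_old_uow("B", ["Y"]))
print(resolve_old_uow("F", ["maybe", "Y"]))
print(resolve_old_uow("D", ["N"]))
```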

CSQR026I
Long-running UOW shunted to RBA=rba, URID=urid connection name=name
Explanation

During checkpoint processing, an uncommitted unit of recovery was encountered that has been active for at least 3 checkpoints. The associated log records have been rewritten ('shunted') to a later point in the log, at RBA rba. The unit of recovery identifier urid together with the connection name name identify the associated thread.

System action

Processing continues.

System programmer response

Uncommitted units of recovery can lead to difficulties later, so consult with the application programmer to determine if there is a problem that is preventing the unit of recovery from being committed, and to ensure that the application commits work frequently enough.

CSQR027I
Long-running UOW shunting failed, URID=urid connection name=name
Explanation

During checkpoint processing, an uncommitted unit of recovery was encountered that has been active for at least 3 checkpoints. However, the associated log records could not be rewritten ('shunted') to a later point in the log. The unit of recovery identifier urid together with the connection name name identify the associated thread.

System action

The unit of recovery is not shunted, and will not participate in any future log shunting.

System programmer response

The most likely cause is that too few active log data sets are available. In that case, add more log data sets for the queue manager to use. Use the DISPLAY LOG command or the print log map utility (CSQJU004) to determine how many log data sets there are and what their status is.

Uncommitted units of recovery can lead to difficulties later, so consult with the application programmer to determine if there is a problem that is preventing the unit of recovery from being committed, and to ensure that the application commits work frequently enough.
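
To find the applications that most often hold long-running units of recovery, you could count CSQR026I and CSQR027I messages by connection name. The Python sketch below assumes the message layouts shown above; the function name and the sample log lines are invented for illustration.

```python
import re
from collections import Counter

# Captures the message identifier, the URID, and the connection name.
SHUNT_PATTERN = re.compile(
    r"(CSQR026I|CSQR027I).*?URID=(\S+) connection name=(\S+)"
)

def shunt_events_by_connection(log_lines):
    """Count shunt and shunt-failure messages for each connection name."""
    counts = Counter()
    for line in log_lines:
        match = SHUNT_PATTERN.search(line)
        if match:
            counts[match.group(3)] += 1
    return counts

# Made-up log lines, not taken from a real job log.
sample_log = [
    "CSQR026I Long-running UOW shunted to RBA=00000A1B2C3D, URID=000009F00000 connection name=PAYROLL1",
    "CSQR027I Long-running UOW shunting failed, URID=000009F11111 connection name=PAYROLL1",
    "CSQR026I Long-running UOW shunted to RBA=00000A1B4000, URID=000009F22222 connection name=BATCH02",
    "CSQR030I Forward recovery log range from RBA=000000001000 to RBA=000000005000",
]

print(shunt_events_by_connection(sample_log).most_common())
```

Connection names with the highest counts are the first candidates to discuss with the application programmer.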

CSQR029I
INVALID RESPONSE - NOT Y OR N
Explanation

The operator did not respond correctly to the reply message CSQR021D. Either 'Y' or 'N' must be entered.

System action

The original message is repeated.

CSQR030I
Forward recovery log range from RBA=from-rba to RBA=to-rba
Explanation

This indicates the log range that must be read to perform forward recovery during restart.

System action

Restart processing continues.

CSQR031I
Reading log forwards, RBA=rba
Explanation

This is issued periodically during restart recovery processing to show the progress of the forward recovery phase and the current status rebuild phase. For the forward recovery phase the log range that needs to be read is shown in the preceding CSQR030I message.

For the current status rebuild phase, the starting log RBA is shown in the preceding CSQR003I message and the end log RBA is shown in the preceding CSQJ099I message. The RBA represents the current position in the recovery log during the forward recovery phase or the current status rebuild phase.

System action

Restart processing continues.
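
The log range in CSQR030I and the position in CSQR031I can be combined into a rough progress figure for forward recovery. The Python sketch below assumes that RBAs are printed as hexadecimal digits and that the log is read from from-rba towards to-rba. The function name and the sample messages are hypothetical.

```python
import re

RANGE_PATTERN = re.compile(r"from RBA=([0-9A-Fa-f]+) to RBA=([0-9A-Fa-f]+)")
POSITION_PATTERN = re.compile(r"Reading log forwards, RBA=([0-9A-Fa-f]+)")

def forward_progress(range_message, position_message):
    """Return the fraction (0.0 to 1.0) of the forward recovery range read."""
    range_match = RANGE_PATTERN.search(range_message)
    position_match = POSITION_PATTERN.search(position_message)
    if range_match is None or position_match is None:
        raise ValueError("expected a CSQR030I and a CSQR031I message")
    start, end = (int(value, 16) for value in range_match.groups())
    current = int(position_match.group(1), 16)
    if end == start:
        return 1.0
    fraction = (current - start) / (end - start)
    # Clamp in case the reported position lies outside the stated range.
    return min(max(fraction, 0.0), 1.0)

# Made-up sample messages for illustration.
range_msg = "CSQR030I Forward recovery log range from RBA=000000001000 to RBA=000000005000"
position_msg = "CSQR031I Reading log forwards, RBA=000000003000"
print(forward_progress(range_msg, position_msg))
```

The figure measures log distance, not elapsed time, so treat it as a rough indicator only.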

CSQR032I
Backward recovery log range from RBA=from-rba to RBA=to-rba
Explanation

This indicates the log range that must be read to perform backward recovery during restart.

System action

Restart processing continues.

CSQR033I
Reading log backwards, RBA=rba
Explanation

This is issued periodically during restart recovery processing to show the progress of the backward recovery phase. The log range that needs to be read is shown in the preceding CSQR032I message.

System action

Restart processing continues.
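
Backward recovery reads the log in the opposite direction, so a progress estimate has to measure distance from the higher RBA. The Python sketch below takes the RBA values from CSQR032I and CSQR033I as hexadecimal strings. Because the text above does not state which end of the range is reported as from-rba, the sketch accepts either order. The function name and the sample values are hypothetical.

```python
def backward_progress(from_rba, to_rba, current_rba):
    """Return the fraction (0.0 to 1.0) of the backward recovery range read."""
    first, second, current = (int(v, 16) for v in (from_rba, to_rba, current_rba))
    high, low = max(first, second), min(first, second)
    if high == low:
        return 1.0
    # Reading starts at the high end of the range and moves towards the low end.
    fraction = (high - current) / (high - low)
    return min(max(fraction, 0.0), 1.0)

# Made-up RBA values for illustration.
print(backward_progress("000000005000", "000000001000", "000000004000"))
```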

CSQR034I
Backward migration detected
Explanation

During queue manager restart, it was detected that one or more of the connected page sets have been used at a higher version of queue manager code.

System action

The queue manager will automatically perform special processing during restart to alter any messages stored on those page sets so they can be read by the current version of the queue manager. This special processing is dependent on there being no unresolved units of work found at the end of restart, so you might be prompted by way of further messages during restart to force commit these.

Restart processing continues.