Investigating loops that cause transactions to abend with abend code AICA

If the loop causes a transaction to abend with abend code AICA, it must be either a tight loop or a non-yielding loop. You do not need to determine in advance which type you have; the investigation is likely to reveal it.

About this task

Both a tight loop and a non-yielding loop are characterized by being confined to a single user program. You should know the identity of the transaction to which the program belongs, because it is the transaction that abended with code AICA when the runaway task was detected.

Procedure

Get the documentation you need.
Look at the evidence.
Identify the loop, using information from the trace table and transaction dump.
Determine the reason for the loop.

Step 1. Get the documentation you need

When investigating loops that cause transactions to abend AICA, you need the CICS® system dump accompanying the abend. System dumping must be enabled for dump code AICA.

You can use the system dump to find out the following:

- Whether the loop is in your user code or in CICS code.
- If the loop is in your user code, the point at which the loop was entered.

It is also useful to have trace running, as trace can help you to identify the point in your program where looping started. If you have a non-yielding loop, it can probably also show you some instructions in the loop.

A tight loop is unlikely to contain many instructions, and you might be able to capture all the evidence you need from the record of events in the internal trace table. A non-yielding loop might contain more instructions, depending on the EXEC CICS commands it contains, but the internal trace table might still hold the evidence you need. If the internal trace table is not big enough, direct tracing to the auxiliary trace destination instead.

You need to trace CICS system activity selectively, to ensure that most of the data you obtain is relevant to the problem. Set up the tracing like this:
1. Select level 1 special tracing for the AP domain and for the EXEC interface program (EI).
2. Select special tracing for just the task that has the loop, and disable tracing for all other tasks by turning the main system trace flag off.
You can find guidance about setting up these tracing options in Using CICS trace for problem determination.
Start the task, and wait until it abends AICA.
Format the CICS system dump with formatting keywords KE and TR, to get the kernel storage areas and the internal trace table. See Formatting system dumps.

Results: You now have the documentation you need to find the loop.

Step 2. Look at the evidence

After you have collected the necessary documentation, use the following guidance to analyze the information you have gathered.

Look first at the kernel task summary. The runaway task is flagged “*YES*” in the ERROR column. The status of the task is shown as “***Running**”.
Use the kernel task number for the looping task to find its linkage stack.
- If a user task is looping, DFHAPLI, a transaction manager program, should be near the top of the stack. You are likely to find other CICS modules at the top of the stack that have been invoked in response to the abend, for example those associated with taking the dump.
- If you find any program or subroutine above DFHAPLI that has not been invoked in response to the error, it is possible that CICS code, or the code of another program, has been looping.

Results:

If you find that the loop is within CICS code, you need to contact IBM Support. Make sure you keep the dump, because the Support Center staff need it to investigate the problem.

If the kernel linkage stack entries suggest that the loop is in your user program, you next need to identify the loop.

Step 3. Identify the loop

To identify loops in user programs, you can look in the transaction dump, or you can use the trace table.

To identify a loop by using the transaction dump:

Find the program status word (PSW), and see whether it points into your program.
This is likely to be the case if you have a tight loop, and it should lead you to an instruction within the loop.
Use the module index at the end of the formatted dump to find the module name of the next instruction.
If the instruction address is not in your code, it is less useful for locating the loop. However, try to identify the module that contains the instruction, because it is probably the one that was called during the execution of a CICS request made within the loop. If the PSW address is not contained in one of these areas, another program was probably executing on behalf of CICS when the runaway task timer expired.
Note: It is possible that the loop is in a module owned by CICS or another product, and your program is not responsible for it. If the loop is in CICS code, contact IBM Support.
If the PSW points to a module outside your application program, find the address of the return point in your program from the contents of register 14 in the appropriate register save area.
The return address will lie within the loop, if the loop is not confined to system code.
When you have located a point within the loop, work through the source code and try to find the limits of the loop.
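
The address check described above can be sketched in a few lines of Python. This is only an illustration, not a CICS utility: the function name, the load point, the program length, and the addresses are all hypothetical values that you would replace with figures read from your own dump. It assumes 31-bit addressing, where the high-order bit of a PSW or register 14 address indicates the addressing mode and is not part of the address.

```python
from typing import Optional

# Assumption: 31-bit addressing, so the high-order bit is the
# addressing-mode indicator and must be dropped before comparing.
AMODE31_MASK = 0x7FFFFFFF


def offset_in_program(address: int, load_point: int, length: int) -> Optional[int]:
    """Return the offset of address within the program, or None if the
    address lies outside the range load_point .. load_point + length - 1."""
    addr = address & AMODE31_MASK
    if load_point <= addr < load_point + length:
        return addr - load_point
    return None


# Hypothetical values copied by hand from a dump.
psw_address = 0x9A2041C6
program_load_point = 0x1A204000
program_length = 0x1000

offset = offset_in_program(psw_address, program_load_point, program_length)
if offset is None:
    print("Address is outside the program; check register 14 instead.")
else:
    print(f"Address is at offset {offset:#x} in the program.")
```

If the function returns an offset, you can compare it with the compiler listing for the program to find the corresponding statement. If it returns None for the PSW address, repeat the check with the register 14 return address.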

To identify a loop by using the trace table:

Go to the last entry in the internal trace table, and work backwards until you get to an entry for point ID AP 1942.
The trace entry should have been made when recovery was entered after the transaction abended AICA.
Make a note of the task number, so you can check that any other trace entries you read relate to the same abended task.
Look at the entries preceding AP 1942. In particular, look for trace entries with the point ID AP 00E1.
These entries should have been made either just before the loop was entered (for a tight loop), or within the loop itself (for a non-yielding loop). Entries with a point ID of AP 00E1 are made on entry to the EXEC interface program (DFHEIP) whenever your program issues an EXEC CICS command, and again on exit from the EXEC interface program. Field B gives you the value of EIBFN, which identifies the specific command that was issued.
When you have identified the value of EIBFN, use the function code list in Function codes of EXEC CICS commands to identify the command that was issued.
For trace entries made on exit from DFHEIP, field A gives you the response code from the request. Look carefully at any response codes, because they could provide the clue to the loop.
Has the program been designed to deal with every possible response from DFHEIP? Could the response code you see explain the loop?

Results:

If you see a repeating pattern of trace points for AP 00E1, you have a non-yielding loop. If you can match the repeating pattern to statements in the source code for your program, you have identified the limits of the loop.

If you see no repeating pattern of trace points for AP 00E1, it is likely that you have a tight loop. The last entry for AP 00E1 (if there is one) should have been made from a point just before the program entered the loop. You might be able to recognize the point in the program where the request was made, by matching trace entries with the source code of the program.
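
If the trace is long, a short script can help you check for a repeating pattern. The following Python sketch assumes that you have already transcribed the EIBFN values from the AP 00E1 entries for the abended task into a list, oldest first. The function name, the threshold of three repetitions, and the sample values are illustrative assumptions and are not part of CICS.

```python
from typing import List, Optional, Sequence


def find_repeating_tail(codes: Sequence[str], min_repeats: int = 3) -> Optional[List[str]]:
    """Return the shortest block of codes that repeats at least
    min_repeats times at the end of the sequence, or None."""
    n = len(codes)
    for period in range(1, n // min_repeats + 1):
        block = list(codes[n - period:])
        repeats = 1
        # Walk backwards one block at a time while the blocks keep matching.
        while (repeats + 1) * period <= n:
            start = n - (repeats + 1) * period
            if list(codes[start:start + period]) != block:
                break
            repeats += 1
        if repeats >= min_repeats:
            return block
    return None


# Illustrative EIBFN values, oldest entry first.
eibfn_values = ["0E02", "0602", "0604", "0602", "0604", "0602", "0604"]

pattern = find_repeating_tail(eibfn_values)
if pattern is None:
    print("No repeating pattern found: a tight loop is likely.")
else:
    print("Repeating pattern (possible non-yielding loop):", pattern)
```

A result of None is consistent with a tight loop only if the trace covers the period up to the abend. Because each command produces both an entry and an exit trace entry, transcribe only one of the two for each command, or the pattern length is doubled.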

Step 4. Find the reason for the loop

When you have identified the limits of the loop, you need to find the reason why the loop occurred.

Assuming you have the trace, and EI level 1 tracing has been done, ensure that you can explain why each EIP entry is there. Verify that the responses are as expected.
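
As an aid to this check, the following Python sketch lists the exit entries whose response is not zero, so that you can compare each one with the way your program handles that response. It assumes that you have transcribed the AP 00E1 entries by hand and that a zero value in field A means normal completion. The record layout, the names, and the sample values are hypothetical.

```python
from typing import Iterable, List, NamedTuple


class EipTrace(NamedTuple):
    """One AP 00E1 entry, transcribed by hand from the formatted trace."""
    kind: str      # "ENTRY" or "EXIT"
    eibfn: str     # EIBFN value noted for the command, as hex text
    response: int  # field A on EXIT; use 0 for ENTRY records


def exits_to_review(entries: Iterable[EipTrace]) -> List[EipTrace]:
    """Return the EXIT records whose response is not zero."""
    return [e for e in entries if e.kind == "EXIT" and e.response != 0]


# Illustrative records only; the response value is made up.
trace = [
    EipTrace("ENTRY", "0602", 0),
    EipTrace("EXIT", "0602", 0x0D),
    EipTrace("ENTRY", "0602", 0),
    EipTrace("EXIT", "0602", 0x0D),
]

for entry in exits_to_review(trace):
    print(f"EIBFN {entry.eibfn}: response {entry.response:#04x}")
```

The same non-zero response appearing repeatedly for the same command suggests that the program retries the request without handling the condition.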

A good place to look for clues to loops is immediately before the loop sequence, the first time it is entered. Occasionally, a request that results in an unexpected return code can trigger a loop. However, you can usually see the last entry before the loop only if you have CICS auxiliary or GTF trace running, because the internal trace table is likely to wrap before the AICA abend occurs.