IBM Support

Long-Running Jobs, Jobs Processing Lots of Data, Prestart Jobs, and Large Mark Problems

Troubleshooting


Problem

This document provides information about a problem where jobs can intermittently end with message CPC1220 showing that message MCH3203 occurred. Other messages such as MCH1210 and MCH5601 can be received.

Resolving The Problem

In addition to the original problem of long-running jobs that use the ILE environment and end with messages CPC1220 and MCH3203 or MCH4422, this document provides general information about large mark problems. Large mark problems occur when an activation group mark, activation mark, or invocation mark exceeds the 4-byte limit. This limit was expanded to 8 bytes at the Machine Interface in R530.

Notes:
1. Activation and invocation marks were expanded to 8 bytes throughout the system in R530.

Situation: The following symptoms appear:
o The job ends abnormally with CPC1220 F/QWTPITPP T/*EXT; the message text is: Job ended abnormally because of error code MCH3203.
o The job produces the following VLOGs: 02000606 (invocation work area is full) and 02000307 (unhandled exception detected).
o The above VLOGs contain a large number of entries that look similar to the following:

ISF ADDRESS   FE6DA4589B 88A0C0       TYPE MI PROC
    RESUMEPOINT     1A027BC68F 004608     ENTRY@    1A027BC68F 0044D0
     PROGRAM  QLEDAGE
     MODULE   QLEIT
     ENTRY    ACT_CREATE__FUI + 00000138

The error is due to an overflow in the activation mark counter for the job, where the 4-byte mark value is being used rather than the 8-byte mark value. As of R530, there are separate mark counters for activation group marks, activation marks, and invocation marks. The mark values that they assign are 8 bytes long.
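
The overflow described above can be illustrated with plain integer arithmetic. The following is a minimal sketch, not system code: the function names are hypothetical, and whether a given consumer treats the 4-byte mark as signed or unsigned is an assumption that depends on the code involved. The truncation helper only shows what information a 4-byte field cannot hold; it does not model what the system does when the limit is exceeded.

```python
# Hypothetical helpers for illustration only; not an IBM i API.
INT32_MAX = 2**31 - 1    # largest value a signed 4-byte field can hold
UINT32_MAX = 2**32 - 1   # largest value an unsigned 4-byte field can hold


def fits_in_4_bytes(mark: int, signed: bool = True) -> bool:
    """Return True if a mark value can be stored in a 4-byte field."""
    limit = INT32_MAX if signed else UINT32_MAX
    return 0 <= mark <= limit


def low_order_4_bytes(mark: int) -> int:
    """Return only the low-order 4 bytes of a mark value."""
    return mark & 0xFFFFFFFF


large_mark = 2**32 + 5
print(fits_in_4_bytes(large_mark, signed=False))  # False
print(low_order_4_bytes(large_mark))              # 5
```

In this sketch, the low-order 4 bytes of the large mark equal 5, the same value as a much earlier mark, which is why a 4-byte field can no longer identify the activation uniquely.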

The activation group mark counter is process scoped. It is used to uniquely identify each activation group created within a process (job). It grows throughout the life of the process.

The activation mark counter is process scoped. It is used to uniquely identify each and every activation object for programs and service programs that are (re)activated within various activation groups in a process. It grows throughout the life of the process.*** When an activation group is destroyed, the activation objects associated with it are destroyed. The activation group and activation mark counters do not decrement at destroy time. This guarantees uniqueness throughout the life of the process.

The invocation mark counter is thread scoped. It is used to uniquely identify MI invocations in the thread's call stack, as needed. In general, invocation marks are used for messages and exceptions. The invocation mark counter grows throughout the life of the thread.

*** Note: When a job is restarted, all mark counters are reset to their starting values. A restart means that the process terminates and a new process is initiated. (When prestart jobs are recycled for reuse, the activation group and activation mark counters are not reset, nor is the initial thread's invocation mark counter. Recycled prestart jobs do not terminate the process and initiate a new one.)
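
The counter behavior described above can be summarized in a small toy model. This is a sketch under stated assumptions, not a description of the LIC implementation: the class and method names are hypothetical, the starting value of 0 is assumed (the actual starting values are not given here), and what happens to the activation groups themselves during a prestart job recycle is not modeled.

```python
class ProcessMarks:
    """Toy model of the process-scoped mark counters (hypothetical names)."""

    START = 0  # assumed starting value, for illustration only

    def __init__(self):
        self._reset()

    def _reset(self):
        self.activation_group_mark = self.START
        self.activation_mark = self.START
        self.live_groups = {}  # group mark -> activation marks in that group

    def create_activation_group(self):
        self.activation_group_mark += 1
        self.live_groups[self.activation_group_mark] = []
        return self.activation_group_mark

    def activate_program(self, group_mark):
        self.activation_mark += 1
        self.live_groups[group_mark].append(self.activation_mark)
        return self.activation_mark

    def destroy_activation_group(self, group_mark):
        # The activations go away with the group; no counter is decremented.
        del self.live_groups[group_mark]

    def recycle_prestart_job(self):
        # The same process is reused, so the counters keep their values.
        return None

    def restart_job(self):
        # The process ends and a new one starts: counters return to START.
        self._reset()


job = ProcessMarks()
for _ in range(3):
    group = job.create_activation_group()
    job.activate_program(group)
    job.destroy_activation_group(group)

print(job.activation_group_mark, job.activation_mark, len(job.live_groups))  # 3 3 0
job.recycle_prestart_job()
print(job.activation_mark)  # 3
job.restart_job()
print(job.activation_mark)  # 0
```

Even though no activation group is left alive after the loop, both counters have advanced, and only a true job restart returns them to the starting value.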

This scenario typically appears in very large, long-running jobs, such as server jobs or batch jobs that process large amounts of data.

Strategy to avoid or work around this problem:
1. Change to use APIs and MI Instructions that output and input 8-byte marks.
2. Reduce usage of activation marks to ensure the counter does not overflow
o Avoid running programs in activation group *NEW unless there is a functional reason to do so.
o Avoid deactivating/reactivating programs unnecessarily (for example, in ILE RPG turn the LR indicator on only when it is really necessary; otherwise, use RETURN).
o Split up data processing into multiple smaller jobs, if viable.
o End/restart server jobs as needed.

Displaying the Marks for an Active Job

First, you need to determine the TDE (task dispatching element) address for the job:

1. STRSST and sign in                                                  
2. Option 1. Start a service tool                                      
3. Option 4. Display/Alter/Dump                                        
4. Option 1. Display/Alter storage                                      
5. Option 2. Licensed Internal Code (LIC) data                          
6. Option 14. Advanced analysis                                        
7. Enter the following data:                                            
        LEVELOFPROTECTION                                              
        SMARTCHAIN                                                      
        SPINLOCKTRACE                                                  
        SSDSANITIZE                                                    
        SSLCONFIG                                                      
        SYNCTOKENINFO                                                  
        SYSTEMHEAPS                                                    
 1      TASKINFO                                                        
        TCPINFO                                                        
        TCPSECUREFIX                                                    
        TELNET                                                          
        USERNODALHEAPS                                                  
Press Enter                
8. Enter the following data, substituting the job number in question for the X's:
Options . . . . .   -names XXXXXX                                      
Press Enter  

You'll see something like:

Task     88: TDE=B077F0000FAAC000 TaskName=MITHREAD QPADEV0012MAUSER  
116833                                                                  
0:00.006 WaitObj=EC6E0AC0D10003D0 QCo-QuCounter                        
WaitObjCaller=FFFFFFFFC213B6F8                                          
                                                                       
9. Press F3=Exit                      

Make note of the TDE address (the value that follows TDE= in the output, similar to B077F0000FAAC000 above). It is used in the next procedure.

----------------------------------------------

ACTIVATIONINFO                                                          
                                                                       
1. STRSST and sign in                                                  
2. Option 1. Start a service tool                                      
3. Option 4. Display/Alter/Dump                                        
4. Option 2. Dump to printer                                            
5. Option 2. Licensed Internal Code (LIC) data                          
6. Option 14. Advanced analysis                                        
7. Enter the following data:                                            
 1      ACTIVATIONINFO                                                  
Press Enter                                                            
8. Enter the following data:                                            
Options . . . . .   -task XXXXXX  **REPLACE X's with the TDE address collected above
Press Enter                                                            
9. Press Enter                                                          
10. Press F3=Exit                                                      
11. Press F3=Exit  

----------------------------------------------

The spool file will contain the activation information for this job. The second column has the marks for the programs/service programs activated in the job.


About Marks: SQL and Prestart Jobs

Problems seen in a job after the mark counter starts generating large marks vary and depend on what mark-conscious code runs after the mark values have grown too large. Problems do not occur until a mark value that is too large is actually used. When SQL code uses a mark value that is too large, an MCH1210 occurs. Because MCH1210 is a generic exception, it does not necessarily indicate a large mark problem. The code generating the MCH1210 must be checked to verify that a mark value is being assigned to the receiver.

The exceptions listed below have been related to large mark problems:
CPF2508 Cannot move messages to the same or later call stack.
MCH5601 Template value not valid for instruction.
MCH4422 Program activation not found.
CPC1220 Job ended abnormally because of error code MCH3203.
MCH1210 f/QSQxxxxxx
MCH1210 f/CBLABRANCH t/QSQxxxxxx, where QSQxxxxxx could be any QSQ part that is using an activation mark, an activation group mark, or an invocation mark as a signed integer at the point of failure. Each MCH1210 must be checked to verify that a mark value caused the exception because MCH1210 is a generic overflow exception.
MCH4419 Activation group not found. Jobs take an unusually long time to end due to large marks affecting activation group cleanup.
MCH4419 F/AIProcess X/000930 T/QDMCOPEN X/04F0.
MCH4419 F/AiProcess X/000930 T/QLEADGE X/*STMT
SRC B5000105
SRC 6D200061
SRC B6000650

Prestart jobs are also subject to large mark problems. Prestart jobs that are used for SQL (for example, QZDASOINIT) are more likely to get an MCH1210 exception in QSQxxxxxx code. The difficulty with prestart jobs is that LIC cleanup does not occur when they are recycled for reuse, so the mark counters for recycled jobs are not reset.

In general, the workaround strategy includes one or more of the following:
1. Change to use APIs and MI Instructions that output and input 64-bit marks.
2. Reduce usage of activation marks to ensure the counter does not overflow
o Monitor the job's mark counter.
o For long-running jobs, end and start a new job.
o For jobs processing a large amount of data, break up the work into multiple jobs.
o Do activation group creation and activation at the beginning of the job.
o Use the user-state default activation group and named activation groups rather than *NEW activation groups.
o Avoid reclaiming activation groups if they must be re-created before the job ends.
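
When monitoring a job's mark counter as suggested above, a rough headroom estimate can help decide when to end and restart a job. The following is a minimal sketch, assuming two mark readings taken at known times (for example, from two ACTIVATIONINFO dumps). The function names, the sample numbers, and the default limit of 2**32 - 1 (an unsigned 4-byte consumer) are assumptions for illustration; code that treats the mark as a signed integer would reach its limit at 2**31 - 1.

```python
def mark_rate(mark_a: int, time_a: float, mark_b: int, time_b: float) -> float:
    """Estimate marks consumed per second from two readings (times in seconds)."""
    if time_b <= time_a:
        raise ValueError("second reading must be later than the first")
    return (mark_b - mark_a) / (time_b - time_a)


def seconds_until_limit(current_mark: int, marks_per_second: float,
                        limit: int = 2**32 - 1) -> float:
    """Estimate the seconds left before the counter passes the given limit."""
    if marks_per_second <= 0:
        raise ValueError("marks_per_second must be positive")
    remaining = max(limit - current_mark, 0)
    return remaining / marks_per_second


# Made-up sample: a job that consumes 500 activation marks per second.
days = seconds_until_limit(current_mark=0, marks_per_second=500) / 86400
print(f"about {days:.0f} days until a 4-byte unsigned limit is reached")
```

With these made-up numbers the estimate is roughly 99 days for the unsigned limit and about half that for a signed 4-byte consumer, which gives a sense of how long a busy server job can run before large marks appear.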

Historical Number

425200816

Document Information

Modified date:
25 November 2024

UID

nas8N1014796