IBM Support

Diagnosing and Debugging Heap Memory Problems: C2M1211 and C2M1212 messages

Troubleshooting


Problem

This document provides information that can help to diagnose problems indicated by a C2M1211 message or C2M1212 message in the job log.

Resolving The Problem

Diagnosing Memory Heap Problems: C2M1211 and C2M1212 messages

C2M1212 - The pointer parameter passed to free() or realloc() is not valid.
C2M1211 - The ILE C heap control structure has been corrupted.

These messages are often accompanied by a corresponding machine check detailing what caused the C2M1212 or C2M1211 to be flagged.

MCH6902 - The requested heap space operation is invalid.
MCH6903 - The heap space has reached its maximum allowable size.
MCH6906 - Invalid heap space condition detected. Internal dump identifier (ID) &1.

C2M1211 message

A C2M1211 message indicates that a teraspace version of the heap memory manager has detected that the heap control structure has been corrupted. The C2M1211 message can be caused by many things. The most common causes include:

o Freeing a space twice.
o Writing outside the bounds of allocated storage.
o Writing to storage that has been freed.

The C2M1211 message often indicates an application heap problem. These problems are often difficult to track down.
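
One way to catch writes outside the bounds of allocated storage during development is to place guard bytes after each allocation and check them before the storage is freed. The following is a minimal portable C sketch of that idea, not part of the ILE C runtime. The names guarded_alloc, guarded_check, GUARD_LEN, and GUARD_BYTE are hypothetical, and the caller is assumed to remember the requested size.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define GUARD_LEN  8      /* hypothetical: number of guard bytes after the block */
#define GUARD_BYTE 0xA5   /* hypothetical: fill pattern written into the guard   */

/* Allocate size usable bytes followed by GUARD_LEN guard bytes.
   Returns NULL if the request overflows or malloc() fails. */
void *guarded_alloc(size_t size)
{
    unsigned char *p;

    if (size > (size_t)-1 - GUARD_LEN)
        return NULL;
    p = malloc(size + GUARD_LEN);
    if (p == NULL)
        return NULL;
    memset(p + size, GUARD_BYTE, GUARD_LEN);
    return p;
}

/* Return 1 if the guard bytes after the block are intact,
   0 if something wrote past the end of the block. */
int guarded_check(const void *block, size_t size)
{
    const unsigned char *guard;
    size_t i;

    assert(block != NULL);
    guard = (const unsigned char *)block + size;
    for (i = 0; i < GUARD_LEN; i++) {
        if (guard[i] != GUARD_BYTE)
            return 0;
    }
    return 1;
}
```

Calling guarded_check() immediately before free() narrows an overrun down to the lifetime of one allocation. The storage is released with the ordinary free() function.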

C2M1212 message

A C2M1212 message indicates some type of memory problem, which can lead to memory corruption and other issues. The memory corruption could occur within application code or operating system code. The C2M1212 message is a diagnostic (informational) message: it does not stop the application, but it can indicate a problem that will eventually cause the application to fail. The C2M1212 message might or might not be the source of other problems. Clean up the memory problem if possible.

When a C2M1212 message is generated, the hexadecimal value of the pointer passed to the free() function is included as part of the message description. This hexadecimal value can provide clues as to the origin of the problem. The malloc() function returns only pointers that end in hexadecimal 0. Any pointer that does not end in hexadecimal 0 was either never set to point to storage reserved by the malloc() function or was modified since it was set to point to storage reserved by the malloc() function. If the pointer ends in hexadecimal 0, then the cause of the C2M1212 message is uncertain, and the program code that calls free() should be examined.
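
The "ends in hexadecimal 0" check can be applied mechanically to the pointer value copied from the message text. The following is a small C sketch under that assumption. The function name pointer_text_ends_in_zero is hypothetical, and the input is assumed to be the hexadecimal digits only, with no prefix or trailing blanks.

```c
#include <assert.h>
#include <string.h>

/* Classify a pointer value given as hexadecimal text.
   Returns  1 if the last hex digit is 0 (could have come from malloc()),
            0 if it is any other digit (never set from malloc(), or modified),
           -1 if the text is empty. */
int pointer_text_ends_in_zero(const char *hex)
{
    size_t len;

    assert(hex != NULL);
    len = strlen(hex);
    if (len == 0)
        return -1;
    return hex[len - 1] == '0' ? 1 : 0;
}
```

A result of 1 does not prove that the pointer is valid. It only means that the value alone does not rule out malloc() as its origin.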

In most cases, a C2M1212 message from a single-level store heap memory manager is preceded by an MCH6902 message. The MCH6902 message has an error code indicating what the problem is. The most common error code is 2, which indicates that memory is being freed which is not currently allocated. This error code could mean one of the following:

o Memory is being freed which has not been allocated.
o Memory is being freed for a second time.

In some cases, a memory leak can cause the single-level store heap to become fragmented to the point that the heap control segment is full and deallocations fail. This problem is indicated by an MCH6906 message. In this case, the only solution is to debug the application and fix the memory leak.

Stack Tracebacks

Stack tracebacks can be used to find the code which is causing the C2M1211/C2M1212. Once the code has been found, we must determine what the problem is with the pointer to the heap storage. This can be a difficult task. There are several potential causes:
1. The pointer was never initialized and contains an unexpected value. The C2M1212 message dumps the hex value of the pointer.
2. The pointer was not obtained from malloc(). Perhaps the pointer is a pointer to an automatic (local) variable or a static (global) variable and not a pointer to heap storage from malloc().
3. The pointer was modified after it was returned from malloc(). For example, if the pointer returned from malloc() was incremented by some amount and then passed to free(), it would be invalid, and a C2M1212 message would be issued.
4. The pointer is being passed a second time to free(). Once free() has been called with the pointer, the space pointed to by that pointer is deallocated. If free() is called against that pointer again, a C2M1212 message is issued.
5. The heap structure maintained by the heap manager to track heap allocations has been corrupted. In this case, the pointer is a valid pointer, but the heap manager cannot determine that. Thus, a C2M1212 message results. When the heap structure is corrupted, there is typically at least one C2M1211 message in the job log to indicate that heap corruption has occurred.
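
Causes 2, 3, and 4 can be caught in a debugging build by recording every pointer that malloc() returns and refusing to free anything that is not in that record. The following is a minimal C sketch of that technique. It is not how the ILE C heap manager works. The names tracked_malloc, tracked_free, and MAX_TRACKED are hypothetical, and the fixed-size table is not thread-safe.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define MAX_TRACKED 256             /* hypothetical table capacity */

static void *tracked[MAX_TRACKED];  /* live pointers handed out by tracked_malloc() */

/* Allocate storage and remember the pointer.
   Returns NULL if the table is full or malloc() fails. */
void *tracked_malloc(size_t size)
{
    size_t i;
    void *p;

    for (i = 0; i < MAX_TRACKED; i++) {
        if (tracked[i] == NULL)
            break;
    }
    if (i == MAX_TRACKED)
        return NULL;
    assert(tracked[i] == NULL);     /* the slot chosen above must be empty */
    p = malloc(size);
    if (p != NULL)
        tracked[i] = p;
    return p;
}

/* Free p only if it is a live pointer from tracked_malloc().
   Returns 0 on success (or if p is NULL), -1 if p was never returned
   by tracked_malloc(), was modified, or was already freed. */
int tracked_free(void *p)
{
    size_t i;

    if (p == NULL)
        return 0;
    for (i = 0; i < MAX_TRACKED; i++) {
        if (tracked[i] == p) {
            tracked[i] = NULL;
            free(p);
            return 0;
        }
    }
    return -1;
}
```

A return value of -1 marks the call site that would otherwise have produced a C2M1212 message, so a breakpoint or log statement there identifies the offending caller directly.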

Enabling Automatic Traceback Dumps

PTF SI11014 is required at V5R2M0 before stack tracebacks can be obtained. This functionality is included in the base operating system at V5R3M0 and above. When a C2M1211 or C2M1212 message is generated from a single-level store heap routine, the code checks for a *DTAARA named QGPL/QC2M1211 or QGPL/QC2M1212. If the data area exists, the program stack is dumped. If the data area does not exist, no dump is performed.

CRTDTAARA DTAARA(QGPL/QC2M1212) TYPE(*CHAR) LEN(1)
CRTDTAARA DTAARA(QGPL/QC2M1211) TYPE(*CHAR) LEN(1)

Once the data area is in place, a spool file named QPRINT is created with dump information for every C2M1211 message or C2M1212 message. The spool file is created for the user running the job that gets the message. For example, if the job getting the C2M1211 message or C2M1212 message is a server job or batch job running under userid ABC123, then the spool file is created in the output queue for userid ABC123. Once the spool files containing stack tracebacks are obtained, the data area can be removed, and the tracebacks analyzed.

To disable the dumps, delete the data area(s).

Analysis

The stack tracebacks can be used to find the code that is causing the problem. Here is an example stack traceback:

PROGRAM NAME   PROGRAM LIB   MODULE NAME   MODULE LIB    INST#   PROCEDURE  
QC2UTIL1       QSYS          QC2ALLOC      QBUILDSS1     000000  dump_stack__Fv
QC2UTIL1       QSYS          QC2ALLOC      QBUILDSS1     000000  free        
QYPPRT370      QSYS          DLSCTODF37    QBUILDSS1     000000  __dl__FPv  
FSOSA          ABCSYS        OSAACTS       FSTESTOSA     000000  FS_FinalizeDoc
ABCKRNL        ABCSYS        A2PDFUTILS    ABMOD_8       000000  PRT_EndDoc_Adb
ABCKRNL        ABCSYS        A2PDFUTILS    ABMOD_8       000000  PRT_EndDoc    
ABCKRNL        ABCSYS        A2ENGINE      ABMOD_8       000000  ABCReport_Start
ABCKRNL        ABCSYS        A2ENTRYPNT    ABMOD_8       000000  ABCReport_Run
ABCKRNL        ABCSYS        A2ENTRYPNT    ABMOD_8       000000  ABCReport_Entry
PRINTABC       ABCSYS        RUNBATCH      ABMOD_6       000000  main        
PRINTABC       ABCSYS        RUNBATCH      ABMOD_6       000000  _C_pep
QCMD           QSYS                                      000422

The first line is the header line, which shows the program name, program library, module name, module library, instruction number, procedure name, and statement number.

The first line under the header is always a dump_stack procedure - this procedure is generating the C2M1211 message or C2M1212 message.

The next line is the procedure which is calling the dump_stack procedure. Most of the time this is the free() procedure, but it could be realloc() or something else.

The next line is the __dl__FPv procedure, which is the procedure which handles the C++ delete operator. For C++ code, this procedure is often in the stack - for C code, it is not. The free() and delete functions are library routines which are freeing memory on behalf of the caller. They are not important in determining the source of the memory problem.

The line after the __dl__FPv procedure is where things get interesting. In this example, the procedure is called FS_FinalizeDoc. This code contains the incorrect call to delete (it is deleting an object which has been previously deleted/freed). The owner of that application needs to look at the source code for that procedure at the given statement number to determine what is being deleted/freed.

In some cases, this object is a local object of some type and it is easy to determine the problem. In other cases, the object can be passed to the procedure as a parameter and the caller of that procedure needs to be examined. In this case, the PRT_EndDoc_Adb procedure is the caller of FS_FinalizeDoc. For this example, the problem is in code within the ABCSYS library.
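
When many tracebacks have to be reviewed, the first step of this analysis (skipping the runtime frames to reach the first application frame) can be automated. The following C sketch assumes that each traceback line has the six whitespace-separated columns shown in the example and that frames in library QSYS are runtime frames. That second assumption is only a heuristic, since IBM code can also be at fault. The function name first_application_frame is hypothetical.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Scan traceback lines (without the header line) and copy into proc the
   procedure name of the first frame whose program library is not QSYS.
   Returns the index of that line, or -1 if no such frame is found. */
int first_application_frame(const char *const frames[], int count,
                            char *proc, size_t proc_size)
{
    int i;

    assert(frames != NULL && proc != NULL && proc_size > 0);
    for (i = 0; i < count; i++) {
        char pgm[33], lib[33], mod[33], modlib[33], inst[33], name[129];
        size_t n;

        /* Columns: program, program library, module, module library,
           instruction number, procedure. Lines with fewer columns are skipped. */
        if (sscanf(frames[i], "%32s %32s %32s %32s %32s %128s",
                   pgm, lib, mod, modlib, inst, name) != 6)
            continue;
        if (strcmp(lib, "QSYS") != 0) {
            n = strlen(name);
            if (n >= proc_size)
                n = proc_size - 1;
            memcpy(proc, name, n);
            proc[n] = '\0';
            return i;
        }
    }
    return -1;
}
```

Applied to the example traceback, the sketch skips the dump_stack__Fv, free, and __dl__FPv frames and stops at FS_FinalizeDoc.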

Operating System and LPP Errors

If the memory issue appears to be caused by operating system code, an IBM licensed program product, or an IBM supported command, the traceback stack(s) and the corresponding job log(s) should be sent to IBM for review.

Application Error Debugging Tips

If the error message or stack tracebacks indicate that there is an issue in a non-IBM application, the traceback stacks do not need to be sent to IBM for review. The stacks can be useful to the application programmer in determining the current state of their application, though. In certain situations, part of the pointer can be overwritten, which invalidates its value.
 
The job log should be reviewed for additional messages leading up to the heap errors. If you encounter other indications of memory errors in the job, like an MCH0601 or MCH3601, those should be addressed first, because the C2M1211/C2M1212 might be a result of that prior corruption:
https://www.ibm.com/support/pages/node/644069

Per the document above:
"The only way to debug this type of problem is to look at where the failure occurs. Then, backtrack starting with all parameters on any CALL, CALLB, CALLP, or EVALs that work with addresses in subprocedures. Review the lengths of the parameters in the called and calling program and any CALLs in these programs. Compile listings are a good place to start. If using subprocedures, review any EVAL statement that is assigning an address."

When these failures start happening in a non-IBM application, also consider any recent changes in the full application landscape (code changes, data changes, API changes, and so on). Again, the application that shows the message might not be the root cause of the issue. For example:
An upstream application creates a malformed message that, when passed downstream to an unchanged application, causes that application to start failing with MCH messages because it was not designed to handle the malformed message.

Note: If the application programmer is unable to resolve the problem in the application code, Expert Labs can be engaged for a fee.

Additional documentation and tools to consider:
Debug those mysterious problems with your application's memory:
https://www.ibm.com/support/pages/system/files/inline-files/i-mysterious_application-pdf.pdf

QMGTOOLS/STRPSC can be used to collect the PEX trace to review heap allocation.  The trace type would be *USRHEAP.  Here's the document: 
https://www.ibm.com/support/pages/node/6427019
 
IBM iDoctor for IBM i PEX Analyzer also provides tools for analyzing heap traces.  For example, this document provides example steps for identifying a heap leak from a trace:
https://www.ibm.com/support/pages/node/6463463

Debug Memory Manager

In general, most of these memory error messages are caused by application problems. There are some tools included in the ILE C Runtime that can help diagnose the issue. The Debug Memory Manager can help to find incorrect heap usage by an application. Also, the Heap Memory Manager can be used to dump more detailed information about the heap in the stack tracebacks when the C2M1212 or C2M1211 messages are generated by a teraspace heap routine. These functions are documented in section "Runtime Considerations" of the IBM i ILE C/C++ Runtime Library Functions manual, located here:
https://www.ibm.com/docs/en/ssw_ibm_i_76/pdf/sc415607.pdf

The heap memory manager and debug memory manager functionality was introduced in IBM i 7.1.

PEX Tracing

Note: Ensure that the current list of PEX PTFs has been loaded prior to tracing. They can be found here:
https://public.dhe.ibm.com/services/us/igsc/idoctor/html/downloadsV7R4.html

Another way to find the source of memory leaks and other memory problems is to use the Performance Explorer (PEX) and trace heap events. This trace captures all of the heap (malloc/realloc/free) operations. Below are some detailed notes on using PEX for these types of issues. This is not a trivial exercise and can be quite time consuming. Sometimes, though, it is the only way to resolve these types of memory problems.

Example

First, use the following commands to define and start a PEX session:

ADDPEXDFN DFN(TRHEAP) TYPE(*TRACE) JOB(*ALL) TASK(*ALL) MAXSTG(1000000) SLTEVT(*YES) STGEVT((*USRHEAP)) TEXT('User heap trace')

Note: This PEX Definition is for ALL jobs on the system. You can narrow this down to a specific job or jobs to limit the amount of data collected.

STRPEX SSNID(TRHEAP) DFN(TRHEAP)

Then, re-create the issue that is causing the error. Once the issue has been recreated, use the following command to end the PEX trace:

ENDPEX SSNID(TRHEAP) RPLDTA(*YES) TEXT('Heap data')

Use the following OVRDBF commands to convert the PEX data into a form that can be used with SQL.

Note: The default library for the PEX commands is QPEXDATA. If you use a different library to store the data, substitute it in the TOFILE parameter for the following commands.

OVRDBF FILE(QAYPEHEAP) TOFILE(QPEXDATA/QAYPEHEAP) MBR(TRHEAP)
OVRDBF FILE(QAYPETIDX) TOFILE(QPEXDATA/QAYPETIDX) MBR(TRHEAP)
OVRDBF FILE(QAYPETASKI) TOFILE(QPEXDATA/QAYPETASKI) MBR(TRHEAP)

Once the data has been collected, we will use SQL queries to sift through it. The STRSQL command will bring up a command line for entering SQL. These queries assume a working knowledge of SQL and the PEX data files. For more information on PEX and its corresponding data files, you should visit the Performance section of the IBM Knowledge Center:
https://www.ibm.com/docs/en/i/7.5?topic=performance

To locate all outstanding allocations in the trace, we need the following information:

o Their sizes and heap addresses
o Three levels of callers
o Heap name
o Record numbers

Thus, we will run the following query:
SELECT DISTINCT HEX(qhpasa),qhpasz,HEX(qhphcs),qhpopr,HEX(qhpca1),HEX(qhpca2),HEX(qhpca3),qrecn
FROM qaypeheap
WHERE qhpasa IN
(SELECT t.qhpasa FROM qaypeheap t
WHERE (t.qhpopr=0)
GROUP BY t.qhpasa
/* The inner COUNT(*) has no GROUP BY, so it returns 0 (not null) for an address that was never freed. */
HAVING COUNT(*) > (SELECT COUNT(*) FROM qaypeheap u
WHERE (u.qhpopr=1) AND t.qhpasa = u.qhpasa)
)

This query can be long running and can return too much data to be useful, especially if the problem is complex. The following set of queries can help to break the results down into a more manageable format.

Note: As stated before, these queries assume the trace data is stored in QPEXDATA. If the data is in a different library, replace QPEXDATA with the name of that library.

All Heap Operations

This query will retrieve the number of records stored in QAYPEHEAP.
SELECT COUNT(*) FROM qpexdata/qaypeheap

All calls to malloc()

The following view will contain the subset of all of the malloc() operations within QAYPEHEAP.

/*Delete any previous instances of the 'mallocs' view. */
DROP VIEW qpexdata/mallocs

/*Create the 'mallocs' view. */
CREATE VIEW qpexdata/mallocs AS SELECT qrecn, qhpasa, qhpasz
FROM qpexdata/qaypeheap WHERE qhpopr=0

/*This query will display all of the entries in the 'mallocs' view.*/
SELECT qrecn,HEX(qhpasa), qhpasz FROM qpexdata/mallocs

All calls to free()

The following view will contain the subset of all of the free() operations within QAYPEHEAP.

/*Delete any previous instances of the 'frees' view. */
DROP VIEW qpexdata/frees

/*Create the 'frees' view. */
CREATE VIEW qpexdata/frees
AS SELECT qrecn, qhpasa, qhpasz
FROM qpexdata/qaypeheap
WHERE qhpopr=1

/*This query will display all of the entries in the 'frees' view.*/
SELECT qrecn,HEX(qhpasa), qhpasz FROM qpexdata/frees


The last free() done for each memory location

The following view will contain the last free() done for each memory location

/*Delete any previous instance of the 'lastfrees' view. */
DROP VIEW qpexdata/lastfrees

/*Create the 'lastfrees' view. */
CREATE VIEW qpexdata/lastfrees
AS SELECT MAX(qrecn) AS lastfree_qrecn, qhpasa
FROM qpexdata/frees
GROUP BY qhpasa

/*This query will display all of the entries in the 'lastfrees' view.*/
SELECT lastfree_qrecn, HEX(qhpasa) FROM qpexdata/lastfrees

/*This query returns a count of the number of entries contained in 'lastfrees'.*/
SELECT COUNT(*) FROM qpexdata/lastfrees

Calls to malloc() from the beginning of the trace to the middle of the trace

The following view will contain all of the calls to malloc() that occurred from the beginning of the trace to the middle of the trace (This should eliminate looking at the mallocs that occurred right before ending the trace, before a free could be done)

Note: The following queries assume the PEX data did not wrap. If the PEX data has wrapped, you will need to manually calculate the midpoint and substitute it where we use "SELECT COUNT(*)/2 FROM qpexdata/qaypetidx"

/*Delete any previous instance of the 'earlymallocs' view. */
DROP VIEW qpexdata/earlymallocs

/*Create the 'earlymallocs' view.*/
CREATE VIEW qpexdata/earlymallocs
AS SELECT QRECN, qhpasa, qhpasz FROM qpexdata/qaypeheap
WHERE qhpopr=0 AND
qrecn < (SELECT COUNT(*)/2 FROM qpexdata/qaypetidx)

/*This query displays all of the entries in the 'earlymallocs' view.*/
SELECT qrecn, HEX(qhpasa), qhpasz FROM qpexdata/earlymallocs

The last call to malloc() done for each address

The following view will contain the subset of all of the last calls to malloc() for each memory address.

/*Delete any previous instance of the 'lastmalloconfirsthalf' view.*/
DROP VIEW qpexdata/lastmalloconfirsthalf

/*Create the 'lastmalloconfirsthalf' view.*/
CREATE VIEW qpexdata/lastmalloconfirsthalf
AS SELECT max(qrecn) AS lastmalloc_qrecn, qhpasa
FROM qpexdata/earlymallocs
GROUP BY qhpasa

/*This query displays all of the entries in the 'lastmalloconfirsthalf' view.*/
SELECT lastmalloc_qrecn, HEX(qhpasa) FROM qpexdata/lastmalloconfirsthalf


All calls to malloc() in the first half of the trace that do not have a corresponding call to free() in the ENTIRE trace

This view will contain the subset of the record numbers for all of the calls to malloc() done in the first half of the trace that do not have a corresponding call to free() in the entire trace.

/*Delete any previous instance of the 'allLeakedMallocs' view.*/
DROP VIEW qpexdata/allLeakedMallocs

/*Create the view 'allLeakedMallocs'.*/
CREATE VIEW qpexdata/allLeakedMallocs
AS SELECT m.lastmalloc_qrecn
FROM qpexdata/lastmalloconfirsthalf m
LEFT OUTER JOIN qpexdata/lastfrees f ON f.qhpasa = m.qhpasa
WHERE f.lastfree_qrecn IS NULL OR
f.lastfree_qrecn < m.lastmalloc_qrecn

/*This query displays all of the entries in the 'allLeakedMallocs' view.*/
SELECT * FROM qpexdata/allLeakedMallocs
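
The rule this section describes can be stated procedurally: for a given address, take the last malloc() in the first half of the trace and report it as leaked if no free() for that address has a higher record number. The following C sketch applies that rule to an in-memory array instead of the PEX files. The struct heap_event layout and the function name leaked_malloc_recn are hypothetical, record numbers are assumed to be non-negative, and an address that is never freed at all is treated as leaked.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical in-memory form of one QAYPEHEAP row (only the columns used here). */
struct heap_event {
    long recn;                /* record number (qrecn) */
    unsigned long long addr;  /* allocation address (qhpasa) */
    int op;                   /* operation (qhpopr): 0 = malloc(), 1 = free() */
};

/* Return the record number of the last malloc() of addr before midpoint
   if no later free() of addr exists anywhere in the trace, otherwise -1. */
long leaked_malloc_recn(const struct heap_event *ev, size_t n,
                        long midpoint, unsigned long long addr)
{
    long last_malloc = -1;    /* last malloc() in the first half of the trace */
    long last_free = -1;      /* last free() in the entire trace */
    size_t i;

    assert(ev != NULL || n == 0);
    for (i = 0; i < n; i++) {
        if (ev[i].addr != addr)
            continue;
        if (ev[i].op == 0 && ev[i].recn < midpoint && ev[i].recn > last_malloc)
            last_malloc = ev[i].recn;
        else if (ev[i].op == 1 && ev[i].recn > last_free)
            last_free = ev[i].recn;
    }
    if (last_malloc >= 0 && last_free < last_malloc)
        return last_malloc;
    return -1;
}
```

Restricting the malloc() side to the first half of the trace, while searching the whole trace for free(), avoids reporting allocations that were simply still in use when the trace ended.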

Now that we've created our views, we can combine them in our queries to produce useful results.

Example Queries

In some of these queries, we will be retrieving procedure names in the order that they are listed in the call stack. In order for these queries to work, we will need five overrides of the file QAYPEPROCI: one for each level of the call stack.
OVRDBF FILE(PROCI5) TOFILE(QPEXDATA/QAYPEPROCI) MBR(TRHEAP)
OVRDBF FILE(PROCI4) TOFILE(QPEXDATA/QAYPEPROCI) MBR(TRHEAP)
OVRDBF FILE(PROCI3) TOFILE(QPEXDATA/QAYPEPROCI) MBR(TRHEAP)
OVRDBF FILE(PROCI2) TOFILE(QPEXDATA/QAYPEPROCI) MBR(TRHEAP)
OVRDBF FILE(PROCI1) TOFILE(QPEXDATA/QAYPEPROCI) MBR(TRHEAP)

The following queries are only examples. There is no set order in which an application developer need run these, nor will they all be useful in each situation.

/*This query will return the procedure name from the 5th level in the call stack for each leaked call to malloc().*/
SELECT SUBSTR(e.qprpnm, 1, 100) AS lvl5
FROM proci5 e, qpexdata/qaypeheap
WHERE e.qprkey = qhpck5 AND
qrecn IN (SELECT * FROM qpexdata/allLeakedMallocs)
ORDER BY qrecn

/* This query will return all 5 levels of the call stack for each leaked call to malloc().*/
SELECT
SUBSTR(a.qprpnm, 1, 50) AS lvl1,
SUBSTR(b.qprpnm, 1, 50) AS lvl2,
SUBSTR(c.qprpnm, 1, 50) AS lvl3,
SUBSTR(d.qprpnm, 1, 50) AS lvl4,
SUBSTR(e.qprpnm, 1, 50) AS lvl5
FROM proci1 a, proci2 b, proci3 c, proci4 d, proci5 e, qpexdata/qaypeheap
WHERE a.qprkey = qhpck1 AND b.qprkey = qhpck2 AND c.qprkey = qhpck3
AND d.qprkey = qhpck4 AND e.qprkey = qhpck5 AND
qrecn IN (SELECT * FROM qpexdata/allLeakedMallocs)
ORDER BY qrecn

/*This query returns the 5 caller instructions for each leaked call to malloc().*/
SELECT HEX(qhpca1), HEX(qhpca2), HEX(qhpca3), HEX(qhpca4), HEX(qhpca5) FROM qpexdata/qaypeheap
WHERE qrecn IN (SELECT * FROM qpexdata/allLeakedMallocs)
ORDER BY qrecn

/* This query will show the number of leaked objects, grouped by the size in bytes */
SELECT COUNT(*) AS NUM_LEAKED, qhpasz FROM qpexdata/qaypeheap
WHERE qrecn IN (SELECT * FROM qpexdata/allLeakedMallocs)
GROUP BY qhpasz

/*Locate all the records for a particular memory address. E8E2C92E6810E6A0 is an example address. In its place, you would substitute a memory address identified in a previous query.*/
/*Each record will be accompanied by an operation (0-malloc(), 1-free()).*/
SELECT qhpopr, HEX(qhpasa), HEX(qhpca1), HEX(qhpca2), HEX(qhpca3),
HEX(qhpca4), HEX(qhpca5)
FROM qpexdata/qaypeheap
WHERE QHPASA = X'E8E2C92E6810E6A0'

/* Given a memory address, locate its caller information, timestamp, size and heap address. */
/* Again, E8E2C92E6810E6A0 is an example address. Substitute a memory address identified in a previous query. */
SELECT qtitsp, qhpasz, qhpopr, HEX(qhpca1), HEX(qhpca2), HEX(qhpca3)
FROM qpexdata/qaypeheap, qpexdata/qaypetidx
WHERE qhpasa = X'E8E2C92E6810E6A0'
AND qaypeheap.qrecn = qaypetidx.qrecn

/* Get all the different return codes in the trace. */
/* Note that this is only meaningful for MI heaps; SLIC heap operations don't have a return code. */
SELECT DISTINCT qhpret
FROM qpexdata/qaypeheap

/* This query returns data about records with a return code of 90. */
SELECT qhphcs,HEX(qhpasa),qhpasz,qhpopr,HEX(qhpca1),HEX(qhpca2),HEX(qhpca3), qhpmsc
FROM qpexdata/qaypeheap
WHERE qhpret = 90

/* Select the address, operation and caller for all the records for segment F2172FABB1000000: */
SELECT HEX(qhpasa), qhpopr, HEX(qhpca1)
FROM qpexdata/qaypeheap
WHERE SUBSTR(qhpasa,1,5) = X'F2172FABB1'

/* This shows how to select a timestamp, size, address, and control segment for all the records with timestamps of 3:04:01 PM on 4/18/2010: */
SELECT qtitsp, qhpasz, HEX(qhpasa), HEX(qhphcs)
FROM qpexdata/qaypeheap a, qpexdata/qaypetidx b
WHERE a.qrecn = b.qrecn
AND CHAR(qtitsp) LIKE '2010-04-18-15.04.01%'

/* Select the job number for a specific record number. Here, we use an example record number of 206343: */
SELECT qtsjnb
FROM qaypeheap a,qaypetidx b ,qaypetaski c
WHERE a.qrecn = b.qrecn AND a.qrecn = 206343
and b.qtiftc = c.qtstct

/*Get all of the information for the leaked allocations into a view. */
CREATE VIEW qpexdata/leakedMallocInfo
AS SELECT * FROM qpexdata/qaypeheap
WHERE qrecn IN (SELECT * FROM qpexdata/allLeakedMallocs)

/* Find the total size and number of allocations leaked by each call chain from the above set of queries: */
SELECT HEX(qhpca1), HEX(qhpca2), HEX(qhpca3), HEX(qhpca4), HEX(qhpca5),
SUM(qhpasz) AS AMT_LEAKED, COUNT(*) AS NUM_LEAKED,qhpasz
FROM qpexdata/leakedMallocInfo
GROUP BY qhpca1,qhpca2,qhpca3,qhpca4,qhpca5,qhpasz

These are only example queries. We encourage developers who are debugging memory corruption issues to modify these queries as they see fit and to write new queries to help them debug their specific issue.


Historical Number

585968101

Document Information

Modified date:
09 October 2025

UID

nas8N1011792