IBM Support

Diagnosing DFHSM0102 storage violations in CICS TS

Troubleshooting


Problem

You receive message DFHSM0102 A storage violation (code X'code') has been detected. The code that uniquely identifies the type of storage violation is likely to be X'0D11', X'0F0C', or X'030B'. The storage violation might also lead to various abends. You want to know how to troubleshoot the storage violation and determine its cause in CICS Transaction Server for z/OS (CICS TS).

Cause

A storage violation occurs when a transaction attempts to modify storage that it does not own. The overlay can be as small as one byte or can span multiple pages, and a large overlay can crash an entire CICS region. Storage violations fall into two classes: those detected and reported by CICS, and those not detected by CICS. As a result, overlays can manifest as a number of different abends and error messages in CICS, but the most common storage violation is identified by message DFHSM0102.
The DFHSM0102 message is issued when CICS has detected a storage violation because one of the following two situations has occurred:
  • The trailing storage accounting area (SAA) or the initial SAA of a TIOA storage element has become corrupted.
  • The leading storage check zone (SCZ) or the trailing SCZ of a user-task storage element has become corrupted.
SAAs and SCZs are eight-byte identifiers that are used to identify where storage that is used by the transaction should begin and end. CICS checks these identifiers upon certain events such as FREEMAINs to make sure that transactions are only modifying storage that is owned by the transaction. 
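
The check-zone idea can be illustrated with a toy model. The following Python sketch is not CICS code and does not reflect CICS internals; the function names are hypothetical, ASCII is used instead of EBCDIC for readability, and the only detail taken from this document is that the identifiers are eight bytes long and bracket the storage element.

```python
CHECK_ZONE_LEN = 8  # this document describes SAAs and SCZs as eight-byte identifiers


def build_element(zone: bytes, user_length: int) -> bytearray:
    """Lay out a toy storage element: leading zone + user data + trailing zone."""
    assert len(zone) == CHECK_ZONE_LEN
    return bytearray(zone + bytes(user_length) + zone)


def zones_intact(element: bytearray) -> bool:
    """Return True when the leading and trailing zones still match."""
    return element[:CHECK_ZONE_LEN] == element[-CHECK_ZONE_LEN:]


element = build_element(b"B0000515", user_length=16)
print(zones_intact(element))   # True

# A well-behaved write stays inside the 16 bytes of user data.
element[8:24] = b"A" * 16
print(zones_intact(element))   # True

# Writing 20 bytes into a 16-byte area runs 4 bytes into the trailing zone.
element[8:28] = b"X" * 20
print(zones_intact(element))   # False
```

In this model the overrun goes unnoticed until something compares the two zones, which mirrors why a violation is typically reported at an event such as a FREEMAIN rather than at the moment of the overlay.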

Diagnosing The Problem

Follow these steps to determine the cause of a storage violation in CICS TS:
 
  1. Locate the DFHSM0102 message. 
    The DFHSM0102 message contains an X'code', such as X'0D11', X'0F0C', or X'030B', that corresponds to a trace point in the CICS trace. Locate the DFHSM0102 message and note your specific X'code'. You will use it to find the trace entry that shows where the storage overlay is occurring.
     
    DFHSM0102 applid A storage violation (code X'code') has been detected by module modname 
    

  2. Locate the Storage manager domain trace point in the CICS TS documentation that corresponds with the X'code' in the DFHSM0102 message.
    For example, if the X'code' is X'0F0C', searching the documentation for "0F0C" takes you to trace point ID SM 0F0C, which explains each data area in the trace entry. In this case, data area 2 contains the "Address of the storage element" ...
     
    Point ID   Module      Lvl            Type                       Data     
    SM 0F0C    DFHSMAR     Exc     Storage check failure     1 SMAR parameter list 
                                                             2 Address of storage element  
                                                             3 Length of storage element 
                                                             4 First 512 bytes(max) of storage element
                                                             5 Last 512 bytes (max) of storage element 
                                                             6 Data preceding storage element (1K max)
                                                             7 Data following storage element (1K max)

  3. Use IPCS to view the CICS internal trace in the DFHSM0102 dump and find the trace entry that reports the exception.
    A system dump is taken for message DFHSM0102 as long as you do not specifically suppress dumps in the dump table. When viewing the dump using IPCS, enter command VERBX DFHPDnnn 'TR=2', where nnn is the version of CICS you are running. For example, use DFHPD720 for CICS TS 5.5.
     
    CICS Release        Verb Exit DFHPDnnn
       5.3                   700
       5.4                   710
       5.5                   720
       5.6                   730
       ...                   ...

    You can reference Webcast replay: CICS Debugging Basics - Using IPCS for CICS dump analysis for additional information on using IPCS. 
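
The release-to-suffix mapping above lends itself to a small lookup. The following Python sketch is only a convenience for building the command text; the function name is hypothetical, the values are copied from the table in this step, and releases that the table elides are deliberately not guessed.

```python
# Release-to-suffix values copied from the table in step 3.
DFHPD_SUFFIX = {"5.3": "700", "5.4": "710", "5.5": "720", "5.6": "730"}


def verbx_command(release: str, keywords: str = "TR=2") -> str:
    """Build the IPCS VERBX command string for a CICS TS release."""
    try:
        suffix = DFHPD_SUFFIX[release]
    except KeyError:
        raise ValueError(f"no DFHPDnnn suffix recorded for CICS TS {release}") from None
    return f"VERBX DFHPD{suffix} '{keywords}'"


print(verbx_command("5.5"))          # VERBX DFHPD720 'TR=2'
print(verbx_command("5.6", "LD=1"))  # VERBX DFHPD730 'LD=1'
```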

  4. Once you are viewing the CICS internal trace, search for "*EXC*" to find the exception trace entry associated with the storage check failure.
    Full trace entry from IPCS command VERBX DFHPDnnn 'TR=2':
    
    SM 0F0C SMAR *EXC* -Storage_check_failed_at_address -001007D0 
                                           FUNCTION(RELEASE_TRANSACTION_STG)  
    TASK-XM KE_NUM-0049 TCB-QR /008CCE88 RET-91C40B8E TIME-15:31:20.60 INTERVAL-0.0000011 =003705=
    1-00 00280000 000000D1 00000000 00000000 B0000000 00000000 02000100 00000000 *.......J.......* 
      20 00000000 00000000                                                       *........       *
    2-00 001007D0                                                                *...}           *
    3-00 000003D0                                                                *...}           * 
    4-00 C2F0F0F0 F0F5F1F5 00000000 00000000 00000000 00000000 00000000 00000000 *B0000515.......* 
      20 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 *...............*
      40 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 *...............* 
      60 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 *...............* 
      80 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 *...............* 
      A0 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 *...............* 
      C0 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 *...............* 
      E0 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 *...............* 
     100 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 *...............* 
     120 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 *...............* 
     140 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 *...............* 
     160 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 *...............* 
     180 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 *...............* 
     1A0 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 *...............* 
     1C0 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 *...............*
     1E0 00000000 00000000                                                       *........       *
    5-00 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 *...............*
      20 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 *...............*  
      40 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 *...............*
         . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . . . . .  *...............*
     160 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 *...............*
     180 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 *...............*
     1A0 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 *...............*
     1C0 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00400020 *............ ..*
     1E0 000400C8 00C800C8                                                       *...H.H.H       *
    (continued)

  5. Notice that the beginning and ending check zones of the storage element are also in the data areas of the trace entry.
    Continuing with the X'0F0C' example, data areas 4 and 5 contain the leading and trailing check zones of the storage element. Use these check zones to determine which SAA or SCZ has been violated. The full trace entry above shows a normal, non-violated leading check zone with the eight-byte identifier "B0000515" in the first line of data area 4:
     
    4-00 C2F0F0F0 F0F5F1F5 00000000 00000000 00000000 00000000 00000000 00000000 *B0000515.......* 

    The above full trace entry also shows a violated (overwritten) trailing check zone. Notice the last line of data area 5:

     1E0 000400C8 00C800C8                                                       *...H.H.H       *

    Each storage element should begin and end with the same eight-byte identifier.  It is possible to tell that this is a violated check zone because it does not end with the eight-byte identifier that is in the beginning check zone (B0000515).
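
The comparison described in this step can be reproduced outside IPCS. The following Python sketch assumes you have copied the two eight-byte values from data areas 4 and 5 by hand; the helper name is hypothetical, and `cp037` is Python's built-in EBCDIC codec, used here as a stand-in for the code page of the dump.

```python
# Values copied from data areas 4 and 5 of the example trace entry.
leading = bytes.fromhex("C2F0F0F0 F0F5F1F5")   # first 8 bytes of data area 4
trailing = bytes.fromhex("000400C8 00C800C8")  # last 8 bytes of data area 5


def describe(zone: bytes) -> str:
    """Show a zone as EBCDIC text when printable, otherwise as hex."""
    text = zone.decode("cp037")
    return text if text.isprintable() else zone.hex().upper()


print(describe(leading))    # B0000515
print(describe(trailing))   # 000400C800C800C8
print(leading == trailing)  # False, so the trailing zone has been overlaid
```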
     
  6. Review the storage that is being overlaid to determine if application data is overlaying the check zone.
    Many storage violations are solved by an application programmer or owner recognizing the data in the storage being overlaid. You might not initially recognize the data that has overlaid the check zone in question, but the data often follows a very distinct pattern that makes it easy to recognize.

    To view additional storage, browse the dump at the address of the beginning or the end of the storage element. The address and length of the storage element are in data areas 2 and 3 of the exception trace entry. Derive the end address by adding the length of the storage element to its address (001007D0 plus x'3D0' in this case). To browse, select option 1 from the IPCS primary option menu, then enter the address where you want to begin browsing storage.
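
The address arithmetic for this example can be worked through as follows. This Python sketch uses the values from data areas 2 and 3 of the trace entry; placing the trailing check zone in the last eight bytes of the element is an assumption drawn from where data area 5 shows it in this trace.

```python
element_address = 0x001007D0  # data area 2: address of storage element
element_length = 0x000003D0   # data area 3: length of storage element
CHECK_ZONE_LEN = 8            # assumed: trailing zone is the last 8 bytes

# First byte past the end of the storage element.
end_address = element_address + element_length
# Start of the trailing check zone, a useful place to begin browsing.
trailing_zone = end_address - CHECK_ZONE_LEN

print(f"{end_address:08X}")    # 00100BA0
print(f"{trailing_zone:08X}")  # 00100B98
```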

  7. Discover the return address (RET address) for the program that issued the getmain for the storage.
    To find the GETMAIN, first add x'8' to the address of the storage element reported in the exception trace entry (001007D0 plus x'8' in this case). Then, while viewing the trace, search for the resulting address (001007D8 in this case). This finds the trace entry for the exit of DFHSMMG. Next, back up in the trace to find the corresponding EIP ENTRY GETMAIN trace entry. Its RET value points to the program that issued the GETMAIN. Here is an example:
     
    AP 00E1  EIP ENTRY GETMAIN  REQ(0004) FIELD-A(00100488 ...h) FIELD-B(08000C02 ....)     
                    TASK-00515 KE_NUM-0049 TCB-QR/008CCE88 RET-500C10A2  =003633=
    SM 0C01 SMMG  ENTRY - FUNCTION(GETMAIN) GET_LENGTH(3BC) SUSPEND(YES)
                                                STORAGE_CLASS(USER24) CALLER(EXEC)
                    TASK-00515 KE_NUM-0049 TCB-QR/008CCE88 RET-928B1AAC =003634= 
    1-00 00800000 00000011 00000000 00000000 B6580000 00000000 02BF014E 1244FFA8 *............+.*
      20 00100478 0000034E 000003BC 00000360 1244FFA8 01441201 11FD1A88 0005B680 *....+...-..h..*
      40 0005BB74 12BF60A0 11DF4F30 00000000 000001E0 11E2FD70 0000003C 11E2FDAC *...-..\.S...S.*
      60 000001E0 0010003C 00000000 00400000 BE9ACE34 7DF9F27E 00C2E7C8 003C00EF *.\.. 92=.BXH..*
    
    . . . . . . . . . . . . . . 
    
    SM 0C02 SMMG EXIT  - FUNCTION(GETMAIN) RESPONSE(OK) ADDRESS(001007D8)
                    TASK-00515 KE_NUM-0049 TCB-QR/008CCE88 RET-928B1AAC =003636=
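
If the formatted trace has been saved as text, the search in this step can be sketched in Python. The function name is hypothetical, and the line layout is assumed to match the sample above (entry header on one line, RET- on the continuation line); real trace output may need looser matching.

```python
import re

# Header and continuation lines taken from the sample trace above.
TRACE = """\
AP 00E1  EIP ENTRY GETMAIN  REQ(0004) FIELD-A(00100488 ...h) FIELD-B(08000C02 ....)
                TASK-00515 KE_NUM-0049 TCB-QR/008CCE88 RET-500C10A2  =003633=
SM 0C01 SMMG  ENTRY - FUNCTION(GETMAIN) GET_LENGTH(3BC) SUSPEND(YES)
                TASK-00515 KE_NUM-0049 TCB-QR/008CCE88 RET-928B1AAC =003634=
SM 0C02 SMMG EXIT  - FUNCTION(GETMAIN) RESPONSE(OK) ADDRESS(001007D8)
                TASK-00515 KE_NUM-0049 TCB-QR/008CCE88 RET-928B1AAC =003636=
""".splitlines()


def getmain_ret(lines, element_address):
    """Return the RET value of the EIP ENTRY GETMAIN that obtained the element."""
    # The address handed back to the program is the element address plus 8.
    target = f"ADDRESS({element_address + 8:08X})"
    exit_idx = next((i for i, line in enumerate(lines)
                     if "SMMG" in line and "EXIT" in line and target in line), None)
    if exit_idx is None:
        return None
    # Walk backwards to the nearest EIP ENTRY GETMAIN entry.
    for i in range(exit_idx, -1, -1):
        if "EIP ENTRY GETMAIN" in lines[i]:
            match = re.search(r"RET-([0-9A-F]{8})", " ".join(lines[i:i + 2]))
            return match.group(1) if match else None
    return None


print(getmain_ret(TRACE, 0x001007D0))  # 500C10A2
```

In a busy trace the nearest preceding EIP ENTRY GETMAIN might belong to another task, so confirm that the TASK- number matches before relying on the result.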
    
    

  8. Use the loader domain to determine the program that issued the getmain for the storage. 
    In this case, program READUPDT issued the GETMAIN for the storage in question. The RET address reported in the GETMAIN trace entry that corresponds to the exception trace entry is 500C10A2. READUPDT is loaded below the 16 MB line, so only the low-order three bytes (0C10A2) form the address, which lies x'A2' bytes past the READUPDT load point of 000C1000.
     
    Output from IPCS command VERBX DFHPDnnn 'LD=1':
    
                                 PROGRAM STORAGE MAP                                        
    PGM NAME ENTRY PT  CSECT   LOAD PT. REL. PTF LVL. LAST COMPILED  COPY NO. USERS  LOCN  TYP
    IBMRSAP  800ABA20 -noheda- 000ABA20                                1       1     RDSA  RPL
    READUPDT 000C1000 DFHYA640 000C1000 640                            1       0     SDSA  RPL
    DFHSIP   11C554B8 DFHCICS  11C00000 640  HCI6400  I  02/03 08.21   1       0     ERGN  ANY 
                      DFHKEDCL 11C00200 640  UK04769  06/23/05 12.51
                      -noheda- 11C007F8
                      DFHKEDRT 11C00800 640  HCI6400  03/02/05 06.28
                      -noheda- 11C00DF8
                      DFHKESCL 11C00E00 640  HCI6400  03/02/05 06.28
    

    Alternatively, you can browse the dump data set starting at the RET address then scroll backwards until you find the eyecatcher of the module that issued the getmain.
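
The lookup in this step can also be sketched in Python. The load points below are copied from the program storage map above. Two things are assumptions: a RET value with the high-order bit off is treated as a 24-bit address, and the program with the closest load point at or below the address is taken as the owner, because the map excerpt shows load points but not program lengths. The function names are hypothetical.

```python
from bisect import bisect_right

# (load point, program name) pairs copied from the LD=1 program storage map.
LOAD_POINTS = sorted([
    (0x000ABA20, "IBMRSAP"),
    (0x000C1000, "READUPDT"),
    (0x11C00000, "DFHSIP"),
])
ADDRESSES = [load_point for load_point, _ in LOAD_POINTS]


def effective_address(ret: int) -> int:
    """Strip non-address bits from a 32-bit RET value."""
    if ret & 0x80000000:        # high bit on: 31-bit addressing mode
        return ret & 0x7FFFFFFF
    return ret & 0x00FFFFFF     # high bit off: assumed to be a 24-bit address


def owning_program(ret: int):
    """Return 'NAME+offset' for the closest load point at or below RET."""
    address = effective_address(ret)
    idx = bisect_right(ADDRESSES, address) - 1
    if idx < 0:
        return None             # address is below every known load point
    load_point, name = LOAD_POINTS[idx]
    return f"{name}+{address - load_point:X}"


print(owning_program(0x500C10A2))  # READUPDT+A2
```

Because the match is only a nearest-preceding heuristic, confirm the result against the module eyecatcher in the dump.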

Next Steps

If you are still unable to identify what is overlaying the storage, activate the Storage Violation Trap. You can activate the trap in two ways:

  • Code CHKSTSK=CURRENT in your System Initialization Table (SIT)
  • Enter CSFE DEBUG,CHKSTSK=CURRENT from a CICS terminal
 

Ensure that all trace components are set to standard level 1, except for EI and AP, which should be set to 1-2. Your trace table should be at least 6M. When the trap springs, you receive a DFHSM0103 dump:

DFHSM0103 applid A storage violation (code X'code') has been detected by the storage violation trap. Trap is now inactive.

The trap will be deactivated after the dump is taken.

To assist in debugging the DFHSM0103 dump, you can review the Webcast: Debugging Storage Violations in CICS.

Contacting IBM Support

If the above steps have not helped you determine what caused the storage violation, you might need to open a case with IBM CICS Support. The CICS documentation describes the specific diagnostic data that IBM CICS Support requires to debug storage violations.

Additional Information

Document Location

Worldwide


Document Information

Modified date:
20 April 2021

UID

ibm16361761