IBM Support

IV63695: MULTI-HIT ERAT MACHINE CHECK APPLIES TO AIX 7100-03

A fix is available

APAR status

  • Closed as program error.

Error description

  • System crash caused by the erroneous creation of an mmap alias
    translation of the wrong page size.  The problem requires the
    following conditions:

    1. 64KB MPSS support enabled (vmo tunable vmm_mpsize_support>=2)
    2. MAP_PRIVATE of a POSIX RT shared memory object
    3. Load references to a 64K aligned and sized part of the object
    4. Then call fork()
    5. Additional load references to the same 64K part of the object

    After the system crash, an error log entry similar to this one
    will be reported:

    (18)> errpt
    ERRORS NOT READ BY ERRDEMON (ORDERED CHRONOLOGICALLY):

    Error Record:
    erec_flags ..............        1
    erec_len ................       D0
    erec_timestamp .......... 52FB83C0
    erec_rec_len ............       AC
    erec_cid ................        0
    erec_dupcount ...........        0
    erec_duptime1 ...........        0
    erec_duptime2 ...........        0
    erec_rec.error_id ....... 56CDC3C8 MACHINE_CHECK_CHRP
    erec_rec.resource_name .. sysplanar0

      Machine Check - RTAS log Version 6 Details:
      Severity:        3 (Error Sync)
      Disposition:     2 (Not Recovered)
      Initiator:       1 (Cpu)
      Target:          0 (Unknown)
      Type:            0 (Unknown)
        - Unrecoverable error
      FRU ID:      2           Processor ID:     18
      Machine Check Type:   2 - ERAT Error
      No duplicate/overlapping entries detected in the SLB

    09000000 0001B980 80000000 0020F032  ............. .2
    06741000 00000080 C4008E00 00000000  .t..............
    00000000 49424D00 50480030 06000000  ....IBM.PH.0....
    00000000 00000000 00000000 00000000  ................
    48000003 00000000 00000000 00000000  H...............
    00000000 00000000 55480018 06000000  ........UH......
    10004000 00000000 00002000 00000000  ..@....... .....
    4D430028 06000000 00000002 00000012  MC.(............
    02820000 00000000 07000000 E3A24270  ..............Bp
    00000000 00000000                     ........
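    When correlating a crash with other logs, it can help to convert
    the erec_timestamp field of the error record (52FB83C0 in the
    sample entry) into a calendar time. The following is a minimal
    sketch, not part of the APAR. It assumes that the field is a
    hexadecimal count of seconds since the Unix epoch; the helper name
    format_erec_timestamp is hypothetical.

```c
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

/*
 * Assumption: erec_timestamp is a hexadecimal count of seconds since
 * the Unix epoch. format_erec_timestamp is a hypothetical helper name.
 * Writes a UTC time string into out; returns 0 on success, -1 on a
 * parse or conversion failure.
 */
int format_erec_timestamp(const char *hex, char *out, size_t outlen)
{
    char *end = NULL;
    unsigned long long secs = strtoull(hex, &end, 16);

    /* Reject empty input and trailing non-hex characters. */
    if (end == hex || *end != '\0')
        return -1;

    time_t t = (time_t)secs;
    struct tm *utc = gmtime(&t);
    if (utc == NULL)
        return -1;

    /* strftime returns 0 when the result does not fit in out. */
    if (strftime(out, outlen, "%Y-%m-%d %H:%M:%S UTC", utc) == 0)
        return -1;

    return 0;
}
```

    Under that assumption, passing "52FB83C0" yields
    2014-02-12 14:22:56 UTC.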

Local fix

Problem summary

  • System crash caused by the erroneous creation of an mmap alias
    translation of the wrong page size.  The triggering conditions and
    the resulting error log entry are the same as those listed under
    Error description.
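    The user-space access pattern in conditions 2 through 5 can be
    sketched as follows. This is an illustration of the sequence of
    calls only, not a verified reproducer from the APAR. The function
    name, the object name passed in, and the choice to place the 64K
    part at offset 0 of the object are assumptions, and the APAR does
    not say whether the additional load references come from the
    parent or the child, so the sketch performs them in both.
    Condition 1 is a system tunable and is not set by this code.

```c
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define REGION_SIZE ((size_t)64 * 1024)  /* one 64K-sized part of the object */

/* Issue one load per page across the region; volatile keeps the loads. */
static void load_region(const volatile unsigned char *p, size_t step)
{
    for (size_t off = 0; off < REGION_SIZE; off += step)
        (void)p[off];
}

/*
 * Hypothetical helper: walks through conditions 2-5 once.
 * Returns 0 if every step succeeded, -1 otherwise.
 */
int touch_private_shm_across_fork(const char *name)
{
    long page = sysconf(_SC_PAGESIZE);
    size_t step = (page > 0) ? (size_t)page : 4096;

    /* POSIX RT shared memory object, sized to exactly 64K. */
    int fd = shm_open(name, O_CREAT | O_RDWR, 0600);
    if (fd < 0)
        return -1;
    if (ftruncate(fd, (off_t)REGION_SIZE) != 0) {
        close(fd);
        shm_unlink(name);
        return -1;
    }

    /* Condition 2: MAP_PRIVATE mapping of the object (offset 0). */
    void *map = mmap(NULL, REGION_SIZE, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd);
    if (map == MAP_FAILED) {
        shm_unlink(name);
        return -1;
    }

    /* Condition 3: load references to the 64K part. */
    load_region(map, step);

    /* Condition 4: fork(). */
    pid_t pid = fork();
    if (pid < 0) {
        munmap(map, REGION_SIZE);
        shm_unlink(name);
        return -1;
    }
    if (pid == 0) {
        /* Condition 5, child side: further loads of the same part. */
        load_region(map, step);
        _exit(0);
    }

    /* Condition 5, parent side: further loads of the same part. */
    load_region(map, step);

    int status = 0;
    int rc = -1;
    if (waitpid(pid, &status, 0) == pid &&
        WIFEXITED(status) && WEXITSTATUS(status) == 0)
        rc = 0;

    munmap(map, REGION_SIZE);
    shm_unlink(name);
    return rc;
}
```

    A caller would invoke it with a name of its own choosing, for
    example touch_private_shm_across_fork("/iv63695_sketch"). On a
    system without the defect, or with the fix applied, the call is
    expected to simply return 0.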

Problem conclusion

  • Fix the page fault handler code that chooses the page size to
    use for an mmap source alias translation entry.
    

Temporary fix

Comments

  • 6100-09 - use AIX APAR IV63700
    7100-03 - use AIX APAR IV63695
    

APAR Information

  • APAR number

    IV63695

  • Reported component name

    AIX V7.1

  • Reported component ID

    5765H4000

  • Reported release

    710

  • Status

    CLOSED PER

  • PE

    NoPE

  • HIPER

    NoHIPER

  • Submitted date

    2014-08-18

  • Closed date

    2014-08-18

  • Last modified date

    2015-05-19

  • APAR is sysrouted FROM one or more of the following:

  • APAR is sysrouted TO one or more of the following:

    IV63700 IV63860

Fix information

  • Fixed component name

    AIX V7.1

  • Fixed component ID

    5765H4000

Applicable component levels

  • R710 PSY U865837

       UP15/05/19 I 1000

PTF to Fileset Mapping

Document Information

Modified date:
19 May 2015