Topic
  • 1 reply
  • Latest Post - ‏2010-11-02T15:04:50Z by SystemAdmin
SystemAdmin
228 Posts

Pinned topic IDS11.5 crashed on Windows

‏2010-11-02T02:57:53Z |
07:17:43 Maximum server connections 102
07:17:43 Checkpoint Statistics - Avg. Txn Block Time 0.000, # Txns blocked 0, Plog used 1016, Llog used 1308

07:18:25 Logical Log 412 Complete, timestamp: 0xda2e3c0.
07:21:33 Requested shared memory segment size rounded from 2432156KB to 2432192KB
07:21:33 shmat: EINVAL22: shared memory base address illegal
07:21:33 Contiguous shared memory segment allocation failed at 0x7FFF0000.
Allocation successful at 0xFFFFFFFF.
Check SHMBASE is consistent with the value in $INFORMIXDIR/etc/onconfig.std.
If you are using the correct SHMBASE value in your ONCONFIG file, then
consider this message informational only.
07:21:33 out of virtual shared memory

07:21:33 Assert Failed: Unhandled NT Exception!
07:21:33 See Also: C:\PROGRA~1\IBM\IBMINF~1\11.50\tmp\af.c5f4b7d
07:21:34 Assert Failed: Memory block header corruption detected in mt_shm_malloc_segid 9
07:21:34 IBM Informix Dynamic Server Version 11.50.TC6
07:21:34 Who: Session(1452, informix@MUMRVT11.cn.ibm.com, -1, 00000000)
Thread(1633, sqlexec, 0, 8)

File: mtshpool.c Line: 1533

SHMTOTAL 0

Q1: What caused the crash?
Q2: Can this issue be resolved by limiting SHMTOTAL?

The onconfig file and logs are in the attachment.


Updated on 2010-11-02T15:04:50Z at 2010-11-02T15:04:50Z by SystemAdmin
  • SystemAdmin
    228 Posts

    Re: IDS11.5 crashed on Windows

    ‏2010-11-02T15:04:50Z  
    Hi

    Your IDS instance crashed because of memory corruption:
    
    07:21:33 Assert Failed: Unhandled NT Exception!
    07:21:33 See Also: C:\PROGRA~1\IBM\IBMINF~1\11.50\tmp\af.c5f4b7d
    *07:21:34 Assert Failed: Memory block header corruption detected in mt_shm_malloc_segid 9*
    07:21:34 IBM Informix Dynamic Server Version 11.50.TC6
    07:21:34 Who: Session(1452, informix@MUMRVT11.cn.ibm.com, -1, 00000000)
    Thread(1633, sqlexec, 0, 8)
    


    It looks like it happened while an additional shared memory segment was being allocated.

    If we look at the messages above,
    
    07:21:33 *Requested shared memory segment size rounded from 2432156KB to 2432192KB*
    07:21:33 shmat: EINVAL22: *shared memory base address illegal*
    07:21:33 Contiguous shared memory segment allocation failed at 0x7FFF0000.
    Allocation successful at 0xFFFFFFFF.
    Check SHMBASE is consistent with the value in $INFORMIXDIR/etc/onconfig.std.
    If you are using the correct SHMBASE value in your ONCONFIG file, then
    consider this message informational only.
    07:21:33 out of virtual shared memory
    


    That looks odd, because you have the correct value for SHMBASE, as advised in the machine notes:

    
    RESIDENT 0
    SHMBASE 0xc000000L
    SHMVIRTSIZE 32656
    SHMADD 8192
    EXTSHMADD 8192
    #SHMTOTAL 0
    #change by sheng gao
    SHMTOTAL 1536000

    SHMVIRT_ALLOCSEG 0,3
    #SHMNOACCESS 0x70000000-0x7FFFFFFF
    


    It also tried to allocate a segment of about 2.4 GB of shared memory, which is definitely odd since you have SHMTOTAL set to about 1.5 GB.
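
The size mismatch can be checked with a little arithmetic. The following is only a minimal Python sketch using the two numbers quoted in this thread (the "rounded from ... to ..." log line and SHMTOTAL 1536000, both in KB). The helper names are made up for illustration, and the 2 GiB comparison assumes a 32-bit Windows process with the default user address space, which is an assumption about this system rather than something the logs state.

```python
import re

# Values copied from the thread; both are expressed in KB.
LOG_LINE = ("07:21:33 Requested shared memory segment size "
            "rounded from 2432156KB to 2432192KB")
SHMTOTAL_KB = 1536000

# Assumption: default user address space of a 32-bit Windows process (2 GiB).
ASSUMED_USER_SPACE_KB = 2 * 1024 * 1024


def requested_segment_kb(line):
    """Return (requested_kb, rounded_kb) from a 'rounded from' log line, or None."""
    m = re.search(r"rounded from (\d+)KB to (\d+)KB", line)
    return (int(m.group(1)), int(m.group(2))) if m else None


def kb_to_gib(kb):
    """Convert KB (1024 bytes) to GiB."""
    return kb / (1024 * 1024)


requested_kb, rounded_kb = requested_segment_kb(LOG_LINE)
print(f"segment request : {rounded_kb} KB ({kb_to_gib(rounded_kb):.2f} GiB)")
print(f"SHMTOTAL        : {SHMTOTAL_KB} KB ({kb_to_gib(SHMTOTAL_KB):.2f} GiB)")
print(f"exceeds SHMTOTAL: {rounded_kb > SHMTOTAL_KB}")
print(f"exceeds assumed 2 GiB user space: {rounded_kb > ASSUMED_USER_SPACE_KB}")
```

Under these assumptions the single requested segment is larger than both the configured SHMTOTAL and the assumed 2 GiB limit, which is why the request itself looks suspicious.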

    Have you enabled the option that allows IDS to use more than 2 GB of shared memory? (The text below is taken from the machine notes.)

    
    2.  Support maximum shared memory

        The database server can access shared-memory larger than two gigabytes on Windows. However, you must enable this feature with an entry in the Windows boot file.

        To add the entry, edit the boot.ini file (located in the top level, or root directory). You can either add a new boot option or use the currently existing boot option. To enable support for more than two gigabytes, add the following text to the end of the boot line:

        /3GB
    


    Crashes related to memory corruption are usually caused by a code defect. In this situation you should check the AF file: C:\PROGRA~1\IBM\IBMINF~1\11.50\tmp\af.c5f4b7d

    It should contain the failure stack. You can extract it and try to match it against known defects (APARs).

    It should also contain a set of 'onstat' outputs. Look at 'onstat -g ses 1452' to see what the session was doing at the moment of the crash. With this information you may be able to reproduce the issue.
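
If the message log is long, the assert failures, the AF file paths, and the session ids can be pulled out with a short script. This is only a rough Python sketch based on the line formats visible in this thread ("Assert Failed:", "See Also:", "Who: Session(...)"). The function name and the sample text are illustrative, and other IDS versions may format these lines differently.

```python
import re

# Sample lines copied from the message log quoted in this thread.
SAMPLE_LOG = r"""07:21:33 out of virtual shared memory
07:21:33 Assert Failed: Unhandled NT Exception!
07:21:33 See Also: C:\PROGRA~1\IBM\IBMINF~1\11.50\tmp\af.c5f4b7d
07:21:34 Assert Failed: Memory block header corruption detected in mt_shm_malloc_segid 9
07:21:34 Who: Session(1452, informix@MUMRVT11.cn.ibm.com, -1, 00000000)
"""

# Matches "HH:MM:SS <tag>: <rest>" for the three tags of interest.
LINE_RE = re.compile(r"^\d\d:\d\d:\d\d (Assert Failed|See Also|Who): (.*)$")


def summarize_asserts(log_text):
    """Collect assert messages, AF file paths, and session ids from log text."""
    summary = {"asserts": [], "af_files": [], "sessions": []}
    for line in log_text.splitlines():
        m = LINE_RE.match(line.strip())
        if not m:
            continue
        tag, rest = m.groups()
        if tag == "Assert Failed":
            summary["asserts"].append(rest)
        elif tag == "See Also":
            summary["af_files"].append(rest)
        else:  # "Who" line: take the first number inside Session(...)
            s = re.search(r"Session\((\d+),", rest)
            if s:
                summary["sessions"].append(int(s.group(1)))
    return summary


result = summarize_asserts(SAMPLE_LOG)
for key, values in result.items():
    print(key, values)
```

The session id found this way is the one to look up in the 'onstat -g ses' output saved in the AF file.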

    As a last resort, call IBM Informix Support.