IBM Support

LO45391: SERVER CRASHED OR HANG INTERMITTENTLY WHEN AGENT RUN WITH OLE32.DLL INVOLVED.

APAR status

  • Closed as fixed if next.

Error description

  • The customer initially reported that the server was hanging.
    Investigation of the debug files showed consistent semaphore
    locks (over 3500 log entries within a one-hour period):
    
    11/09/2009 09:01:10 CEDT sq="00001C86" THREAD [1120:0171-1428] WAITING FOR SEM 0xAE0C  (@015DB7F4) (OWNER=1120:13F8) FOR 30000 ms
    11/09/2009 09:01:15 CEDT sq="00001C87" THREAD [1120:012B-13C8] WAITING FOR SEM 0xAE0C  (@015DB7F4) (OWNER=1120:13F8) FOR 30000 ms
    :
    :
    11/09/2009 10:09:05 CEDT sq="00003454" THREAD [1120:0091-13C4] WAITING FOR SEM 0xAE0C  (@015DB7F4) (OWNER=1120:13F8) FOR 30000 ms
    11/09/2009 10:09:05 CEDT sq="00003455" THREAD [1120:011B-13F4] WAITING FOR SEM 0xAE0C  (@015DB7F4) (OWNER=1120:13F8) FOR 30000 ms
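
A count of semaphore waits like the one quoted above can be reproduced with a short script. The following Python sketch is illustrative only: the regular expression is inferred from the sample entries in this record rather than from a documented log format, and the name count_sem_waits is hypothetical. It groups waits by semaphore and owner so that a single dominant owner stands out.

```python
import re
from collections import Counter

# Pattern inferred from the sample entries in this record (an assumption,
# not a documented Domino log format).
SEM_WAIT = re.compile(
    r'THREAD \[(?P<waiter>[0-9A-Fa-f]+:[0-9A-Fa-f]+-[0-9A-Fa-f]+)\]\s+'
    r'WAITING FOR SEM\s+(?P<sem>0x[0-9A-Fa-f]+)\s+'
    r'\(@[0-9A-Fa-f]+\)\s+\(OWNER=(?P<owner>[0-9A-Fa-f]+:[0-9A-Fa-f]+)\)'
)

def count_sem_waits(text):
    """Count semaphore waits per (semaphore, owner) pair."""
    # Collapse all whitespace so that entries wrapped over several
    # lines are matched as if they were on a single line.
    flat = " ".join(text.split())
    return Counter((m["sem"], m["owner"]) for m in SEM_WAIT.finditer(flat))

# Two entries copied from this record, wrapped as in the original text.
sample = """
11/09/2009 09:01:10 CEDT sq="00001C86" THREAD [1120:0171-1428]
WAITING FOR SEM
0xAE0C  (@015DB7F4) (OWNER=1120:13F8) FOR 30000 ms
11/09/2009 09:01:15 CEDT sq="00001C87" THREAD [1120:012B-13C8]
WAITING FOR SEM
0xAE0C  (@015DB7F4) (OWNER=1120:13F8) FOR 30000 ms
"""

counts = count_sem_waits(sample)
for (sem, owner), n in counts.most_common():
    print(f"{sem} owned by {owner}: {n} waits")
```

In the entries quoted in this record, every wait names owner 1120:13F8, which corresponds to the nserver thread [1120: 13f8] in the NSD output that follows.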
    
    The locking thread can be traced to the SERVER task as shown
    below:
    
    Date         : Fri Sep 11 10:04:29 2009
    NSD Version  : 7.0.23.7347 (Release 7.0.2FP3)
    OS Version   : Windows Server 2003 5.2 (Build 3790), PlatID=2,
    Service Pack 2
    (8 Processors)
    Notes Version: Release 7.0.2FP3 (32-bit server)
    ############################################################
    ### thread 28/83: [ nserver:  1120:  13f8]
    ### FP=085dee60, PC=77521d8f, SP=085dee10
    ### stkbase=085e0000, total stksize=262144, used stksize=4592
    ############################################################
     [ 1] 0x77521d8f ole32.CLSIDFromString+214
    (2,35e000fc,35e000f8,0)
     [ 2] 0x7753a3f6 ole32.CoSetProxyBlanket+12222
    (2,35e001a0,35e000f8,80004002)
     [ 3] 0x7753a583 ole32.CoSetProxyBlanket+12619
    (35e000f8,85deec8,7753a675,1)
     [ 4] 0x7753a4f0 ole32.CoSetProxyBlanket+12472
    (1,0,35e000f8,85deed4)
     [ 5] 0x7753a675 ole32.CoSetProxyBlanket+12861
    (80000000,85deef4,7750f0e1,35e000f8)
     [ 6] 0x77540280 ole32.CoGetObjectContext+12515
    (35e000f8,85defa0,85defa4,2f85e8c8)
     [ 7] 0x7750f0e1 ole32.GetErrorInfo+2929
    (85def1c,774f19b4,2f85e8c8,4)
     [ 8] 0x77530cf0 ole32.CoGetComCatalog+9333
    (85defa0,85defa4,a9868,2f8bea60)
     [ 9] 0x77530b8f ole32.CoGetComCatalog+8980
    (a986c,0,7760c28e,85df084)
     [10] 0x77530c5b ole32.CoGetComCatalog+9184
    (a986c,a9868,a9868,a9868)
     [11] 0x77530d2a ole32.CoGetComCatalog+9391
    (85df014,774fbbec,a9868,4)
     [12] 0x775217da ole32.CoRegisterClassObject+1732
    (85df084,2f7f06c8,85df118,8007000e)
     [13] 0x77521786 ole32.CoRegisterClassObject+1648
    (85df0a0,6333b068,2f7f06c8,0)
     [14] 0x775211a3 ole32.CoRegisterClassObject+141
    (85df0e0,2f7f06c8,4,1)
    @[15] 0x600fd817
    nnotes.LSRcAdapterRegistry::OleCreateAdtClassFactory+295
    (37870f60,2f7f06c8,4,85df358)
    @[16] 0x63201724 nlsxbe.LsxMsgProc@12+404
    (10004,2ee4f35c,6333b068)
    @[17] 0x6013c8b3 nnotes.DLLNode::Register+35
    (2e641688,f3378d4,f340000,f34cb94)
    @[18] 0x6013c54a nnotes.LSIClassRegModule::AddLibrary+170
    (85df444,85df430,85df428,6013bfad)
    @[19] 0x6013c023 nnotes.LSISession::RegisterClassLibrary+19
    (f3378d4,85df444,f34cb94,85df430)
    @[20] 0x6013bfad nnotes.LSISession::RegisterClassLibrary+173
    (f3378d4,85df444,f34cb94,f333d98)
    @[21] 0x60a13a8c nnotes.LSCreateScriptSession@20+124
    (f333d98,60bfe278,60b4c8e8,0,0)
    @[22] 0x60a13afa nnotes.LSLotusScriptInit@4+26 (f333d98)
    @[23] 0x60a0d9bd nnotes.CLSIDocument::Init+29
    (f333d94,f333d94,605affa0,1f14e94c)
    @[24] 0x605b0688 nnotes.AgentRun@16+296 (f34e014,f333d94,0,10)
    @[25] 0x10059506 nserverl.ServerRunServerAgent@8+438
    (4e200019,5ea80017)
    @[26] 0x1001dd72 nserverl.DbServer@8+2354 (d5ba00e9,4e200019)
    @[27] 0x100314cd nserverl.WorkThreadTask@8+1485 (a,0)
    @[28] 0x100018a8 nserverl.Scheduler@4+744 (0)
    @[29] 0x60104370 nnotes.ThreadWrapper@4+208 (0)
     [30] 0x7c82482f kernel32.GetModuleHandleA+223 (0,0,0,0)
    
    As the stack of the locking thread shows, the lock was caused
    by an agent run that called into OLE32.DLL.
    
    Further investigation showed that agent 'test' ran
    just before semaphore locks were observed.
    
    Server.Task = Agent Manager: Executive '3': Running agent
    'test' in 'XXX.nsf': [09/11/2009 09:45:40 CEDT]
    
    and
    
    [1540:0002-1544] 11.09.2009 08:45:40   AMgr: Start executing
    agent
    'test' in 'XXX.nsf' by Executive '3'
    [1540:0002-1544] 11.09.2009 08:45:40   AMgr: 'ZZZ' is the agent
    signer of agent
    'test' in 'XXX.nsf'
    [1540:0002-1544] 11.09.2009 08:45:40   AMgr: 'Agent
    'test' in 'XXX.nsf' will run on behalf of 'ZZZ'
    [1540:0002-1544] 11.09.2009 08:45:40   AMgr: Agent
    'test' in database 'XXX.nsf' signed by 'ZZZ' is
    running in Full Administrator mode
    
    Analysis of shared memory did not reveal anything out of the
    ordinary.
    
    When the 'processUnCheckedDocumentsInPF' agent was disabled,
    the server did not hang but crashed instead.
    
    Investigation of the crash stack showed that the server crashed
    in the following task:
    
    Date            : Wed Oct 07 07:53:48 2009
    Arguments       : "C:\Programme\Lotus\Domino\nsd.exe"
    -dumpandkill -termstatus
    5 -crashpid 2816 -crashtid 2748
    NSD Version     : 7.0.23.8057 (Release 7.0.2FP3)
    OS Version      : Windows/2003 5.2 [32-bit] (Build 3790),
    PlatID=2, Service
    Pack 2 (8 Processors)
    Domino Version  : Release 7.0.2FP3 (32-bit server)
    ############################################################
    ### FATAL THREAD 51/86 [ nserver:  0b00:  0abc]
    ### FP=0x08e0efd4, PC=0x77530950, SP=0x08e0efb0
    ### stkbase=08e10000, total stksize=262144, used stksize=4176
    ### EAX=0x000004a5, EBX=0x00000000, ECX=0x00000000,
    EDX=0x776166a8
    ### ESI=0x00000000, EDI=0x00000000, CS=0x0000001b,
    SS=0x00000023
    ### DS=0x00000023, ES=0x00000023, FS=0x0000003b, GS=0x00000000
    Flags=0x00010246
    Exception code: c0000005 (ACCESS_VIOLATION)
    ############################################################
     [ 1] 0x77530950 ole32.CoGetComCatalog+8405
    (af68c,af688,af688,af688)
     [ 2] 0x77530d2a ole32.CoGetComCatalog+9391
    (8e0f014,774fbbec,af688,4)
     [ 3] 0x775217da ole32.CoRegisterClassObject+1732
    (8e0f084,d0ed0,8e0f118,8007000e)
     [ 4] 0x77521786 ole32.CoRegisterClassObject+1648
    (8e0f0a0,6333b068,d0ed0,0)
     [ 5] 0x775211a3 ole32.CoRegisterClassObject+141
    (8e0f0e0,d0ed0,4,1)
    @[ 6] 0x600fd817
    nnotes.LSRcAdapterRegistry::OleCreateAdtClassFactory+295
    (26c5a9b0,d0ed0,4,8e0f358)
    @[ 7] 0x63201724 nlsxbe.LsxMsgProc@12+404
    (10004,17a3e04c,6333b068)
    @[ 8] 0x6013c8b3 nnotes.DLLNode::Register+35
    (26943788,b2786d4,b270000)
    @[ 9] 0x6013c54a nnotes.LSIClassRegModule::AddLibrary+170
    (8e0f444,8e0f430,8e0f428)
    @[10] 0x6013c023 nnotes.LSISession::RegisterClassLibrary+19
    (b2786d4,8e0f444,b275114)
    @[11] 0x6013bfad nnotes.LSISession::RegisterClassLibrary+173
    (b2786d4,8e0f444,b275114)
    @[12] 0x60a13a8c nnotes.LSCreateScriptSession@20+124
    (b26ca98,60bfe278,60b4c8e8,0,0)
    @[13] 0x60a13afa nnotes.LSLotusScriptInit@4+26 (b26ca98)
    @[14] 0x60a0d9bd nnotes.CLSIDocument::Init+29 (b26ca94)
    @[15] 0x605b0688 nnotes.AgentRun@16+296 (b27f814,b26ca94,0,10)
    @[16] 0x10059506 nserverl.ServerRunServerAgent@8+438
    (11340010,74ac003b)
    @[17] 0x1001dd72 nserverl.DbServer@8+2354 (6e740138,11340010)
    @[18] 0x100314cd nserverl.WorkThreadTask@8+1485 (4,0)
    @[19] 0x100018a8 nserverl.Scheduler@4+744 (0)
    @[20] 0x60104370 nnotes.ThreadWrapper@4+208 (0)
     [21] 0x7c82482f kernel32.GetModuleHandleA+223
    
    Once again "XXX.nsf" was involved, and agent "test1" appeared
    to be involved as well.
    
    However, investigation of console.log showed that the "test1"
    agent had run successfully earlier.
    
    Although the two server outages are different (one a server
    hang and the other a server crash), the function calls that
    eventually lead to the reported outages are similar.
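
The similarity can be checked mechanically by comparing the caller-side frames of the two NSD stacks. The Python sketch below is a rough illustration: the frame pattern is inferred from the stack excerpts in this record, the helper names frames and shared_callers are hypothetical, and only a six-frame excerpt of each stack is used as sample input.

```python
import re

# Frame pattern inferred from the NSD excerpts in this record (assumption):
# "[ n] 0x<address> <module.function+offset>", optionally prefixed with "@".
FRAME = re.compile(r'\[\s*\d+\]\s+0x[0-9a-f]+\s+(\S+)')

def frames(stack_text):
    """Return the symbol of each frame, innermost first."""
    # Collapse whitespace so frames wrapped across lines still match.
    flat = " ".join(stack_text.split())
    return [m.group(1) for m in FRAME.finditer(flat)]

def shared_callers(a, b):
    """Return the frames both stacks share at the caller (outer) end."""
    shared = []
    for x, y in zip(reversed(a), reversed(b)):
        if x != y:
            break
        shared.append(x)
    return list(reversed(shared))

# Frames 10-15 of the hang stack, wrapped as in the original text.
hang_excerpt = """
 [10] 0x77530c5b ole32.CoGetComCatalog+9184
(a986c,a9868,a9868,a9868)
 [11] 0x77530d2a ole32.CoGetComCatalog+9391
(85df014,774fbbec,a9868,4)
 [12] 0x775217da ole32.CoRegisterClassObject+1732
(85df084,2f7f06c8,85df118,8007000e)
 [13] 0x77521786 ole32.CoRegisterClassObject+1648
(85df0a0,6333b068,2f7f06c8,0)
 [14] 0x775211a3 ole32.CoRegisterClassObject+141
(85df0e0,2f7f06c8,4,1)
@[15] 0x600fd817
nnotes.LSRcAdapterRegistry::OleCreateAdtClassFactory+295
(37870f60,2f7f06c8,4,85df358)
"""

# Frames 1-6 of the crash stack.
crash_excerpt = """
 [ 1] 0x77530950 ole32.CoGetComCatalog+8405
(af68c,af688,af688,af688)
 [ 2] 0x77530d2a ole32.CoGetComCatalog+9391
(8e0f014,774fbbec,af688,4)
 [ 3] 0x775217da ole32.CoRegisterClassObject+1732
(8e0f084,d0ed0,8e0f118,8007000e)
 [ 4] 0x77521786 ole32.CoRegisterClassObject+1648
(8e0f0a0,6333b068,d0ed0,0)
 [ 5] 0x775211a3 ole32.CoRegisterClassObject+141
(8e0f0e0,d0ed0,4,1)
@[ 6] 0x600fd817
nnotes.LSRcAdapterRegistry::OleCreateAdtClassFactory+295
(26c5a9b0,d0ed0,4,8e0f358)
"""

shared = shared_callers(frames(hang_excerpt), frames(crash_excerpt))
for symbol in shared:
    print(symbol)
```

In the full stacks quoted in this record, the two threads share every frame from ole32.CoGetComCatalog+9391 outward, through nnotes.AgentRun and nserverl.ServerRunServerAgent, and differ only in the innermost ole32 frames.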
    
    Further analysis showed that the crash above may match SPR#
    JSSI7PD47Q ("Agent running in server cause the server crash").
    However, that SPR has since been closed with no plans to fix
    it in Domino 6 and 7, and as not reproducible in Domino 8.
    

Local fix

Problem summary

Problem conclusion

Temporary fix

Comments

  • This APAR is associated with SPR# BKYP7WUDQG.
    If this issue occurs in a later release, a fix will be
    investigated.
    

APAR Information

  • APAR number

    LO45391

  • Reported component name

    DOMINO SERVER

  • Reported component ID

    5724E6200

  • Reported release

    700

  • Status

    CLOSED FIN

  • PE

    NoPE

  • HIPER

    NoHIPER

  • Special Attention

    NoSpecatt

  • Submitted date

    2009-10-15

  • Closed date

    2009-10-24

  • Last modified date

    2009-10-24

  • APAR is sysrouted FROM one or more of the following:

  • APAR is sysrouted TO one or more of the following:

Modules/Macros

  • NA
    

Fix information

Applicable component levels

  • R700 PSN UP


Document Information

Modified date:
24 October 2009