APAR status
Closed as fixed if next.
Error description
The customer initially reported that the server was hanging. Investigation of the debug files showed persistent semaphore waits (over 3500 log entries within an hour):

11/09/2009 09:01:10 CEDT sq="00001C86" THREAD [1120:0171-1428] WAITING FOR SEM 0xAE0C (@015DB7F4) (OWNER=1120:13F8) FOR 30000 ms
11/09/2009 09:01:15 CEDT sq="00001C87" THREAD [1120:012B-13C8] WAITING FOR SEM 0xAE0C (@015DB7F4) (OWNER=1120:13F8) FOR 30000 ms
:
:
11/09/2009 10:09:05 CEDT sq="00003454" THREAD [1120:0091-13C4] WAITING FOR SEM 0xAE0C (@015DB7F4) (OWNER=1120:13F8) FOR 30000 ms
11/09/2009 10:09:05 CEDT sq="00003455" THREAD [1120:011B-13F4] WAITING FOR SEM 0xAE0C (@015DB7F4) (OWNER=1120:13F8) FOR 30000 ms

The locking thread can be traced to the SERVER task as shown below:

Date : Fri Sep 11 10:04:29 2009
NSD Version : 7.0.23.7347 (Release 7.0.2FP3)
OS Version : Windows Server 2003 5.2 (Build 3790), PlatID=2, Service Pack 2 (8 Processors)
Notes Version: Release 7.0.2FP3 (32-bit server)

############################################################
### thread 28/83: [ nserver: 1120: 13f8]
### FP=085dee60, PC=77521d8f, SP=085dee10
### stkbase=085e0000, total stksize=262144, used stksize=4592
############################################################
[ 1] 0x77521d8f ole32.CLSIDFromString+214 (2,35e000fc,35e000f8,0)
[ 2] 0x7753a3f6 ole32.CoSetProxyBlanket+12222 (2,35e001a0,35e000f8,80004002)
[ 3] 0x7753a583 ole32.CoSetProxyBlanket+12619 (35e000f8,85deec8,7753a675,1)
[ 4] 0x7753a4f0 ole32.CoSetProxyBlanket+12472 (1,0,35e000f8,85deed4)
[ 5] 0x7753a675 ole32.CoSetProxyBlanket+12861 (80000000,85deef4,7750f0e1,35e000f8)
[ 6] 0x77540280 ole32.CoGetObjectContext+12515 (35e000f8,85defa0,85defa4,2f85e8c8)
[ 7] 0x7750f0e1 ole32.GetErrorInfo+2929 (85def1c,774f19b4,2f85e8c8,4)
[ 8] 0x77530cf0 ole32.CoGetComCatalog+9333 (85defa0,85defa4,a9868,2f8bea60)
[ 9] 0x77530b8f ole32.CoGetComCatalog+8980 (a986c,0,7760c28e,85df084)
[10] 0x77530c5b ole32.CoGetComCatalog+9184 (a986c,a9868,a9868,a9868)
[11] 0x77530d2a ole32.CoGetComCatalog+9391 (85df014,774fbbec,a9868,4)
[12] 0x775217da ole32.CoRegisterClassObject+1732 (85df084,2f7f06c8,85df118,8007000e)
[13] 0x77521786 ole32.CoRegisterClassObject+1648 (85df0a0,6333b068,2f7f06c8,0)
[14] 0x775211a3 ole32.CoRegisterClassObject+141 (85df0e0,2f7f06c8,4,1)
@[15] 0x600fd817 nnotes.LSRcAdapterRegistry::OleCreateAdtClassFactory+295 (37870f60,2f7f06c8,4,85df358)
@[16] 0x63201724 nlsxbe.LsxMsgProc@12+404 (10004,2ee4f35c,6333b068)
@[17] 0x6013c8b3 nnotes.DLLNode::Register+35 (2e641688,f3378d4,f340000,f34cb94)
@[18] 0x6013c54a nnotes.LSIClassRegModule::AddLibrary+170 (85df444,85df430,85df428,6013bfad)
@[19] 0x6013c023 nnotes.LSISession::RegisterClassLibrary+19 (f3378d4,85df444,f34cb94,85df430)
@[20] 0x6013bfad nnotes.LSISession::RegisterClassLibrary+173 (f3378d4,85df444,f34cb94,f333d98)
@[21] 0x60a13a8c nnotes.LSCreateScriptSession@20+124 (f333d98,60bfe278,60b4c8e8,0,0)
@[22] 0x60a13afa nnotes.LSLotusScriptInit@4+26 (f333d98)
@[23] 0x60a0d9bd nnotes.CLSIDocument::Init+29 (f333d94,f333d94,605affa0,1f14e94c)
@[24] 0x605b0688 nnotes.AgentRun@16+296 (f34e014,f333d94,0,10)
@[25] 0x10059506 nserverl.ServerRunServerAgent@8+438 (4e200019,5ea80017)
@[26] 0x1001dd72 nserverl.DbServer@8+2354 (d5ba00e9,4e200019)
@[27] 0x100314cd nserverl.WorkThreadTask@8+1485 (a,0)
@[28] 0x100018a8 nserverl.Scheduler@4+744 (0)
@[29] 0x60104370 nnotes.ThreadWrapper@4+208 (0)
[30] 0x7c82482f kernel32.GetModuleHandleA+223 (0,0,0,0)

As can be seen from the locking thread, the lock was caused by running an agent that involved OLE32.DLL. Further investigation showed that agent 'test' ran just before the semaphore locks were observed.

Server.Task = Agent Manager: Executive '3': Running agent 'test' in 'XXX.nsf': [09/11/2009 09:45:40 CEDT]

and

[1540:0002-1544] 11.09.2009 08:45:40 AMgr: Start executing agent 'test' in 'XXX.nsf' by Executive '3'
[1540:0002-1544] 11.09.2009 08:45:40 AMgr: 'ZZZ' is the agent signer of agent 'test' in 'XXX.nsf'
[1540:0002-1544] 11.09.2009 08:45:40 AMgr: 'Agent 'test' in 'XXX.nsf' will run on behalf of 'ZZZ'
[1540:0002-1544] 11.09.2009 08:45:40 AMgr: Agent 'test' in database 'XXX.nsf' signed by 'ZZZ' is running in Full Administrator mode

Analysis of shared memory did not reveal anything out of the ordinary. When the 'processUnCheckedDocumentsInPF' agent was disabled, the server did not hang but crashed instead. Investigation of the crash stack showed that the server crashed with the following task:

Date : Wed Oct 07 07:53:48 2009
Arguments : "C:\Programme\Lotus\Domino\nsd.exe" -dumpandkill -termstatus 5 -crashpid 2816 -crashtid 2748
NSD Version : 7.0.23.8057 (Release 7.0.2FP3)
OS Version : Windows/2003 5.2 [32-bit] (Build 3790), PlatID=2, Service Pack 2 (8 Processors)
Domino Version : Release 7.0.2FP3 (32-bit server)

############################################################
### FATAL THREAD 51/86 [ nserver: 0b00: 0abc]
### FP=0x08e0efd4, PC=0x77530950, SP=0x08e0efb0
### stkbase=08e10000, total stksize=262144, used stksize=4176
### EAX=0x000004a5, EBX=0x00000000, ECX=0x00000000, EDX=0x776166a8
### ESI=0x00000000, EDI=0x00000000, CS=0x0000001b, SS=0x00000023
### DS=0x00000023, ES=0x00000023, FS=0x0000003b, GS=0x00000000 Flags=0x00010246
Exception code: c0000005 (ACCESS_VIOLATION)
############################################################
[ 1] 0x77530950 ole32.CoGetComCatalog+8405 (af68c,af688,af688,af688)
[ 2] 0x77530d2a ole32.CoGetComCatalog+9391 (8e0f014,774fbbec,af688,4)
[ 3] 0x775217da ole32.CoRegisterClassObject+1732 (8e0f084,d0ed0,8e0f118,8007000e)
[ 4] 0x77521786 ole32.CoRegisterClassObject+1648 (8e0f0a0,6333b068,d0ed0,0)
[ 5] 0x775211a3 ole32.CoRegisterClassObject+141 (8e0f0e0,d0ed0,4,1)
@[ 6] 0x600fd817 nnotes.LSRcAdapterRegistry::OleCreateAdtClassFactory+295 (26c5a9b0,d0ed0,4,8e0f358)
@[ 7] 0x63201724 nlsxbe.LsxMsgProc@12+404 (10004,17a3e04c,6333b068)
@[ 8] 0x6013c8b3 nnotes.DLLNode::Register+35 (26943788,b2786d4,b270000)
@[ 9] 0x6013c54a nnotes.LSIClassRegModule::AddLibrary+170 (8e0f444,8e0f430,8e0f428)
@[10] 0x6013c023 nnotes.LSISession::RegisterClassLibrary+19 (b2786d4,8e0f444,b275114)
@[11] 0x6013bfad nnotes.LSISession::RegisterClassLibrary+173 (b2786d4,8e0f444,b275114)
@[12] 0x60a13a8c nnotes.LSCreateScriptSession@20+124 (b26ca98,60bfe278,60b4c8e8,0,0)
@[13] 0x60a13afa nnotes.LSLotusScriptInit@4+26 (b26ca98)
@[14] 0x60a0d9bd nnotes.CLSIDocument::Init+29 (b26ca94)
@[15] 0x605b0688 nnotes.AgentRun@16+296 (b27f814,b26ca94,0,10)
@[16] 0x10059506 nserverl.ServerRunServerAgent@8+438 (11340010,74ac003b)
@[17] 0x1001dd72 nserverl.DbServer@8+2354 (6e740138,11340010)
@[18] 0x100314cd nserverl.WorkThreadTask@8+1485 (4,0)
@[19] 0x100018a8 nserverl.Scheduler@4+744 (0)
@[20] 0x60104370 nnotes.ThreadWrapper@4+208 (0)
[21] 0x7c82482f kernel32.GetModuleHandleA+223

Once again "XXX.nsf" was involved, and agent "test1" appeared to be involved as well. However, investigation of console.log showed that the "test1" agent had run successfully earlier.

Although the two server outages are different (one a server hang and the other a server crash), there are similarities in the function calls that eventually led to the outages reported. Further analysis showed that the crash above may be a match for SPR# JSSI7PD47Q - Agent running in server cause the server crash. However, that SPR has since been closed with no plans to fix in Domino 6 and 7, and as not reproducible in Domino 8.
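The step from the semaphore wait entries to the locking thread in the NSD can be sketched in a few lines of Python. This is only an illustration, not a Domino or NSD tool: the regular expression is inferred solely from the four log lines quoted in this APAR, and the names SEM_WAIT, count_waits_by_owner and owner_to_nsd_thread are hypothetical. It counts the waits per (semaphore, owner) pair and rewrites the OWNER value into the lowercase "pid: tid" form that appears in the NSD thread header.

```python
import re
from collections import Counter

# Pattern inferred from the sample lines in this APAR only (an assumption,
# not a documented log format).
SEM_WAIT = re.compile(
    r"THREAD \[(?P<waiter>[0-9A-F]+:[0-9A-F]+-[0-9A-F]+)\] "
    r"WAITING FOR SEM (?P<sem>0x[0-9A-F]+) "
    r"\(@(?P<addr>[0-9A-F]+)\) "
    r"\(OWNER=(?P<owner>[0-9A-F]+:[0-9A-F]+)\) "
    r"FOR (?P<ms>\d+) ms"
)


def count_waits_by_owner(lines):
    """Count semaphore wait entries per (semaphore, owner) pair."""
    counts = Counter()
    for line in lines:
        match = SEM_WAIT.search(line)
        if match:
            counts[(match.group("sem"), match.group("owner"))] += 1
    return counts


def owner_to_nsd_thread(owner):
    """Turn an OWNER value such as '1120:13F8' into the 'pid: tid'
    form used in the NSD thread header quoted above."""
    pid, tid = owner.split(":")
    return f"{pid.lower()}: {tid.lower()}"


# The four wait entries quoted in the error description.
sample = [
    '11/09/2009 09:01:10 CEDT sq="00001C86" THREAD [1120:0171-1428] WAITING FOR SEM 0xAE0C (@015DB7F4) (OWNER=1120:13F8) FOR 30000 ms',
    '11/09/2009 09:01:15 CEDT sq="00001C87" THREAD [1120:012B-13C8] WAITING FOR SEM 0xAE0C (@015DB7F4) (OWNER=1120:13F8) FOR 30000 ms',
    '11/09/2009 10:09:05 CEDT sq="00003454" THREAD [1120:0091-13C4] WAITING FOR SEM 0xAE0C (@015DB7F4) (OWNER=1120:13F8) FOR 30000 ms',
    '11/09/2009 10:09:05 CEDT sq="00003455" THREAD [1120:011B-13F4] WAITING FOR SEM 0xAE0C (@015DB7F4) (OWNER=1120:13F8) FOR 30000 ms',
]

if __name__ == "__main__":
    for (sem, owner), n in count_waits_by_owner(sample).most_common():
        print(sem, owner, n, "-> NSD thread", owner_to_nsd_thread(owner))
```

If a single owner dominates the count, searching the NSD for the converted "pid: tid" string should lead to the stack of the thread holding the semaphore, which is how the nserver thread above was identified.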
Local fix
Problem summary
Problem conclusion
Temporary fix
Comments
This APAR is associated with SPR# BKYP7WUDQG. If this issue occurs in a later release, a fix will be investigated.
APAR Information
APAR number
LO45391
Reported component name
DOMINO SERVER
Reported component ID
5724E6200
Reported release
700
Status
CLOSED FIN
PE
NoPE
HIPER
NoHIPER
Special Attention
NoSpecatt
Submitted date
2009-10-15
Closed date
2009-10-24
Last modified date
2009-10-24
APAR is sysrouted FROM one or more of the following:
APAR is sysrouted TO one or more of the following:
Modules/Macros
NA
Fix information
Applicable component levels
R700 PSN
UP
Document Information
Modified date:
24 October 2009