IBM Support

LO63255: CUSTOMERS DOMINO SERVER GENERATES AN NSD DURING SHUTDOWN DUE TO NSF MONITORS CAUSING NROUTER TO NOT SHUTDOWN

APAR status

  • Closed as program error.

Error description

  • The customer shuts down the Domino server and the server tasks
    shut down; however, the nRouter task does not shut down
    completely. This results in an NSD -nomemcheck, because the
    server did not shut down within the time specified in the
    Server document.
    
    In this case the server shutdown timeout was set to 5 minutes
    (300 seconds).
    
    The NSD command line confirms that this is a -nomemcheck
    shutdown-hang NSD:
    
    "C:\Notes\nsd.exe" -dumpandkill -termstatus 1 -nomemcheck
    -shutdownhang -crashpid 1660 -crashtid 2740 -runtime 300
    
    1) The first thing I do when analyzing an NSD is check the
    server name, date and time, OS version and Notes version.
    Because this is a shutdown hang, the NSD command line in the
    header contains -nomemcheck and -shutdownhang:
    
    Host Name       : ABCD1234
    User Name       : SYSTEM
    Date            : Tue Aug 23 21:51:17 2011
    Windows Dir     : C:\WINNT
    "C:\Notes\nsd.exe" -dumpandkill -termstatus 1 -nomemcheck
    -shutdownhang -crashpid 1660 -crashtid 2740 -runtime 300
    NSD Version     : 8.5.15.0214 (Release 8.5.1FP5)
    OS Version      : Windows/2003 5.2 [64-bit] (Build 3790),
    PlatID=2, Service Pack 2 (4 Processors)
    Build time      : Thu Sep 30 03:03:05 2010
    Latest file mod : Tue Aug 03 10:57:08 2010
    Domino Version   : Release 8.5.1FP5 HF231 (64-bit server)
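
As a rough illustration of the check in step 1, the NSD command line can be split into flags programmatically. The sketch below is not part of NSD or Domino. The names parse_nsd_args and is_shutdown_hang, and the rule that a flag followed by a non-flag token takes that token as its value, are assumptions made for this example; the flag names themselves come from the command line shown above.

```python
import shlex

# Command line copied from the NSD header in this APAR (raw string keeps backslashes).
COMMAND_LINE = (
    r'"C:\Notes\nsd.exe" -dumpandkill -termstatus 1 -nomemcheck '
    r'-shutdownhang -crashpid 1660 -crashtid 2740 -runtime 300'
)


def parse_nsd_args(command_line: str) -> dict:
    """Map each flag to its value, or to True for a bare flag.

    Assumption: a flag takes a value only when the next token
    does not itself start with '-'.
    """
    # posix=False keeps Windows backslashes and quoted paths intact.
    tokens = shlex.split(command_line, posix=False)
    flags = {}
    i = 1  # skip the executable path
    while i < len(tokens):
        token = tokens[i]
        if token.startswith("-"):
            name = token.lstrip("-")
            has_value = i + 1 < len(tokens) and not tokens[i + 1].startswith("-")
            if has_value:
                flags[name] = tokens[i + 1]
                i += 2
                continue
            flags[name] = True
        i += 1
    return flags


def is_shutdown_hang(flags: dict) -> bool:
    """True when both flags seen in this shutdown-hang NSD are present."""
    return flags.get("shutdownhang") is True and flags.get("nomemcheck") is True


if __name__ == "__main__":
    parsed = parse_nsd_args(COMMAND_LINE)
    print("shutdown hang:", is_shutdown_hang(parsed))
    print("runtime (seconds):", parsed.get("runtime"))
```

In this case the -runtime value of 300 appears to correspond to the 300-second shutdown timeout set in the Server document.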
    
    2) Once the build is verified, I search for "OS Process" to
    see what is running on the server and when the OS was rebooted.
    The table also shows when the server crashed (outlined in red);
    however, this crash was generated because the server did not
    shut down within the 5 minutes set in the Server document. A
    review of the OS Process Table shows that the server is waiting
    for the nRouter process to shut down before the entire server
    can shut down successfully.
    
    <@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@>
    Section: System Data -> OS Process Table (Time 21:51:43)
    <@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@>
    
    <@@ ------ System Data -> Processes (Time 21:51:43) ------ @@>
    
      INFO  PID       PPID      UID   STIME          COMMAND
    ["C:\Notes\nservice.exe"  "=C:\Notes\notes.ini":  057c]
         -> 067c 057c     0 08/02 00:01:11
    ["C:\Notes\nSERVER.EXE" =C:\Notes\notes.ini:  067c]
         -> 0440 067c     0 08/02 00:17:52
    [C:\Notes\nRouter.EXE :  0440]
            04d0 067c     0 08/23 21:51:16
    ["C:\Notes\nsd.exe" -dumpandkill -termstatus 1 -nomemcheck
    -shutdownhang -crashpid 1660 -crashtid 2740 -runtime 300:  04d0]
    
    3) The following nSERVER thread, which includes
    nserverl.ShutdownMonitorTask, confirms that this was a shutdown
    monitor NSD:
    
    ############################################################
    ### thread 2/3: [ nSERVER:  067c:  0ab4]
    ### FP=0x6eb4ac78, PC=0x77ef02ea, SP=0x6eb4ac78
    ### stkbase=0x6eb50000, total stksize=4194304, used
    stksize=21384
    ############################################################
     [ 1] 0x77ef02ea ntdll.ZwWaitForSingleObject+10
    (12c,6eb4d9f0,6eb4dfb0,6eb4dfd0)
     [ 2] 0x77d704ff kernel32.WaitForSingleObjectEx+223
    (4e8,6eb4dfc8,0,0)
    @[ 3] 0x004adae7 nnotes.OSRunExternalScript+4151 (4,0,24982a4,3)
    @[ 4] 0x004a8091 nnotes.FRTerminateWindowsResources+2277
    (6E006F004D0020,72006F00740069,53006F0051007C,73006100540020)
    @[ 5] 0x004b0e16 nnotes.OSFaultCleanupExt+622 (0,467a26,0,0)
    @[ 6] 0x004b1829 nnotes.OSFaultCleanup+29
    (ad180004,0,EE00000000,f0102a95)
    @[ 7] 0x10006aaa nserverl.ShutdownMonitorTask+898
    (c6e4009c,0,0,1)
    @[ 8] 0x10001b21 nserverl.Scheduler+969 (0,0,0,0)
    @[ 9] 0x0044ff1e nnotes.ThreadWrapper+330 (0,0,0,6eb4ffa8)
     [10] 0x77d6b71a kernel32.BaseThreadStart+58 (0,0,0,0)
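
The check in step 3 can also be sketched in code: extract the frames from a stack excerpt and look for the shutdown monitor frame. This is a minimal sketch, not an IBM tool. The regular expression is an assumption based only on the frame layout visible in this NSD (an optional @, a bracketed index, an address, then module.function+offset), and the helper names are hypothetical.

```python
import re

# Matches lines such as "@[ 7] 0x10006aaa nserverl.ShutdownMonitorTask+898".
# Wrapped argument lines like "(c6e4009c,0,0,1)" do not match and are skipped.
FRAME_RE = re.compile(
    r"^\s*@?\[\s*(?P<index>\d+)\]\s+(?P<pc>0x[0-9a-fA-F]+)\s+"
    r"(?P<module>\w+)\.(?P<function>\w+)\+(?P<offset>\d+)"
)

# Excerpt of the nSERVER thread shown above.
SAMPLE_STACK = """\
 [ 1] 0x77ef02ea ntdll.ZwWaitForSingleObject+10
(12c,6eb4d9f0,6eb4dfb0,6eb4dfd0)
@[ 5] 0x004b0e16 nnotes.OSFaultCleanupExt+622 (0,467a26,0,0)
@[ 7] 0x10006aaa nserverl.ShutdownMonitorTask+898
(c6e4009c,0,0,1)
 [10] 0x77d6b71a kernel32.BaseThreadStart+58 (0,0,0,0)
"""


def parse_frames(stack_text: str) -> list:
    """Return (index, module, function) for every frame line found."""
    frames = []
    for line in stack_text.splitlines():
        match = FRAME_RE.match(line)
        if match:
            frames.append(
                (int(match["index"]), match["module"], match["function"])
            )
    return frames


def is_shutdown_monitor_stack(frames: list) -> bool:
    """True when the stack contains nserverl.ShutdownMonitorTask."""
    return any(
        module == "nserverl" and function == "ShutdownMonitorTask"
        for _, module, function in frames
    )


if __name__ == "__main__":
    print(is_shutdown_monitor_stack(parse_frames(SAMPLE_STACK)))
```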
    
    4) The shutdown monitor is described in the following technote:
    
    Lotus Software Knowledge Base Document
    
    Title: How does the Shutdown Monitor work in Domino?
    Doc #: 1236058
    URL: http://www.ibm.com/support/docview.wss?uid=swg21236058
    
    5) However, the long shutdown time was caused by the nRouter
    process. A review of the stack below points to the known issue
    outlined in SPR# CMAS7VQHK4, which is addressed in the 8.5.2
    codestream:
    
    ############################################################
    ### thread 1/2: [ nRouter:  0440:  0c60]
    ### FP=0x00128528, PC=0x77ef030a, SP=0x00128528
    ### stkbase=0x00130000, total stksize=81920, used stksize=31448
    ############################################################
     [ 1] 0x77ef030a ntdll.ZwReadFile+10
    (6A300062109,2d8000ab,ffffffff,10052e60)
     [ 2] 0x77d6e4a6 kernel32.ReadFile+182 (128608,129394,42,128608)
    @[ 3] 0x1004a7ff nnotes.OSFDFileRead+47 (0,7b7ec1a8,42,1004a7ff)
    @[ 4] 0x116bf59b nnotes.FileReadOSFD+439 (4,0,128a7e48,128a7d80)
    @[ 5] 0x116c12dd nnotes.NSFFileRead+37 (0,ffffffff,0,12c850)
    @[ 6] 0x119817ff nnotes.ReadBDB+775
    (538d5020,54009,7b7ec1a8,12c850)
    @[ 7] 0x1197fde1 nnotes.DbBDBRead+1057 (12b950,54009,6,7b7ec1a8)
    @[ 8] 0x11919cbc nnotes.DbLoad+1976
    (6f809,2cbc009,12cb04,12cb04)
    @[ 9] 0x116df1f7 nnotes.NSFDbOpenExtended4+24463
    (2CBC00900000000,10052e60,5f640618,0)
    @[10] 0x116d82ab nnotes.NSFDbOpenExtended+127
    (f0247c96,1004dcfc,12fd1020,41c0018)
    @[11] 0x1171f81a nnotes.NSFDbOpen+34 (0,11219348,0,fcb0038)
    @[12] 0x11d89970 nnotes.MailDeleteDeliveryContext+52
    (0,455146,12d9ec,12de48)
    @[13] 0x004398e5 nRouter.RouterDbCacheTerm+117
    (e6a00003,29,3330,158)
    @[14] 0x0040214c nRouter.AddInMain+4428 (1,10125a1d,0,0)
    @[15] 0x0047616e nRouter.NotesMain+74 (40fac80,0,0,40b0000)
    @[16] 0x00476275 nRouter.main+245 (2,0,0,0)
    @[17] 0x00484a18 nRouter.mainCRTStartup+568 (0,0,0,0)
     [18] 0x77d596ac kernel32.BaseProcessStart+44 (4847e0,0,0,0)
    
    In the end, what is happening is that the router is
    de-registering the NSF Monitors that were registered for the
    mail files to which the router has delivered mail. These
    monitors are the basis of the user mail rules that exist in each
    mail file. This behavior is unchanged since the initial support
    for mail rules, but we have found that the de-registration can
    take some time if a large number of databases contain rules, as
    each database needs to be re-opened and closed to perform the
    de-registration. This is why the server takes a long time to
    shut down, and the crash occurs because the nRouter process has
    not completely shut down within the timeout.
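
The effect of this per-database work on shutdown time can be estimated with simple arithmetic. The sketch below assumes that de-registration runs sequentially, one database at a time, and uses purely hypothetical figures (2000 mail files with rules, 200 ms per open/close cycle). Only the 300-second timeout comes from this case.

```python
# 5 minutes, the shutdown timeout set in the Server document in this case.
SHUTDOWN_TIMEOUT_MS = 300 * 1000


def deregistration_time_ms(db_count: int, ms_per_db: int) -> int:
    """Total time if each database is re-opened and closed in turn."""
    return db_count * ms_per_db


def max_dbs_within_timeout(ms_per_db: int,
                           timeout_ms: int = SHUTDOWN_TIMEOUT_MS) -> int:
    """Largest number of databases that can be processed before the timeout."""
    return timeout_ms // ms_per_db


if __name__ == "__main__":
    # Hypothetical workload, chosen only to illustrate the arithmetic.
    total_ms = deregistration_time_ms(db_count=2000, ms_per_db=200)
    print("estimated de-registration time (ms):", total_ms)
    print("exceeds timeout:", total_ms > SHUTDOWN_TIMEOUT_MS)
    print("databases that fit in the timeout:", max_dbs_within_timeout(200))
```

With these assumed figures the router would need 400 seconds, which is longer than the 300-second timeout, so the shutdown monitor would fire before nRouter finished.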
    

Local fix

  • NA
    

Problem summary

  • A programming error was found and will be corrected in a future
     release.
    

Problem conclusion

  • A programming error was found and will be corrected in a future
     release.
    

Temporary fix

Comments

  • This APAR is associated with SPR# CMAS7VQHK4.
    

APAR Information

  • APAR number

    LO63255

  • Reported component name

    DOMINO SERVER

  • Reported component ID

    5724E6200

  • Reported release

    851

  • Status

    CLOSED PER

  • PE

    NoPE

  • HIPER

    NoHIPER

  • Special Attention

    NoSpecatt

  • Submitted date

    2011-08-24

  • Closed date

    2011-09-02

  • Last modified date

    2011-09-02

  • APAR is sysrouted FROM one or more of the following:

  • APAR is sysrouted TO one or more of the following:

Fix information

  • Fixed component name

    DOMINO SERVER

  • Fixed component ID

    5724E6200

Applicable component levels

  • R851 PSN

       UP

Document Information

Modified date:
02 September 2011