APAR status
Closed as duplicate of another APAR.
Error description
Problem Description: TEPS loses communication with the HUB TEMS. Timeouts of 900 seconds or longer occur, as seen in the RAS1 log:
+4627F7A4.0000 extend: 372 duration: 931 state: 1
These RPCs are executing the CTDS_DestroyRequest() RPC, Interface: UUID 684152a852f9.02.c6.d2.2d.fd.00.00.00, opnum: 7. State capture reveals threads blocked on fflush.

Detailed Recreation Procedure: Very difficult to reproduce; this is timing and load dependent.

Related Files and Output: The RAS1 log will show the SAR state summary with an extended duration of 900+ seconds and a UUID / opnum indicating CTDS_DestroyRequest().

Logs provided:
============================================
During our daily call we saw the TEPS disconnect, and then we could not connect back, but the issue had occurred much earlier. The problem was at the Hub_TEMS, which was stuck in a loop. TEPS logs indicated an rpc_sar error:

(2007.109 12.23.37-38:kdcc1sr.c,1000,"rpc__sar") Remote call failure: 1C010001
+2007.109 12.23.37 activity: c3e147f8920f.42.53.00.32.41.81.27.17 started: 4627BE17
+2007.109 12.23.37 interface: 684152a852f9.02.c6.d2.2d.fd.00.00.00 version: 121
+2007.109 12.23.37 object: 000000000000.00.00.00.00.00.00.00.00 opnum: 7
+2007.109 12.23.37 srvr-boot: 4624E186 length: 16 a/i-hints: F923/0005
+2007.109 12.23.37 interval: 15 pkts-in: 30 retries: 0
+2007.109 12.23.37 pings: 31 no-calls: 0 working: 30
+2007.109 12.23.37 facks: 0 mtu: 944 sequence: 2
+2007.109 12.23.37 b-size: 32 b-fail: 0 b-hist: 0
+2007.109 12.23.37 nextfrag: 0 fragnum: 0 timeouts: 31
+2007.109 12.23.37 idem: false maybe: false large: false
+2007.109 12.23.37 callback: false snd-frags: false rcv-frags: false
+2007.109 12.23.37 extend: 372 duration: 930 state: 1
+2007.109 12.23.37 bld-date: Mar 9 2007 bld-time: 18:12:12 revision: 1.1.1.2
+2007.109 12.23.37 bsn: 3718534 bsq: 4 driver: d7068a
+2007.109 12.23.37 short: 10 contact: 60 reply: 240
+2007.109 12.23.37 req-int: 30 frag-int: 30 ping-int: 15
+2007.109 12.23.37 limit: 900 work-allow: 30
+2007.109 12.23.37 loc-endpt: ip.spipe:#*[7759]
+2007.109 12.23.37 rmt-endpt: ip.spipe:#129.39.23.80[3660]
(2007.109 12.23.37-38:kdsnccns.c,260,"NCSErrorMessage") CT/DS RPC Error: DSR010 - CTDS_DestroyRequest RPC abend
(2007.109 12.23.37-38:kdsnccns.c,59,"ConvertNCSStatus") NCS Status Code: 1c010001
(2007.109 12.23.37-38:kdsnccns.c,209,"ConvertNCSStatus") CT/DS status returned: 155

This indicates a communication failure between the Hub and the TEPS. We recycled the TEPS, but it failed again with the same error, indicating that the TEMS was not accepting new responses. Here are two consecutive RPC errors after 932 seconds:

(4627F7A4.0000-1:kdcc1sr.c,1000,"rpc__sar") Remote call failure: 1C010001
+4627F7A4.0000 activity: c3e215e6c4a5.42.53.00.33.41.81.27.17 started: 4627F401
+4627F7A4.0000 interface: 684152a852f9.02.c6.d2.2d.fd.00.00.00 version: 121
+4627F7A4.0000 object: 000000000000.00.00.00.00.00.00.00.00 opnum: 7
+4627F7A4.0000 srvr-boot: 4624E186 length: 16 a/i-hints: F8BD/0005
+4627F7A4.0000 interval: 15 pkts-in: 30 retries: 0
+4627F7A4.0000 pings: 31 no-calls: 0 working: 30
+4627F7A4.0000 facks: 0 mtu: 944 sequence: 4
+4627F7A4.0000 b-size: 32 b-fail: 0 b-hist: 0
+4627F7A4.0000 nextfrag: 0 fragnum: 0 timeouts: 31
+4627F7A4.0000 idem: false maybe: false large: false
+4627F7A4.0000 callback: false snd-frags: false rcv-frags: false
+4627F7A4.0000 extend: 372 duration: 931 state: 1
+4627F7A4.0000 bld-date: Mar 9 2007 bld-time: 18:12:12 revision: 1.1.1.2
+4627F7A4.0000 bsn: 3718534 bsq: 4 driver: d7068a
+4627F7A4.0000 short: 10 contact: 60 reply: 240
+4627F7A4.0000 req-int: 30 frag-int: 30 ping-int: 15
+4627F7A4.0000 limit: 900 work-allow: 30
+4627F7A4.0000 loc-endpt: ip.spipe:#*[7758]
+4627F7A4.0000 rmt-endpt: ip.spipe:#129.39.23.80[3660]
(4627F7A4.0001-1:kdsnccns.c,260,"NCSErrorMessage") CT/DS RPC Error: DSR010 - CTDS_DestroyRequest RPC abend
(4627F7A4.0002-1:kdsnccns.c,59,"ConvertNCSStatus") NCS Status Code: 1c010001

(462800CC.0000-6:kdcc1sr.c,1000,"rpc__sar") Remote call failure: 1C010001
+462800CC.0000 activity: c3e216897f00.42.53.00.33.41.81.27.17 started: 4627FF64
+462800CC.0000 interface: 684152a852f9.02.c6.d2.2d.fd.00.00.00 version: 121
+462800CC.0000 object: 000000000000.00.00.00.00.00.00.00.00 opnum: 3
+462800CC.0000 srvr-boot: 4624E186 length: 364 a/i-hints: FFFF/FFFF
+462800CC.0000 interval: 15 pkts-in: 11 retries: 11
+462800CC.0000 pings: 12 no-calls: 11 working: 0
+462800CC.0000 facks: 0 mtu: 944 sequence: 13
+462800CC.0000 b-size: 32 b-fail: 0 b-hist: 0
+462800CC.0000 nextfrag: 0 fragnum: 0 timeouts: 12
+462800CC.0000 idem: false maybe: false large: false
+462800CC.0000 callback: false snd-frags: false rcv-frags: false
+462800CC.0000 extend: 0 duration: 360 state: 1
+462800CC.0000 bld-date: Mar 9 2007 bld-time: 18:12:12 revision: 1.1.1.2
+462800CC.0000 bsn: 3718534 bsq: 4 driver: d7068a
+462800CC.0000 short: 10 contact: 60 reply: 240
+462800CC.0000 req-int: 30 frag-int: 30 ping-int: 15
+462800CC.0000 limit: 900 work-allow: 30
+462800CC.0000 loc-endpt: ip.spipe:#*[7759]
+462800CC.0000 rmt-endpt: ip.spipe:#129.39.23.80[3660]
(462800CC.0001-6:kdsnccns.c,260,"NCSErrorMessage") CT/DS RPC Error: DSR034 - CTDS_CreateRequest RPC abend
(462800CC.0002-6:kdsnccns.c,59,"ConvertNCSStatus") NCS Status Code: 1c010001
(462800CC.0003-6:kdsnccns.c,209,"ConvertNCSStatus") CT/DS status returned: 155

We ran gencore on the kdsmain process on the Hub and analyzed the core with dbx. We found thread 130 in a hung state:

(dbx) where
write.write(??, ??, ??) at 0xd033b988
flsbuf._xwrite(??, ??, ??, ??) at 0xd0332cd8
flsbuf._xflsbuf(??) at 0xd0332c10
flsbuf.fflush_unlocked(??) at 0xd0332478
flsbuf.fflush(??) at 0xd03330ac
KBBRAFH(0x33e47278, 0x43431cad, 0x2) at 0x2018c244
KBBSS_FlushBuffer(0x43431ba8, 0x1) at 0x20182a74
BSS1_EndFormat(0x43431ba8) at 0x201890ac
RAS1_Format(0x305229b0, 0x283, 0x3049c810, 0x409c0d84) at 0x2017e840
RAS1_Printf(0x305229b0, 0x283, 0x3049c810, 0x4c0cfec4, 0x14c, 0x32cc8948, 0x101dc0b1, 0x43477ce0) at 0x2017e7a4
Index_GetNextLocate(0x409c0fd8, 0x409c0fcc, 0x409c0fd0) at 0x3047fe20
KFAUS_PositionEqualFromBeginning(0x409c0fd8, 0x409c11e8, 0x64, 0x409c0fcc, 0x409c0fd0) at 0x3047dfa8
KFAUS_RetrieveUserIndexEntries(0x409c14d8, 0x154, 0x409c1378, 0x409c1348, 0x304a785c, 0x1, 0x1, 0x409c11e8) at 0x3047c8c0
kfastsal(0x4b6a3b58, 0x4b657d77, 0x4ae47c98, 0x0, 0x0, 0x0) at 0x30492f20
kfastpst(0x4aa78c18, 0x4b657d77, 0x4ae47c98, 0x4ae47c68, 0x55000055) at 0x30490d98
kfastalr(0x4aa78c18, 0x4b640ad4, 0x4, 0x4aa78e90, 0x20, 0x0, 0x4b657d77, 0x0) at 0x30492658
kfaatloc.Process(0xae900000, 0x4b203868) at 0x3044b660
VST11PT(0xae810000, 0x4b18b9e8) at 0x1009db40
VVW11_ManageView(0x4bd90978) at 0x10035ef4
ThreadManager(0x38c44e98) at 0x2007fcdc
(dbx) 0x4b18b9e8 / 100 c
0x4b18b9e8: 'V' 'S' 'T' 'O' 'ᄅ' '' '\0' '\0' 'K' '^X' '"' '^T' '0' 'T' '^U' '￐'
0x4b18b9f8: 'K' '^X' '' 'ハ' 'K' '^X' '"' 'D' '\0' '\0' '\0' '\0' '\0' '\0' '\0' '\0'
0x4b18ba08: 'ハ' '^F' '' '\0' 'Q' 'A' '1' 'C' 'S' 'I' 'T' 'F' ' ' ' ' ' ' '\0'
0x4b18ba18: '\0' '\0' '\0' '\0' '\0' 'T' 'S' 'I' 'T' 'D' 'E' 'S' 'C' '\0' '\0' '\0'
0x4b18ba28: 'O' '4' 'S' 'R' 'V' '\0' '\0' '\0' '\0' '\0' '\0' '\0' '\0' '\0' '\0' '\0'
0x4b18ba38: '\0' '\0' '\0' '\0' '\0' '\0' '\0' '\0' '\0' '\0' '\0' '\0' '\0' '\0' '\0' '\0'
0x4b18ba48: ' ' '￘' '\0' '\0'
==================================
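The signature described above (an rpc__sar state summary with opnum 7 and a duration of 900 seconds or more) can be checked mechanically. The following is a minimal sketch, not an IBM-provided tool: the function names, the regular expression, and the default thresholds are assumptions made for illustration. It also assumes that the hexadecimal line prefix and the "started" value are epoch seconds; that is an inference from these excerpts, where their difference matches the reported duration, and not a documented format.

```python
import re

# Matches "key: value" pairs such as "duration: 931" or "opnum: 7".
# Keys may contain hyphens or slashes (for example "pkts-in", "a/i-hints").
FIELD_RE = re.compile(r"([A-Za-z][\w/-]*):\s+(\S+)")


def parse_sar_fields(block):
    """Collect the key/value pairs of one rpc__sar summary into a dict."""
    return dict(FIELD_RE.findall(block))


def is_stalled_destroy_request(fields, min_duration=900, opnum=7):
    """True when the summary has the given opnum and a long duration.

    opnum 7 is the value this APAR associates with CTDS_DestroyRequest().
    """
    return (int(fields.get("opnum", -1)) == opnum
            and int(fields.get("duration", 0)) >= min_duration)


def elapsed_seconds(stamp_hex, started_hex):
    """Difference between two hex timestamps (assumed epoch seconds)."""
    return int(stamp_hex, 16) - int(started_hex, 16)


# Three lines taken from the second error block in this APAR.
sample = (
    "+4627F7A4.0000 activity: c3e215e6c4a5.42.53.00.33.41.81.27.17 started: 4627F401\n"
    "+4627F7A4.0000 object: 000000000000.00.00.00.00.00.00.00.00 opnum: 7\n"
    "+4627F7A4.0000 extend: 372 duration: 931 state: 1\n"
)
fields = parse_sar_fields(sample)
print(is_stalled_destroy_request(fields))              # True
print(elapsed_seconds("4627F7A4", fields["started"]))  # 931
```

The same check returns False for the third error block (opnum 3, duration 360), which is the CTDS_CreateRequest failure rather than the stalled CTDS_DestroyRequest call.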
Local fix
No workaround available.
Problem summary
Problem conclusion
Temporary fix
Comments
This APAR is a duplicate of IY93582.
APAR Information
APAR number
IY97698
Reported component name
TEMS
Reported component ID
5724C04MS
Reported release
610
Status
CLOSED DUB
PE
NoPE
HIPER
NoHIPER
Special Attention
NoSpecatt
Submitted date
2007-04-23
Closed date
2007-10-27
Last modified date
2007-10-27
APAR is sysrouted FROM one or more of the following:
APAR is sysrouted TO one or more of the following:
Fix information
Applicable component levels
[{"Business Unit":{"code":"BU054","label":"Systems w\/TPS"},"Product":{"code":"SSCTLMP","label":"ITM Tivoli Enterprise Mgmt Server V6"},"Component":"","ARM Category":[],"Platform":[{"code":"PF025","label":"Platform Independent"}],"Version":"610","Edition":"","Line of Business":{"code":"","label":""}}]
Document Information
Modified date:
27 October 2007