Db2 hangs during distributed processing

If a user or application "hangs" during distributed processing, the condition that causes the problem can be at the requesting (local) system, the serving (remote) system, or in the network (VTAM®, TCP/IP, CTC, or NCP).

At a Db2 system, the command DISPLAY THREAD DETAIL can be used to get information about distributed threads. Issue this command at both the requesting and the serving locations to get the full status of the distributed thread.

The command -DISPLAY THREAD (*) TYPE(SYSTEM) can be used to identify the tokens of database commands and system agents that can be canceled. After the tokens are identified, you can use the -CANCEL THREAD (token) command to cancel database commands and system agents that are in progress.

Canceling a distributed thread that is hung in VTAM

When your network connection is through VTAM, a distributed thread can be:
  • Active in VTAM
  • Waiting in Db2 for VTAM notification that a particular event is completed
  • Not in VTAM, not in Db2 (for example, waiting for user input)

If the distributed thread is not active in VTAM, it can be canceled by using the CANCEL DDF THREAD command. A distributed thread should be canceled at the requesting location. In that case, both the allied thread (at the requesting location) and the database access threads (at the serving location) are terminated with an SVC dump. If the distributed thread is instead canceled at the serving location, only the database access thread is terminated with a dump.

If a distributed thread is hung in VTAM, the following VTAM command can be used to terminate the session:

           V NET,TERM,SID=SESSION-ID

To terminate a session, the session identifier must be known. The following procedure can be used to find the session identifier and terminate the thread:

  1. Use the Db2 command DISPLAY THREAD LOCATION DETAIL to identify the hung thread. Get the session identifier (field SESSID).

    In the following example, the hung thread belongs to authorization ID SYSOPR and is running Db2 plan CAN1. The session identifier is 'F0EF951D7B824660':

    Figure 1. Session identifier
    ?DIS THD(*) LOC(*) DET
     DSNV401I ? DISPLAY THREAD REPORT FOLLOWS -
     DSNV402I ? ACTIVE THREADS - 056
     NAME     ST A   REQ ID           AUTHID   PLAN     ASID    TOKEN
     TSO      TR     178 SYSOPR       SYSOPR   CAN1     0012        3
      -IMSNET.LUDBD2.A0FFF131B239=3 ACCESSING DATA AT
      -SYDNEY
      --LOCATION         SESSID           A ST TIME
      --SYDNEY           F0EF951D7B824660   S  8927509332750
     DISPLAY ACTIVE REPORT COMPLETE
     DSN9022I ? DSNVDT '?DIS THD' NORMAL COMPLETION
  2. Use the VTAM command D NET,ID=luname,SCOPE=ALL to display the sessions for the LU. To find the VTAM session identifier (field SID), take the last 7 bytes (14 hexadecimal digits) of the session identifier that is provided by the DISPLAY THREAD LOCATION DETAIL command from step 1 and match them against the SID values in the output. In the following figure, the VTAM session identifier is 'E2EF951D7B824660'.
    Figure 2. Session identifier
    D NET,ID=LUDBD2,SCOPE=ALL
     IST097I DISPLAY ACCEPTED
     IST075I NAME = LUDBD2, TYPE = APPL
     IST486I STATUS= ACTIV, DESIRED STATE= ACTIV
     IST861I MODETAB=DB2MODES USSTAB=***NA*** LOGTAB=***NA***
     IST934I DLOGMOD=***NA***
     IST597I CAPABILITY-PLU ENABLED  ,SLU ENABLED  ,SESSION LIMIT NONE
     IST654I I/O TRACE = OFF, BUFFER TRACE = OFF
     IST271I JOBNAME = DBD2DIST, STEPNAME = DBD2DIST
     IST171I ACTIVE SESSIONS = 0000000004, SESSION REQUESTS = 0000000000
     IST206I SESSIONS:
     IST634I NAME     STATUS         SID          SEND RECV VR TP NETID
     IST635I LUDBD1   ACTIV-S    E2EF951D7B824660 000A 0010  0  0 IMSNET
     IST635I LUDBD1   ACTIV-S    E2EF951D7B82465F 001D 0000  0  0 IMSNET
     IST635I LUDBD1   ACTIV-P    E3EF951D7C824D69 0000 000F  0  1 IMSNET
     IST635I LUDBD1   ACTIV-P    E3EF951D7C824D68 0002 0002  0  1 IMSNET
     IST314I END
  3. Use the session identifier that is provided by the VTAM DISPLAY command from step 2 to terminate the session with the VTAM TERMINATE command V NET,TERM,SID=session-id.
    Figure 3. Session identifier
    V NET,TERM,SID=E2EF951D7B824660
     IST097I VARY ACCEPTED
     MSG0: PLEASE STAND BY .....
     IST455I SID=E2EF951D7B824660 SESSIONS ENDED
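
The correlation in step 2 can be expressed as a short script. The following Python sketch is illustrative only and is not part of Db2 or VTAM. The function names find_vtam_sid and build_term_command are hypothetical, and the sketch assumes that the SID values have already been copied by hand from the IST635I lines of the D NET output. It compares the last 7 bytes (14 hexadecimal digits) of the Db2 SESSID with each VTAM SID and builds the text of the V NET,TERM command for the matching session.

```python
from typing import List, Optional


def find_vtam_sid(db2_sessid: str, vtam_sids: List[str]) -> Optional[str]:
    """Return the VTAM SID whose last 7 bytes match the Db2 SESSID.

    Returns None when no SID matches or when more than one SID matches,
    so that an ambiguous result never leads to terminating a session.
    """
    # 7 bytes correspond to 14 hexadecimal digits.
    suffix = db2_sessid.strip().upper()[-14:]
    matches = [sid for sid in vtam_sids if sid.strip().upper().endswith(suffix)]
    return matches[0].strip().upper() if len(matches) == 1 else None


def build_term_command(sid: str) -> str:
    """Build the text of the VTAM command that terminates the session."""
    return "V NET,TERM,SID=" + sid


# Values taken from Figure 1 (SESSID) and Figure 2 (SID column).
sessid = "F0EF951D7B824660"
sids = [
    "E2EF951D7B824660",
    "E2EF951D7B82465F",
    "E3EF951D7C824D69",
    "E3EF951D7C824D68",
]

sid = find_vtam_sid(sessid, sids)
if sid is not None:
    print(build_term_command(sid))
```

The sketch only prints the command text. Issuing the command remains a manual operator action, and a result of None means that the session identifier should be rechecked before anything is terminated.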

After the session is terminated, APPC primary/secondary return codes of RCPRI=0048, RCSEC=0000 are returned to Db2. This combination, which is called "Resource failure, no retry", indicates that the conversation was terminated because the session that it was using was terminated. At the remote site (subsystem recognition character "!"), APPC primary/secondary return codes of RCPRI=004C, RCSEC=0000 are returned to Db2. This combination is called "Resource failure, retry".
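
When the return codes are checked in a script, for example while reviewing collected diagnostics, the two combinations that are described in this topic can be kept in a small lookup table. The following Python sketch is an assumption for illustration: the table name, the function name describe_return_codes, and the fallback text are hypothetical, and only the two combinations that are named in this topic are included.

```python
# RCPRI/RCSEC combinations described in this topic, keyed as hex strings.
APPC_RESOURCE_FAILURES = {
    ("0048", "0000"): "Resource failure, no retry",
    ("004C", "0000"): "Resource failure, retry",
}


def describe_return_codes(rcpri: str, rcsec: str) -> str:
    """Return the name of an RCPRI/RCSEC combination, if it is listed here."""
    # Normalize case and pad to four hexadecimal digits before the lookup.
    key = (rcpri.strip().upper().zfill(4), rcsec.strip().upper().zfill(4))
    return APPC_RESOURCE_FAILURES.get(key, "not described in this topic")


print(describe_return_codes("0048", "0000"))
print(describe_return_codes("004C", "0000"))
```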

Canceling a distributed thread with a TCP/IP connection

When your network connection is through TCP/IP, a distributed thread can be:
  • Active in TCP/IP
  • Waiting in Db2 for TCP/IP notification that a particular event is completed
  • Not in TCP/IP, not in Db2 (for example, waiting for user input)

A distributed thread that uses a TCP/IP connection can be canceled by using the Db2 command CANCEL DDF THREAD. First, use the Db2 command DISPLAY THREAD DETAIL to identify the hung thread. A distributed thread should be canceled at the requesting location. In that case, both the allied thread (at the requesting location) and the database access threads (at the serving location) are terminated with an SVC dump. If the distributed thread is instead canceled at the serving location, only the database access thread is terminated with a dump.