Db2 hangs during distributed processing
If a user or application "hangs" during distributed processing, the condition that causes the problem can be at the requesting (local) system, the serving (remote) system, or in the network (VTAM®, TCP/IP, CTC, or NCP).
At a Db2 system, the command DISPLAY THREAD DETAIL can be used to get information about distributed threads. This command should be issued at both requesting and serving locations to get the status of the distributed thread.
The command -DISPLAY THREAD (*) TYPE(SYSTEM) can be used to identify the tokens of Db2 commands and system agents that can be canceled. After the tokens are identified, you can use the -CANCEL THREAD (token) command to cancel Db2 commands and system agents that are in progress.
Canceling a distributed thread that is hung in VTAM
When your network connection is through VTAM, a distributed thread can be:
- Active in VTAM
- Waiting in Db2 for VTAM notification that a particular event is completed
- Not in VTAM, not in Db2 (for example, waiting for user input).
If the distributed thread is not active in VTAM, it can be canceled by using the CANCEL DDF THREAD command. A distributed thread should be canceled at the requesting location. In this case, both the allied thread (at the requesting location) and the database access threads (at the serving location) are terminated with an SVC dump. If the distributed thread is canceled at the serving site, only the database access thread is terminated with a dump.
If a distributed thread is hung in VTAM, the following VTAM command can be used to terminate the session:
V NET,TERM,SID=SESSION-ID
To terminate a session, the session identifier must be known. The following procedure can be used to find the session identifier and terminate the thread:
- Use the Db2 command DISPLAY THREAD LOCATION DETAIL to identify the hung thread and get its session identifier (field SESSID).
In the following example, the hung thread belongs to authorization ID SYSOPR and uses Db2 plan CAN1. The session identifier is 'F0EF951D7B824660':
Figure 1. Session identifier
?DIS THD(*) LOC(*) DET
DSNV401I ? DISPLAY THREAD REPORT FOLLOWS -
DSNV402I ? ACTIVE THREADS - 056
NAME     ST A   REQ ID           AUTHID   PLAN     ASID TOKEN
TSO      TR     178 SYSOPR       SYSOPR   CAN1     0012     3
-IMSNET.LUDBD2.A0FFF131B239=3 ACCESSING DATA AT
-SYDNEY
--LOCATION       SESSID           A ST TIME
--SYDNEY         F0EF951D7B824660   S  8927509332750
DISPLAY ACTIVE REPORT COMPLETE
DSN9022I ? DSNVDT '?DIS THD' NORMAL COMPLETION
- Use the VTAM command D NET,ID=luname,SCOPE=ALL:
Take the last 7 bytes of the session identifier that is provided by the DISPLAY THREAD DETAIL command from step 1 to correlate the VTAM session identifier (field SID). In the following figure, the VTAM session identifier is 'E2EF951D7B824660'.
Figure 2. Session identifier
D NET,ID=LUDBD2,SCOPE=ALL
IST097I DISPLAY ACCEPTED
IST075I NAME = LUDBD2, TYPE = APPL
IST486I STATUS= ACTIV, DESIRED STATE= ACTIV
IST861I MODETAB=DB2MODES USSTAB=***NA*** LOGTAB=***NA***
IST934I DLOGMOD=***NA***
IST597I CAPABILITY-PLU ENABLED ,SLU ENABLED ,SESSION LIMIT NONE
IST654I I/O TRACE = OFF, BUFFER TRACE = OFF
IST271I JOBNAME = DBD2DIST, STEPNAME = DBD2DIST
IST171I ACTIVE SESSIONS = 0000000004, SESSION REQUESTS = 0000000000
IST206I SESSIONS:
IST634I NAME     STATUS   SID              SEND RECV VR TP NETID
IST635I LUDBD1   ACTIV-S  E2EF951D7B824660 000A 0010  0  0 IMSNET
IST635I LUDBD1   ACTIV-S  E2EF951D7B82465F 001D 0000  0  0 IMSNET
IST635I LUDBD1   ACTIV-P  E3EF951D7C824D69 0000 000F  0  1 IMSNET
IST635I LUDBD1   ACTIV-P  E3EF951D7C824D68 0002 0002  0  1 IMSNET
IST314I END
- Use the session identifier that is provided by the VTAM DISPLAY command from step 2 to terminate the session by way of the VTAM TERMINATE command V NET,TERM,SID=session-id.
Figure 3. Session identifier
V NET,TERM,SID=E2EF951D7B824660
IST097I VARY ACCEPTED
MSG0: PLEASE STAND BY .....
IST455I SID=E2EF951D7B824660 SESSIONS ENDED
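The correlation in step 2 is easy to get wrong by eye when several sessions share most of their digits, so it can help to script it. The following is a minimal Python sketch, not an IBM-supplied tool. It assumes that the VTAM display output has already been captured as text and that session rows follow the IST635I layout shown in Figure 2. The function names find_vtam_sid and terminate_command are hypothetical.

```python
import re


def find_vtam_sid(db2_sessid: str, vtam_display: str) -> list[str]:
    """Return the VTAM SIDs whose last 7 bytes match the Db2 SESSID."""
    # 7 bytes correspond to the last 14 hexadecimal characters.
    suffix = db2_sessid.upper()[-14:]
    # Assumed row layout: IST635I <partner LU> <status> <16-hex-digit SID> ...
    sids = re.findall(r"IST635I\s+\S+\s+\S+\s+([0-9A-F]{16})\b", vtam_display)
    return [sid for sid in sids if sid.endswith(suffix)]


def terminate_command(sid: str) -> str:
    """Build the VTAM command text from step 3 for one session."""
    return f"V NET,TERM,SID={sid}"


# Session rows copied from Figure 2.
sample = """\
IST634I NAME STATUS SID SEND RECV VR TP NETID
IST635I LUDBD1 ACTIV-S E2EF951D7B824660 000A 0010 0 0 IMSNET
IST635I LUDBD1 ACTIV-S E2EF951D7B82465F 001D 0000 0 0 IMSNET
IST635I LUDBD1 ACTIV-P E3EF951D7C824D69 0000 000F 0 1 IMSNET
IST635I LUDBD1 ACTIV-P E3EF951D7C824D68 0002 0002 0 1 IMSNET
IST314I END
"""

if __name__ == "__main__":
    for sid in find_vtam_sid("F0EF951D7B824660", sample):
        print(terminate_command(sid))
```

With the SESSID from Figure 1, only the first session row matches, which yields the command shown in Figure 3. The sketch only builds the command text; issuing it remains an operator action.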
After the session is terminated, APPC primary/secondary return codes of RCPRI=0048, RCSEC=0000 are returned to Db2. This combination of RCPRI and RCSEC indicates that the conversation was terminated because the session, which was used by the conversation, was terminated. This combination of RCPRI/RCSEC is called "Resource failure, no retry". At the remote site (subsystem recognition character "!"), APPC primary/secondary return codes of RCPRI=004C, RCSEC=0000 are returned to Db2. This combination is called "Resource failure, retry".
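When these return codes are checked in captured messages, a small lookup keeps the two combinations apart. The following Python sketch covers only the two pairs described in this topic. The table name, the function name describe_appc_rc, and the fallback text are hypothetical, and other RCPRI/RCSEC values are deliberately left unmapped.

```python
# Only the two combinations described in this topic are listed here.
APPC_RESOURCE_FAILURES = {
    ("0048", "0000"): "Resource failure, no retry",
    ("004C", "0000"): "Resource failure, retry",
}


def describe_appc_rc(rcpri: str, rcsec: str) -> str:
    """Map an RCPRI/RCSEC pair (hexadecimal strings) to its name."""
    # Normalize case and pad to four hexadecimal digits before the lookup.
    key = (rcpri.upper().zfill(4), rcsec.upper().zfill(4))
    return APPC_RESOURCE_FAILURES.get(key, "not covered in this topic")


if __name__ == "__main__":
    print(describe_appc_rc("0048", "0000"))  # requesting site after V NET,TERM
    print(describe_appc_rc("004C", "0000"))  # remote site
```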
Canceling a distributed thread with a TCP/IP connection
When your network connection is through TCP/IP, a distributed thread can be:
- Active in TCP/IP
- Waiting in Db2 for TCP/IP notification that a particular event is completed
- Not in TCP/IP, not in Db2 (for example, waiting for user input)
A distributed thread that uses a TCP/IP connection can be canceled by using the Db2 command CANCEL DDF THREAD. Use the Db2 command DISPLAY THREAD DETAIL to identify the hung thread. A distributed thread should be canceled at the requesting location. In this case, both the allied thread (at the requesting location) and the database access threads (at the serving location) are terminated with an SVC dump. If the distributed thread is canceled at the serving site, only the database access thread is terminated with a dump.