In each column, The Support Authority discusses resources, tools, and other elements of IBM® Technical Support that are available for WebSphere® products, plus techniques and new ideas that can further enhance your IBM support experience.
As always, we begin with some new items of interest for the WebSphere community at large:
- Are you ready for IMPACT 2010? Register early and save on registration fees and hotel accommodations. IMPACT 2010 is the premier conference for business and IT leaders. Join us in Las Vegas, May 2 through 7, and learn to work smarter from the most experienced business and technology leaders in the world.
- Have you tried the IBM Support Portal yet? All IBM software products are now included, and all software product support pages have been replaced by IBM Support Portal. See the Support Authority's Introduction to the new IBM Support Portal for details. Be sure to let us know what you think by sending your comments and suggestions to spe@us.ibm.com.
- Catch the replays of the January Electronic Support Webcast series at the Global WebSphere Community at websphere.org.
- There are several exciting webcasts planned in February at the WebSphere Technical Exchange. Check the site for details and become a fan on Facebook!
Continue to monitor the various support-related Web sites, as well as this column, for news about other tools as we encounter them.
And now, on to our main topic...
Developing problem determination expertise in a Java™ EE environment can take years of real-world troubleshooting experience, even if you’re highly skilled in the technology. Knowledge is, of course, necessary, but problem determination skills grow with practice over time. The Problem Diagnostics Lab Toolkit can help shorten that learning curve by letting you experiment with common problem scenarios. This article introduces you to this new toolkit and shows how you can use its scenario-based approach to learn Java troubleshooting techniques by example and experimentation.
What you can learn from the toolkit
The Problem Diagnostics Lab Toolkit (PDTK) helps technical teams reproduce various common problems, monitor the impact of different actions, and investigate the problems. The toolkit can help system administrators better understand the symptoms of certain problems, thereby accelerating problem resolution. By using the toolkit, developers can gain insight into the impact of not following good coding practices.
Using examples, the PDTK shows you how to troubleshoot a wide variety of problems that can occur in Java applications deployed to WebSphere products. Examples include:
- Memory management problems.
- Excessive CPU usage.
- Thread deadlocks.
- JVM crashes.
The PDTK consists of several modules (Figure 1):
- Code editor: A "hot" Java code editor that enables you to edit Java code and invoke it from a browser without redeployment.
- Monitor: An integrated monitor helps you observe current system status, including thread status, memory usage, and average response times.
- Stress engine: This built-in engine can simulate several clients sending concurrent requests.
- Management: A data facility to generate a variety of dump files, which can be used to diagnose certain types of problems.
Figure 1. PDTK modules
PDTK is an enterprise application, and needs to be deployed in a WebSphere Application Server environment. You only need to apply the default configuration; no extra resources or environment variables are needed. Follow these basic steps to install the toolkit:
- Download PDTK.
- Extract the EAR file from the compressed (.zip) archive.
- Start WebSphere Application Server and open the administrative console.
- Select Applications > New Application.
- Install the EAR file with the default configuration.
When the installation is complete, you can start the application and launch the toolkit by accessing http://hostname:port/LabToolkit in a Web browser. A panel similar to Figure 2 should display.
Figure 2. PDTK GUI
Figure 2 shows seven areas of the toolkit’s main GUI panel:
- The Problems pane shows the problem categories that are used to classify scenarios.
- A scenario represents a situation that might cause a problem to occur. For example, because a hung thread might occur when a wait leak, excessive synchronization, or a deadlock takes place, the hung thread problem consists of three scenarios. When you select a problem category, all the experimental scenarios that belong to that category are shown in the Scenarios list.
- Each scenario contains a wizard guide and an Action Buttons pane. The wizard walks you through a scenario step by step, and the Action Buttons pane helps you edit and invoke Java code via action buttons.
- The Monitors pane lets you monitor system status.
- The message Console shows the log entries of the actions.
To walk you through the process of using the PDTK, let's look at a deadlock scenario.
Select ThreadHang from the Problems pane on the left, and then choose Dead Lock from the Scenarios list. This will cause both the scenario guide (Figure 3) and the action pane (Figure 4) to display.
Figure 3. Scenario guide
The wizard can assist with the walkthrough of the scenario. As shown in Figure 3, the steps are:
- Instruction: Overview of the problem that will be reproduced in the scenario.
- Reproduction: Describes the scenario procedures and tips.
- Investigation: Guides you through diagnosing the problem.
- Summary: Summarizes the problem.
You can also add or remove steps, or even change their content, via the drop-down menu from the wizard pane. The drop-down menu options include:
- Remove step
- New step
- Edit step.
Figure 4. Action pane
As shown in Figure 4, two action buttons are generated in the action pane for the deadlock scenario: DeadLock Jsp and Correct Jsp. As was the case for the wizard pane, the drop-down menu for the action pane contains options for:
- Remove action
- New action
- Edit action.
To review or edit the deadlock Java code, right-click the DeadLock Jsp button and select Edit Action. The code is shown in Listing 1 and Figure 5.
Listing 1. Java code executed by button DeadLock Jsp
synchronized (lock1) { // lock1 is defined in the "Methods and Static Variables" tab
    Thread.sleep(5000);
    ThreadMonitor.registerThreadStatus("blocked"); // It will be blocked here if the
                                                   // thread cannot get lock2
    synchronized (lock2) {
        ThreadMonitor.registerThreadStatus("running"); // It will continue to run if the
                                                       // thread can get lock2
    }
}
synchronized (lock2) { // lock2 is defined in the "Methods and Static Variables" tab
    Thread.sleep(5000);
    ThreadMonitor.registerThreadStatus("blocked"); // It will be blocked here if the
                                                   // thread cannot get lock1
    synchronized (lock1) {
        ThreadMonitor.registerThreadStatus("running"); // It will continue to run if the
                                                       // thread can get lock1
    }
}
The code in Listing 1 performs these actions:
- Obtain a global lock: lock1.
- Sleep for 5 seconds.
- Obtain another global lock: lock2.
- Release global lock: lock2.
- Release global lock: lock1.
- Obtain a global lock: lock2.
- Sleep for 5 seconds.
- Obtain another global lock: lock1.
- Release global lock: lock1.
- Release global lock: lock2.
This code segment runs safely in a single-threaded environment; however, it can cause a deadlock in a multi-threaded environment. If one thread is in the first block, holding lock1 and waiting for lock2, while another thread is in the second block, holding lock2 and waiting for lock1, each thread waits for a lock that the other will never release. Hence, if you simulate multiple clients running this code simultaneously, the deadlock problem is recreated.
Figure 5. Code editor, as a result of Edit Action button
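The circular wait described for Listing 1 can also be reproduced outside the toolkit. The following is a minimal standalone sketch, not PDTK code: the class name `DeadlockDemo` and its methods are hypothetical, and it simplifies Listing 1 by giving each of two threads only one of the two blocks (one takes lock1 then lock2, the other lock2 then lock1). It assumes a JVM that supports `ThreadMXBean.findMonitorDeadlockedThreads()` from the standard `java.lang.management` package.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;
import java.util.concurrent.CountDownLatch;

public class DeadlockDemo {

    // Starts a daemon thread that locks 'first', waits until both threads
    // hold their first lock, and then requests 'second'.
    static void startClient(String name, Object first, Object second,
                            CountDownLatch bothHoldFirst) {
        Thread t = new Thread(() -> {
            synchronized (first) {
                bothHoldFirst.countDown();
                try {
                    bothHoldFirst.await();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
                synchronized (second) {
                    // Not reached when the two clients use opposite orders.
                }
            }
        }, name);
        t.setDaemon(true); // lets the JVM exit even though the thread is stuck
        t.start();
    }

    // Returns the number of threads the JVM reports as monitor-deadlocked,
    // or 0 if none are reported within timeoutMillis.
    static int reproduceAndDetect(long timeoutMillis) {
        Object lock1 = new Object();
        Object lock2 = new Object();
        CountDownLatch bothHoldFirst = new CountDownLatch(2);
        startClient("client-1", lock1, lock2, bothHoldFirst);
        startClient("client-2", lock2, lock1, bothHoldFirst);

        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        long deadline = System.currentTimeMillis() + timeoutMillis;
        while (System.currentTimeMillis() < deadline) {
            long[] ids = threads.findMonitorDeadlockedThreads();
            if (ids != null) {
                return ids.length;
            }
            try {
                Thread.sleep(50); // poll until both threads are blocked
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return 0;
            }
        }
        return 0;
    }

    public static void main(String[] args) {
        System.out.println("Deadlocked threads: " + reproduceAndDetect(5000));
    }
}
```

The latch replaces the 5-second sleep in Listing 1 so that both threads are guaranteed to hold their first lock before either requests the second. The threads are marked as daemons only so that this demonstration can exit; in an application server, the stuck threads would stay blocked.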
PDTK has a built-in stress engine to easily simulate concurrent access scenarios. Figure 6 shows how to set up the stress engine by expanding the Advanced Settings pane and configuring the Client number, Invoke times, and Think time (time between requests). In this case, to reproduce the deadlock situation, set the number of clients to 2. After configuring the advanced settings, expand the Action Buttons pane and click the DeadLock Jsp button. The stress engine will simulate two clients sending simultaneous requests to the DeadLock JSP.
Figure 6. Set up stress engine
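To make the three stress engine settings concrete, here is a rough sketch of what a concurrent driver with a client count, an invocation count, and a think time might look like. This is an illustration only and not the PDTK stress engine itself: the class `StressSketch`, the method `run`, and its parameter names are assumptions, and the "request" is a plain `Runnable` rather than an HTTP call.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicInteger;

public class StressSketch {

    // Runs 'action' from 'clients' concurrent threads; each thread invokes it
    // 'invokeTimes' times and pauses 'thinkTimeMillis' between invocations.
    // Returns the total number of completed invocations.
    static int run(int clients, int invokeTimes, long thinkTimeMillis, Runnable action) {
        AtomicInteger completed = new AtomicInteger();
        CountDownLatch start = new CountDownLatch(1);
        Thread[] workers = new Thread[clients];
        for (int i = 0; i < clients; i++) {
            workers[i] = new Thread(() -> {
                try {
                    start.await(); // release all clients at the same moment
                    for (int n = 0; n < invokeTimes; n++) {
                        action.run();
                        completed.incrementAndGet();
                        if (n < invokeTimes - 1) {
                            Thread.sleep(thinkTimeMillis); // think time
                        }
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }, "client-" + (i + 1));
            workers[i].start();
        }
        start.countDown();
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                break;
            }
        }
        return completed.get();
    }

    public static void main(String[] args) {
        // Two clients, three invocations each, 100 ms think time.
        int total = run(2, 3, 100, () ->
                System.out.println(Thread.currentThread().getName() + " invoked the action"));
        System.out.println("Completed invocations: " + total);
    }
}
```

Passing an action that acquires locks in conflicting orders would reproduce the deadlock scenario; in that case `run` would never return, which is the behavior the toolkit lets you observe from its Monitors pane instead.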
As shown in Figure 7, expand the Monitors pane to see three tabs: Thread, Memory, and ResponseTime. Click the Thread tab to get the status of the threads. The thread information in Figure 7 shows that both threads are blocked. In a deadlock, neither thread can proceed or be terminated programmatically, and other threads that need the same locks are affected as well. When the number of threads in use exceeds the maximum thread count of the Web container, all new requests are rejected.
Figure 7. Monitoring thread status
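A similar view of thread status is available from standard JDK APIs. The sketch below is an assumption about how such a monitor could be approximated and is not how the PDTK Monitors pane is implemented; the class `ThreadStateSnapshot` and its methods are hypothetical names. It uses `Thread.getAllStackTraces()` and `ThreadMXBean.dumpAllThreads(boolean, boolean)`, both part of the standard library.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.ArrayList;
import java.util.EnumMap;
import java.util.List;
import java.util.Map;

public class ThreadStateSnapshot {

    // Counts the live threads of this JVM by Thread.State.
    static Map<Thread.State, Integer> countByState() {
        Map<Thread.State, Integer> counts = new EnumMap<>(Thread.State.class);
        for (Thread t : Thread.getAllStackTraces().keySet()) {
            counts.merge(t.getState(), 1, Integer::sum);
        }
        return counts;
    }

    // Describes each BLOCKED thread, the lock it waits for, and the owner.
    static List<String> describeBlocked() {
        List<String> lines = new ArrayList<>();
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        for (ThreadInfo info : mx.dumpAllThreads(false, false)) {
            if (info != null && info.getThreadState() == Thread.State.BLOCKED) {
                lines.add(info.getThreadName() + " waits for " + info.getLockName()
                        + " held by " + info.getLockOwnerName());
            }
        }
        return lines;
    }

    public static void main(String[] args) {
        countByState().forEach((state, n) -> System.out.println(state + ": " + n));
        describeBlocked().forEach(System.out::println);
    }
}
```

In a deadlock like the one in this scenario, two entries would be expected from `describeBlocked()`, each naming the other thread as the lock owner.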
You can review the corrected code in the same way as the DeadLock JSP code: right-click the Correct Jsp button and select Edit Action from the drop-down menu. The resulting Java code is shown in Listing 2.
Listing 2. Java code executed by Correct Jsp button
synchronized (lock1) { // lock1 is defined in the "Methods and Static Variables" tab
    Thread.sleep(5000);
    ThreadMonitor.registerThreadStatus("blocked");
    synchronized (lock2) { // lock2 is defined in the "Methods and Static Variables" tab
        ThreadMonitor.registerThreadStatus("running");
    }
}
synchronized (lock1) { // same order as above: lock1 first, then lock2
    Thread.sleep(5000);
    ThreadMonitor.registerThreadStatus("blocked");
    synchronized (lock2) {
        ThreadMonitor.registerThreadStatus("running");
    }
}
This code performs these actions:
- Obtain a global lock: lock1.
- Sleep for 5 seconds.
- Obtain another global lock: lock2.
- Release global lock: lock2.
- Release global lock: lock1.
- Obtain a global lock: lock1.
- Sleep for 5 seconds.
- Obtain a global lock: lock2.
- Release global lock: lock2.
- Release global lock: lock1.
The only change from the first list is the nesting order in the second block: it now obtains lock1 before lock2, the same order as the first block. Because every thread requests the locks in the same order, no thread can hold lock2 while waiting for lock1, so when concurrent requests are sent to this page, all threads end normally. Therefore, in a multi-threaded environment, you must acquire nested locks in a consistent order to avoid deadlocks.
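The effect of a consistent lock order can be checked with a small standalone program. This is a sketch under stated assumptions rather than toolkit code: `LockOrderingDemo`, `orderedWork`, and `allClientsFinish` are hypothetical names, and the critical section is left empty for brevity.

```java
public class LockOrderingDemo {
    private static final Object lock1 = new Object();
    private static final Object lock2 = new Object();

    // Every caller takes lock1 first and lock2 second, so no thread can
    // hold lock2 while waiting for lock1.
    static void orderedWork() {
        synchronized (lock1) {
            synchronized (lock2) {
                // critical section
            }
        }
    }

    // Runs 'clients' threads through orderedWork() 'rounds' times each and
    // reports whether all of them finished within timeoutMillis.
    static boolean allClientsFinish(int clients, int rounds, long timeoutMillis) {
        Thread[] workers = new Thread[clients];
        for (int i = 0; i < clients; i++) {
            workers[i] = new Thread(() -> {
                for (int n = 0; n < rounds; n++) {
                    orderedWork();
                }
            }, "client-" + (i + 1));
            workers[i].setDaemon(true);
            workers[i].start();
        }
        long deadline = System.currentTimeMillis() + timeoutMillis;
        for (Thread w : workers) {
            long remaining = deadline - System.currentTimeMillis();
            try {
                if (remaining > 0) {
                    w.join(remaining); // join(0) would wait forever, so guard it
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
            if (w.isAlive()) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println("All clients finished: " + allClientsFinish(4, 1000, 5000));
    }
}
```

Because both locks are always requested in the same order, the threads only contend for lock1 and then proceed one at a time; there is no circular wait for a deadlock to form.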
This article illustrated how the Problem Diagnostics Lab Toolkit can help troubleshoot a thread deadlock problem. Besides deadlocks, PDTK also helps troubleshoot several additional common problems, such as memory leaks, extreme CPU usage, JVM crashes, and so on. By providing an environment that enables you to experiment with common problem scenarios, the PDTK can help you build problem determination skills by simulating real-world situations.
Thanks to Russell Wright for his insightful review comments and suggestions, some of which have been directly incorporated into this article.
Learn
- The Support Authority: If you need help with WebSphere products, there are many ways to get it
- IBM Software product Information Centers
- IBM Software Support Web site
- IBM Education Assistant
- IBM developerWorks
- IBM Redbooks
Get products and technologies
- Problem Diagnostics Lab Toolkit
- IBM developer kits
- IBM Software Support Toolbar
- IBM Support Assistant
Discuss
- Forums and newsgroups
- Java technology Forums
- WebSphere Support Technical Exchange on Facebook
- Global WebSphere Community on WebSphere.org
- Follow IBM Support on Twitter!
- WebSphere Electronic Support
- WebSphere Application Server information
- WebSphere Process Server
- WebSphere MQ
- WebSphere Business Process Management
- WebSphere Business Modeler
- WebSphere Adapters
- WebSphere DataPower Appliances
- WebSphere Commerce
- IBM Support Assistant Tools
Peng Fei Sui (Peter) is a software engineer at the IBM China Software Development Lab in Beijing, China. He has been a part of the China WAS SVT team for 3 years working on WebSphere Application Server Serviceability SVT. His current efforts are focused on WebSphere Application Server problem determination and he is an active participant in local customer support.
Dr. Mahesh Rathi has been involved with the WebSphere Application Server product since its inception. He led the security development team before joining the L2 Support team, and joined the SWAT team in 2005. He thoroughly enjoys working with demanding customers on hot issues, and thrives in pressure situations. He received his PhD in Computer Sciences from Purdue University and taught Software Engineering at Wichita State University before joining IBM.
Hao Li (Nico) is a member of the WebSphere Application Server China team at IBM China Development Lab, Beijing. His current focus is on automated testing with Rational Functional Tester. WebSphere Application Server on z/OS testing is another interest of his.
Yiwen Huang is an advisory software engineer at the China development lab. She works in WebSphere Application Server system verification test, and leads the WebSphere Application Server Hypervisor edition testing effort. Prior to joining the China development lab in 2008, Yiwen worked at the Toronto development lab in WebSphere Application Server L3 support from 2003.




