About performance problems and "hangs"

Performance problems arise in many different situations. A hang is one type of performance problem in which users wait for a response for an indefinite period of time. Troubleshooting techniques for hangs are similar to the techniques you use for other performance problems.

Here are some examples of situations in which performance problems become evident:

Each of these situations differs subtly from the others. An important part of troubleshooting a problem is to clarify whether something is failing to meet expectations or is exceeding resource capacity. In some cases, both of these situations are true.

Hangs can be particularly difficult to troubleshoot because the symptoms often seem to match the symptoms of other problems. For example, if a user waits a long time for a response from a query, that user might think that the system is hung. In many cases, the query is extremely complex and the system is heavily used at the time, so the system is not hung; it is just very slow to respond. Also, during a severe system shutdown, a significant buildup of activity can result in most or all commands seeming to hang.
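The distinction between a slow response and an apparent hang can be illustrated with a small timing probe. The following Python sketch is not part of WSRR; the function name `probe` and the two thresholds are hypothetical and would have to be chosen to match your own expectations. It runs an operation in a worker thread and reports whether it finished quickly, finished slowly, or did not respond within the time you were prepared to wait.

```python
import threading
import time


def probe(operation, slow_after, give_up_after):
    """Run operation in a worker thread and classify how it responded.

    slow_after and give_up_after are thresholds in seconds (assumed values,
    not product defaults). Returns "ok", "slow", or "no response".
    """
    finished = {}

    def worker():
        try:
            operation()
        finally:
            # Record the completion time even if the operation raised an error.
            finished["at"] = time.monotonic()

    started = time.monotonic()
    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    thread.join(timeout=give_up_after)

    if thread.is_alive():
        # Still running: this might be a hang, or only a very slow response.
        return "no response"

    elapsed = finished["at"] - started
    return "slow" if elapsed > slow_after else "ok"
```

A "no response" result does not prove that the system is hung. It means only that the operation did not complete within the chosen wait time, which can also happen with a complex query on a heavily used system.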

In addition to characterizing the problem correctly in terms of what the symptoms are (such as slowness or excessive resource use) and where the symptoms are observed (in a query, an application, a system resource, or another source), you need several other pieces of information to put the problem in context.

Answer the following questions to quickly determine the best place to start looking for the cause of the performance problem.

  1. When did the problem first occur?

    If the problem has been occurring for some time, and if a database monitor schedule has been implemented, you can use historical data to find differences. This comparison lets you identify changes in system behavior and then investigate why those changes were introduced. Use proactive monitoring tools. Also consider whether any recent changes occurred, such as hardware or software upgrades, a new application rollout, or additional users.

  2. Is the performance issue constant or intermittent?

    If the poor performance is continual, check whether the system has started to handle a larger workload or whether a shared database resource has become a bottleneck. Other potential causes of performance degradation include increased user activity, multiple large applications, or removal of hardware devices. If performance is poor only for brief periods, begin by looking for common applications or utilities that run at these times. If users report that a group of applications is experiencing performance issues, you can begin your analysis by focusing on these applications.

  3. Does the problem seem to be system-wide or isolated to WebSphere Service Registry and Repository (WSRR) and its applications?

    System-wide performance problems suggest an issue outside of WSRR. A problem at the operating system level might have to be investigated.

  4. If the problem is isolated to one application, does one particular query seem to be causing the problem?

    If one application seems to be causing the problem, evaluate whether users are reporting that a query or set of queries is experiencing a slowdown. You might be able to isolate the issue to one application and to a potential group of queries.

  5. Do you notice any common characteristics of the poor performance, or do the problems seem to be random?

    Determine whether any common objects (such as database tables, table spaces, or indexes) are involved. If so, these objects are likely to be a point of contention. Other areas to potentially focus on are referential integrity constraints, foreign key cascades, and locking issues.

    For instructions on how to optimize a DB2 database, see the topic Optimizing the performance of your DB2 database.
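The first two questions (when the problem started, and whether it is constant or intermittent) can be approached by comparing current response times with historical data. The following Python sketch is a minimal illustration under stated assumptions. The function name `characterize`, the slowdown factor of 2, and the 80% cutoff for "constant" are all hypothetical values chosen for the example, not WSRR or DB2 settings. It also assumes that you have already collected response times, in seconds, from your own monitoring.

```python
from statistics import median


def characterize(baseline, current, slow_factor=2.0, constant_share=0.8):
    """Compare current response times (seconds) with a historical baseline.

    A sample counts as slow when it exceeds slow_factor times the baseline
    median. Returns "normal", "intermittent", or "constant". Both thresholds
    are illustrative assumptions.
    """
    if not baseline or not current:
        raise ValueError("baseline and current must each contain samples")

    threshold = slow_factor * median(baseline)
    slow_count = sum(1 for sample in current if sample > threshold)
    slow_share = slow_count / len(current)

    if slow_count == 0:
        return "normal"
    if slow_share >= constant_share:
        return "constant"
    return "intermittent"


# Historical samples from a period when performance was acceptable.
baseline = [1.0, 1.2, 0.9, 1.1]

print(characterize(baseline, [3.0, 2.9, 3.2, 3.1]))
print(characterize(baseline, [1.0, 1.1, 5.0, 1.0]))
```

In this example, the first call reports a constant slowdown because every sample exceeds the threshold, and the second reports an intermittent one because only a single sample does. A "constant" result points toward workload growth or a shared bottleneck, and an "intermittent" result points toward applications or utilities that run at particular times.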