Diagnosing the Integration Server
Introduction
This chapter contains information for the server administrator who troubleshoots the Integration Server or maintains diagnostic data from the server. Diagnostic data is the configuration, operational, and logging information from the Integration Server. This information is useful when the server becomes unresponsive and cannot be recovered.
To facilitate the troubleshooting process, the Integration Server provides the following features:
- Diagnostic port. A special port that uses a dedicated thread pool.
- Diagnostic utility. A special service that extracts important diagnostic data from the Integration Server.
- Safe mode switch. A method of starting the Integration Server in which the server does not connect to any external resource.
- Thread dump. A facility that generates a log of the threads and processes currently running within the Java Virtual Machine (JVM), to help diagnose issues with the Integration Server.
Configuring the Diagnostic Port
The diagnostic port is a special port that uses threads from a dedicated thread pool to process requests submitted via HTTP. It behaves like a typical HTTP port, except that the server uses the diagnostic thread pool instead of the server thread pool.
By maintaining a separate thread pool, this port improves the troubleshooting capability when the server becomes unresponsive. For example, when the server reaches its maximum number of threads, you cannot open the Integration Server Administrator. This prevents you from accessing information that might help you determine why the threads are not available. It also prevents you from freeing up other server resources. Using the threads from the diagnostic thread pool, the diagnostic port enables you to open the Integration Server Administrator.
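The effect of a dedicated pool can be illustrated with a minimal Python sketch. It is an analogy only, not Integration Server code: `server_pool`, `diagnostic_pool`, `stuck_service`, and `admin_request` are hypothetical stand-ins, and the pool sizes are arbitrary.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-ins for the server thread pool and the diagnostic thread pool.
server_pool = ThreadPoolExecutor(max_workers=2, thread_name_prefix="server")
diagnostic_pool = ThreadPoolExecutor(max_workers=1, thread_name_prefix="diagnostic")

release = threading.Event()


def stuck_service():
    """Simulates a request that holds a server thread until released."""
    release.wait()
    return "finished"


def admin_request():
    """Simulates a lightweight administrative request."""
    return "admin page served by " + threading.current_thread().name


# Saturate the server pool: both of its threads are now blocked.
stuck = [server_pool.submit(stuck_service) for _ in range(2)]

# A further request to the server pool is queued and cannot start.
queued = server_pool.submit(admin_request)

# The same request on the diagnostic pool runs on its own thread.
diagnostic_result = diagnostic_pool.submit(admin_request).result(timeout=5)
queued_was_waiting = not queued.done()

# Release the blocked threads so both pools can shut down cleanly.
release.set()
queued_result = queued.result(timeout=5)
server_pool.shutdown()
diagnostic_pool.shutdown()
```

While the server pool is exhausted, the queued request makes no progress, but the request routed to the separate pool completes. This is the property the diagnostic port relies on.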
When you install the Integration Server, it automatically creates the diagnostic port on port 9999. If another port is already using 9999, the server disables the diagnostic port when you start the Integration Server. To enable the diagnostic port, you must edit the port number. For instructions about how to edit port configurations, see Editing a Port. Only one diagnostic port can exist on each Integration Server.
Diagnostic Thread Pool Configuration
Through the Integration Server Administrator, you can configure the number of threads in the diagnostic thread pool. The server adds threads to the pool as needed until it reaches the maximum allowed. If the server reaches the maximum number, it waits until processes complete and return their threads to the pool before it begins more processes.
You can also set the thread priority for the diagnostic thread pool. The larger the number, the higher the priority. When the JVM receives requests from different threads, it runs the thread with the higher priority first. Therefore, by assigning a higher priority to the threads in the diagnostic thread pool, you can take advantage of the dedicated thread pool and improve access to the Integration Server Administrator.
For more information about how to configure the diagnostic thread pool, see Working with Extended Configuration Settings.
Diagnostic Port Access
Only users in the Administrators group can access the server through the diagnostic port. You can access the Integration Server Administrator via http://<hostname>:<diagnostic port> where hostname is the machine that hosts the Integration Server and diagnostic port is the diagnostic port number. After prompting you for a username and password, the server displays the Integration Server Administrator. Because you can access the diagnostic port only through HTTP, data and passwords are passed in clear text (unencrypted).
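The following sketch shows how the address is assembled and why plain HTTP exposes credentials. It assumes that the username and password prompt uses HTTP Basic authentication, which this chapter does not state. The host name, user name, and password in the usage lines are placeholders.

```python
import base64
from typing import Tuple


def diagnostic_admin_url(hostname: str, diagnostic_port: int = 9999) -> str:
    """Builds the Integration Server Administrator URL for the diagnostic port."""
    return f"http://{hostname}:{diagnostic_port}"


def basic_auth_header(username: str, password: str) -> str:
    """Builds an HTTP Basic Authorization header value (assumed scheme)."""
    token = base64.b64encode(f"{username}:{password}".encode("utf-8"))
    return "Basic " + token.decode("ascii")


def recover_credentials(header_value: str) -> Tuple[str, str]:
    """Shows that anyone who sees the header can decode it; Base64 is not encryption."""
    encoded = header_value.split(" ", 1)[1]
    username, _, password = base64.b64decode(encoded).decode("utf-8").partition(":")
    return username, password


# Placeholder values for illustration only.
url = diagnostic_admin_url("is-host.example.com")
header = basic_auth_header("Administrator", "example-password")
seen_on_the_wire = recover_credentials(header)
```

Because the header can be decoded by anyone who observes the traffic, restrict use of the diagnostic port to trusted networks.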
The diagnostic port allows access only to services defined with the Administrators ACL. Software AG recommends that you do not change the default access settings.
Using the Diagnostic Utility
You use the diagnostic utility to collect configuration, operation, and logging data from the Integration Server. You can also use the diagnostic utility to view the list of fixes applied to the installed packages and Integration Server. The diagnostic utility is an Integration Server service called wm.server.admin:getDiagnosticData. It is accessible only by members of the Administrators group. Although you typically run the utility through the diagnostic port when troubleshooting, you can also run it through any HTTP or HTTPS port to collect diagnostic data periodically. You can access the service through the Get Diagnostics button in Integration Server Administrator.
The diagnostic utility creates a temporary diagnostics_hostname_port_yyyyMMddHHmmss.zip file in the Integration Server_directory \instances\instance_name\logs directory and writes to the .zip file as it collects information. The .zip file also contains a config\PackagesAndUpdates.txt file, which lists the packages and package updates for the Integration Server.
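If you collect diagnostic archives periodically, the naming pattern lets a script sort or filter them by collection time. The sketch below builds and parses names that follow the diagnostics_hostname_port_yyyyMMddHHmmss.zip pattern. The function names and sample values are hypothetical.

```python
import re
from datetime import datetime

# yyyyMMddHHmmss expressed as a Python strftime/strptime pattern.
TIMESTAMP_FORMAT = "%Y%m%d%H%M%S"
NAME_PATTERN = re.compile(r"diagnostics_(.+)_(\d+)_(\d{14})\.zip")


def diagnostics_zip_name(hostname: str, port: int, when: datetime) -> str:
    """Builds an archive name that follows the documented pattern."""
    return f"diagnostics_{hostname}_{port}_{when.strftime(TIMESTAMP_FORMAT)}.zip"


def parse_diagnostics_zip_name(name: str):
    """Returns (hostname, port, timestamp), or None if the name does not match."""
    match = NAME_PATTERN.fullmatch(name)
    if match is None:
        return None
    hostname, port, stamp = match.groups()
    return hostname, int(port), datetime.strptime(stamp, TIMESTAMP_FORMAT)


# Sample values for illustration only.
example_name = diagnostics_zip_name("ishost", 5555, datetime(2024, 3, 7, 14, 5, 9))
parsed = parse_diagnostics_zip_name(example_name)
```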
If there are problems creating the .zip file, such as insufficient space in the file system, it will return a text file. In the text file, the configuration and operation data are separated into distinct sections for easier reading. Unlike the .zip file, the text file does not contain logging data.
Diagnostic Utility Performance
The diagnostic utility can execute slowly when logging large amounts of data from the Integration Server. To increase performance, you can set limits to the amount of data the diagnostic tool returns by specifying a maxLogSize value or setting the watt.server.diagnostic.logperiod parameter.
The maxLogSize parameter of the wm.server.admin:getDiagnosticData service sets the size limit for log files written to the diagnostic_data.zip file. If a log file exceeds the specified maxLogSize, the diagnostic utility omits it from the .zip file but records it in a diagnosticwarning.txt file. This file lists all of the log files that exceed the maxLogSize value. It is located in the logs directory of the .zip file.
You can use the maxLogSize parameter only when running the diagnostic utility from a browser. You cannot limit the log size when you run the diagnostic utility from Integration Server Administrator. For more information, see Running the Diagnostic Utility Service.
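As a rough sketch, a browser request that passes maxLogSize could be assembled as shown below. The /invoke/<folder>/<service> URL form is an assumption, not something this chapter states, and the unit of maxLogSize is not given here either. Confirm both in Running the Diagnostic Utility Service before relying on them. The host name and the value 10 are placeholders.

```python
from typing import Optional
from urllib.parse import urlencode

SERVICE_FOLDER = "wm.server.admin"
SERVICE_NAME = "getDiagnosticData"


def diagnostic_service_url(hostname: str, port: int,
                           max_log_size: Optional[int] = None) -> str:
    """Builds a URL for the diagnostic utility service.

    Assumption: services are reachable at /invoke/<folder>/<service>.
    """
    url = f"http://{hostname}:{port}/invoke/{SERVICE_FOLDER}/{SERVICE_NAME}"
    if max_log_size is not None:
        # The parameter name comes from this chapter; its unit is not stated here.
        url += "?" + urlencode({"maxLogSize": max_log_size})
    return url


# Placeholder values for illustration only.
unlimited = diagnostic_service_url("is-host.example.com", 9999)
limited = diagnostic_service_url("is-host.example.com", 9999, max_log_size=10)
```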
Use the watt.server.diagnostic.logperiod parameter to specify the log period, in hours. The default is 6 hours. When this parameter is set to 0, the utility does not return any log files; it returns only the configuration and run-time data files.
The logging information the utility returns depends on how you store the logs. If you save the logs to a database, the diagnostic utility returns only the log entries that fall within the specified number of hours. If you save the logs to the file system, it returns the entire log for that day, not just the entries within the specified number of hours. For instructions about how to set server configuration parameters, see Working with Extended Configuration Settings.
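The difference between database and file-system storage can be modeled in a few lines. This is a simplified illustration, not the utility's implementation. The function name is hypothetical, log entries are reduced to timestamps, and the handling of a period that spans midnight (every calendar day the period touches is returned in full) is an assumption.

```python
from datetime import datetime, timedelta


def select_log_entries(entries, now, log_period_hours, stored_in_database):
    """Returns the entry timestamps that the log period would select."""
    if log_period_hours == 0:
        # A log period of 0 returns no log data at all.
        return []
    cutoff = now - timedelta(hours=log_period_hours)
    if stored_in_database:
        # Database storage: only entries inside the period.
        return [e for e in entries if cutoff <= e <= now]
    # File-system storage: whole days are returned (assumed for every day touched).
    return [e for e in entries if cutoff.date() <= e.date() <= now.date()]


# Sample timestamps for illustration only.
now = datetime(2024, 3, 7, 15, 0)
entries = [
    datetime(2024, 3, 6, 23, 0),
    datetime(2024, 3, 7, 1, 0),
    datetime(2024, 3, 7, 10, 0),
    datetime(2024, 3, 7, 14, 0),
]
from_database = select_log_entries(entries, now, 6, stored_in_database=True)
from_files = select_log_entries(entries, now, 6, stored_in_database=False)
```

With a six-hour period, the database case selects the 10:00 and 14:00 entries, while the file-system case also returns the 01:00 entry because it belongs to the same day's log.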
Use the watt.server.diagnostic.logFiles.maxMB parameter to specify the size limit for including audit log files in the diagnostic archive. While collecting each audit log, Integration Server calculates the total size of that log's files for the requested log period. If the total size of the log files for a particular audit log exceeds the value of watt.server.diagnostic.logFiles.maxMB for the log period, Integration Server does not include that audit log in the diagnostic archive. Consider the examples below.
Example 1
- watt.server.diagnostic.logFiles.maxMB is 250 and watt.server.diagnostic.logperiod is 6.
- There are two WMSESSION log files that cover the previous six hours.
- The total size of the two WMSESSION log files is greater than 250 MB.
RESULT: No session audit log data will be included in the diagnostic data archive.
Example 2
- watt.server.diagnostic.logFiles.maxMB is 300 and watt.server.diagnostic.logperiod is 8.
- There is one WMSERVICE log file that covers the previous eight hours.
- The size of the WMSERVICE log file is less than 300 MB.
RESULT: Service audit log data will be included in the diagnostic data archive.
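The two examples reduce to a single comparison of the summed file sizes against the limit. The sketch below restates them. The function name and the individual file sizes are hypothetical; the examples give only the totals relative to the limit.

```python
def include_audit_log(file_sizes_mb, max_mb):
    """Returns True if an audit log's files for the log period fit within the limit.

    The log is excluded only when the total size exceeds max_mb.
    """
    return sum(file_sizes_mb) <= max_mb


# Example 1: two WMSESSION files whose total (270 MB here) exceeds 250 MB.
session_included = include_audit_log([140, 130], max_mb=250)

# Example 2: one WMSERVICE file (120 MB here) that is smaller than 300 MB.
service_included = include_audit_log([120], max_mb=300)
```

Note that the check applies to the total for each audit log, so two files that are each under the limit can still cause the log to be excluded.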
Running the Diagnostic Utility from Integration Server Administrator
About this task
To run the diagnostic utility from Integration Server Administrator
Procedure
Running the Diagnostic Utility Service
About this task
You can use the maxLogSize parameter to limit the size of the .zip file.
To run the diagnostic utility without using Integration Server Administrator
Procedure
Starting the Integration Server in Safe Mode
If Integration Server is having trouble starting because it or one of its packages cannot connect to an external resource, you can stop Integration Server and then start it in safe mode. When you start Integration Server in safe mode, it does not connect to any external resources, including databases. As a result, when Integration Server is in safe mode, it writes audit logging data associated with the IS Core Audit and Process Audit functions to flat files on the Integration Server instead of to an external database. In addition, when in safe mode, Integration Server loads only the WmRoot package; all other packages are inactive. When you restart Integration Server after you diagnose and correct the problem, Integration Server resumes audit logging for IS Core Audit and Process Audit functions to the external database and loads all enabled packages.
If Integration Server could not connect to a Broker or database, check the appropriate connection parameters and modify them as necessary. For instructions, see the webMethods Audit Logging Guide.
If a package such as Trading Networks Server or the webMethods SAP Adapter could not connect to an external resource, open Integration Server Administrator and go to the Packages > Management > Activate Inactive Packages page. In the Inactive Packages list, select the package and click Activate Package. Integration Server puts the package into the state it would have been in if you had started Integration Server normally. For example, if the package would have been enabled, Integration Server loads and enables it. Check and modify the connection parameters using the instructions in the appropriate guide.
Starting Integration Server in Safe Mode
About this task
To start Integration Server in safe mode
Procedure
When the Server Automatically Places You in Safe Mode
If the Integration Server detects a problem with the master password or outbound passwords at startup, it will automatically place you in safe mode. You will see the following message in the upper left corner of the Server Statistics screen of the Integration Server Administrator:
SERVER IS RUNNING IN SAFE MODE. Master password sanity check failed -- invalid master password provided.
These problems can be caused by a corrupted master password file, a corrupted outbound password file, or simply by mistyping the master password when you are prompted for it. If you suspect that you mistyped the password, shut down the server and restart it, this time entering the correct password. If this does not correct the problem, refer to When Problems Exist with the Master Password or Outbound Passwords at Startup... for instructions.
Generating Thread Dumps
If Integration Server or a subsystem becomes slow or unresponsive, or users are unable to log into Integration Server, you can generate thread dumps to help you diagnose the problem. A thread dump can help you locate thread contention issues that can cause thread blocks or deadlocks.
You can generate thread dumps of the following:
- The JVM in which the Integration Server is running
- Individual threads running on Integration Server
Based on the information you obtain from these thread dumps, you might be able to correct the problem.
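A quick tally of thread states is often a useful first pass over a JVM thread dump, because a large number of BLOCKED threads points to contention. The sketch below assumes the dump uses the standard HotSpot text format with `java.lang.Thread.State:` lines; the format that Integration Server produces is not described in this chapter. The sample dump is abbreviated and entirely hypothetical.

```python
import re
from collections import Counter

# Matches lines such as "   java.lang.Thread.State: BLOCKED (on object monitor)".
STATE_LINE = re.compile(r"^\s*java\.lang\.Thread\.State:\s+(\w+)", re.MULTILINE)


def count_thread_states(dump_text: str) -> Counter:
    """Counts how many threads in the dump are in each state."""
    return Counter(STATE_LINE.findall(dump_text))


# Abbreviated, hypothetical dump used only to exercise the parser.
SAMPLE_DUMP = '''\
"worker-1" #21 prio=5 waiting for monitor entry
   java.lang.Thread.State: BLOCKED (on object monitor)
"worker-2" #22 prio=5 waiting for monitor entry
   java.lang.Thread.State: BLOCKED (on object monitor)
"worker-3" #23 prio=5 runnable
   java.lang.Thread.State: RUNNABLE
"worker-4" #24 prio=5 waiting on condition
   java.lang.Thread.State: TIMED_WAITING (sleeping)
'''

states = count_thread_states(SAMPLE_DUMP)
blocked_threads = states["BLOCKED"]
```

Comparing the tallies from dumps taken a few seconds apart can show whether the same threads stay blocked, which is more telling than a single snapshot.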
If you detect a problem with a thread that is associated with a user-written Java service or a flow service, you have the option of canceling or killing the thread.
When you cancel a thread, Integration Server frees up resources that are held by the thread and returns the thread to the thread pool. If Integration Server cannot cancel the thread, it gives you the option of killing the thread. When you kill a thread, Integration Server terminates the thread and adds a new one to the thread pool. For more information about canceling and killing service threads, see Canceling and Killing Threads Associated with a Service.
The following examples describe how you might use the JVM thread dump and individual thread dumps to diagnose and fix problems.
- Scenario 1: A Service Is Running Longer than Expected
- Scenario 2: The Server Is Unresponsive, Users Cannot Log In Through the Primary Port
The following procedures show how to generate dumps for individual threads and for the JVM.
Generating a Dump of an Individual Thread
About this task
To view information about an individual thread
Procedure
Generating a Dump of the JVM
About this task
To generate a JVM thread dump