Level: Intermediate
Sampath Chilukuri (sampath.kumar@in.ibm.com), Staff Software Engineer, IBM
08 Sep 2004 (updated 28 Oct 2004)
Learn how to correlate log and trace files generated by different products in various formats. Correlating log files is the first step in the problem determination process. This article shows you the procedure for developing a custom correlation engine as a plug-in for the Log and Trace Analyzer (LTA). Using examples from the IBM® WebSphere® Application Server activity log and the IBM DB2® diagnostic log, you learn how the LTA can correlate the log records visually as a UML sequence diagram. (Note: Updated for Release 2 of the IBM Autonomic Computing Toolkit.)
Introduction
The Log and Trace Analyzer (LTA) included in the IBM Autonomic Computing Toolkit imports logs generated by various products and transforms the log entries into the Common Base Event (CBE) format. The infrastructure for the LTA has been released as open source as part of the Eclipse Hyades project (see Resources for more information). The LTA can also import symptom databases. Log files can be analyzed and correlated against the symptom databases to find a solution for the problem. The LTA is used primarily for problem determination, because finding the cause of a problem becomes more difficult as the number of products, and the number of servers they run on, increases. The log file from a single product cannot always help in determining the solution for the overall system problem.
To help you understand the importance of correlation, consider the IBM WebSphere Application Server and an IBM DB2 database. These two products often work together: the application server hosts the components and the database stores the data. If an error occurs in the database and, as a result, the application server stops, it is often impossible to track down the source of the problem by looking only at the application server logs. The errors recorded in the application server logs might not be descriptive enough to indicate the details of the problem with the database. In this case, you also need to look at the logs generated by the database. You need to correlate the logs of the application server and the database so that the corresponding problem records from both logs can be identified. Although the CBE time stamp is precise to the microsecond, watching the logs of the individual products and determining the problem by looking only at the time stamps becomes complex. Keep in mind that logs might be generated in different time zones, and the clocks on the systems running the application server and the database cannot always be synchronized to the millisecond.
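One way to take time zones out of the comparison is to convert every time stamp to a single reference, such as UTC, before comparing. The following is a minimal sketch using the standard java.time API rather than the LTA itself; the class name, the zone names, and the sample values are assumptions chosen for illustration. It does not address clock drift between machines.

```java
import java.time.Instant;
import java.time.LocalDateTime;
import java.time.ZoneId;

public class UtcNormalization {
    /** Interprets a local wall-clock time in the given zone and returns the UTC instant. */
    static Instant toInstant(LocalDateTime local, String zoneId) {
        return local.atZone(ZoneId.of(zoneId)).toInstant();
    }

    public static void main(String[] args) {
        // Hypothetical values: one moment as logged by servers in two time zones.
        Instant appServer = toInstant(LocalDateTime.of(2004, 8, 5, 12, 7, 53), "Asia/Kolkata");
        Instant database = toInstant(LocalDateTime.of(2004, 8, 5, 6, 37, 53), "UTC");

        // Once both values are instants, they can be compared directly.
        System.out.println("Same moment: " + appServer.equals(database));
    }
}
```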
Correlation in the Log and Trace Analyzer means finding the relation between distributed log records and learning the influence of one log record on another. The log records can be from the same log file or from different log files, and the relation between them can be based on a single property or a combination of properties of the CBE. A correlation engine is an Eclipse plug-in for the LTA that shows the correlation between the log records visually in a UML sequence diagram.
This article describes the procedure for building a correlation engine for the LTA. This example correlation engine extends the default time correlation engine already available with the LTA. The existing default time correlation engine correlates log records by exactly matching the time stamp of the CBE events. However, there could be a time delay in milliseconds between the records of two products even though both the products are running on the same system. This correlation engine ignores the milliseconds while correlating the logs of the IBM WebSphere Application Server activity log and the IBM DB2 diagnostic log.
Prerequisites
You should have the LTA from the IBM Autonomic Computing Toolkit Version 1.1 installed on your machine before proceeding (see Resources for more information). Familiarity with using the LTA to parse supported log files is assumed, and understanding the importance of correlation between log files is preferred. It is also assumed that you understand the Java programming language reasonably well. IBM WebSphere Application Server should be installed on the machine where the steps are to be executed. For an alternative, refer to the readme.txt in the zip file of this article. This correlation engine works with any two logs that need to be correlated based on a time stamp that ignores the milliseconds.
Overview of the correlation engine
With extension points, the LTA can be extended to add new functions. An extension point is like a base class in object-oriented programming languages, where a new class derives the existing functionality of a class and adds new functionality for custom behavior. Listing 1 and Listing 2 contain excerpts from the WebSphere Application Server activity log and the DB2 diagnostic log that are used to explain the correlation engine. Note that the sample records for the WebSphere Application Server activity log shown in Listing 1 are generated using the showlog.bat utility to view the activity log, which is in binary format. The showlog.bat utility can be found in the WebSphere Application Server installation folder (%WAS_HOME%\bin directory).
Listing 1. IBM WebSphere Application Server activity log
ComponentId: Application Server
ProcessId: 3352
ThreadId: 5d21ef0a
ThreadName:
SourceId: com.ibm.ws.rsadapter.DSConfigurationHelper
ClassName:
MethodName:
Manufacturer: IBM
Product: WebSphere
Version: Platform 5.1 [BASE 5.1.0 b0332.05]
ServerName: egl1\egl1\server1
TimeStamp: 2004-08-05 12:07:53.567000000
UnitOfWork:
Severity: 2
Category: WARNING
PrimaryMessage: DSRA8201W: DataSource Configuration: DSRA8041I:
Failed to connect to the DataSource. Encountered SQLException with SQL
State = 08001, Error Code = -30,082 : [IBM][CLI Driver] SQL30082N
Attempt to establish connection failed with security reason "24" ("USERNAME
AND/OR PASSWORD INVALID"). SQLSTATE=08001
.
COM.ibm.db2.jdbc.DB2Exception: [IBM][CLI Driver] SQL30082N Attempt to
establish connection failed with security reason "24"
("USERNAME AND/OR PASSWORD INVALID"). SQLSTATE=08001
at COM.ibm.db2.jdbc.app.SQLExceptionGenerator.throw_SQLException(Unknown Source)
at COM.ibm.db2.jdbc.app.SQLExceptionGenerator.check_return_code(Unknown Source)
at COM.ibm.db2.jdbc.app.DB2Connection.connect(Unknown Source)
…
at com.ibm.ws.util.ThreadPool$Worker.run(ThreadPool.java(Compiled Code))
ExtendedMessage:
Listing 2. IBM DB2 diagnostic log
2004-08-05-12.07.53.133000 Instance:DB2 Node:000
PID:3028(db2bp.exe) TID:1804 Appid:none
oper system services sqlofica Probe:10
PID:3028 TID:1804 Node:000 Title: Not available!
5351 4c43 4120 2020 8800 0000 7e8a ffff SQLCA ....~Šÿÿ
2400 3234 ff55 5345 524e 414d 4520 414e $.24ÿUSERNAME AN
442f 4f52 2050 4153 5357 4f52 4420 494e D/OR PASSWORD IN
5641 4c49 44ff 2020 2020 2020 2020 2020 VALIDÿ
2020 2020 2020 2020 2020 2020 2020 2020
2020 2020 2020 2020 5351 4c45 5853 4d43 SQLEXSMC
2501 3780 2501 0000 0000 0000 0000 0000 %.7.%...........
0000 0000 0000 0000 2020 2020 2020 2020 ........
2020 2030 3830 3031 08001
The sample log contents shown in Listing 1 and Listing 2 are generated on a single machine so that you do not have to consider the time zone when correlating the log file. However, even though the logs are generated on the same machine, the log records are generated for the same error condition by different products, and the time stamp is not exactly the same. The log records with the same time stamp, without considering the milliseconds, are correlated. Also, correlation is not always performed between two or more logs. You can just import one log and correlate the log with itself.
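To make the rule concrete, the following sketch parses the two time stamps from Listing 1 and Listing 2 and compares them after dropping everything below one second. It uses the standard java.time API rather than the LTA, and the class, method, and constant names are illustrative assumptions.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.time.temporal.ChronoUnit;

public class TruncateToSeconds {
    // Patterns matching the time stamp layouts shown in Listing 1 and Listing 2.
    static final DateTimeFormatter WAS_FORMAT =
            DateTimeFormatter.ofPattern("uuuu-MM-dd HH:mm:ss.SSSSSSSSS");
    static final DateTimeFormatter DB2_FORMAT =
            DateTimeFormatter.ofPattern("uuuu-MM-dd-HH.mm.ss.SSSSSS");

    /** Parses a time stamp and discards the fraction of a second. */
    static LocalDateTime parseToSecond(String text, DateTimeFormatter format) {
        return LocalDateTime.parse(text, format).truncatedTo(ChronoUnit.SECONDS);
    }

    public static void main(String[] args) {
        LocalDateTime was = parseToSecond("2004-08-05 12:07:53.567000000", WAS_FORMAT);
        LocalDateTime db2 = parseToSecond("2004-08-05-12.07.53.133000", DB2_FORMAT);

        // The records differ by 434 milliseconds but fall in the same second.
        System.out.println("Correlated: " + was.equals(db2));
    }
}
```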
Correlation engine extension point
To create a correlation engine, you must extend the org.eclipse.hyades.logc.logInteractionView extension point. This extension point is defined in the org.eclipse.hyades.logc plug-in. The two classes that are developed are MyCorEngine and SimpleParserFilter. MyCorEngine is responsible for defining the correlation logic, and it needs to implement the ILogRecordCorrelationEngine interface. The SimpleParserFilter class filters out the records that do not need to be correlated. The filtered records do not appear in the final log interactions view. Filtering the log records is optional; however, a filter class must still be defined, even if it contains no filtering logic. Filtering can be based on various criteria, such as excluding all records between two time stamps or keeping only the records whose msgid starts with ADMR. Because the filter class is mandatory, this article creates the SimpleParserFilter class with no filtering logic. The class does, however, define and implement the methods required by the ILogRecordFilter interface from the package org.eclipse.hyades.logc.extensions.
Steps to create the LTA plug-in
You can create the correlation engine plug-in in two ways -- with or without the wizard. You can use the wizard that is available with the LTA and then edit the generated code to suit your requirements. This is the simpler way, but it keeps the plug-in creation process completely hidden. To use the wizard, start the LTA workbench by selecting Programs > IBM Autonomic Computing Toolkit > GLA-LTA > Editor from the Windows Start Menu.
Figure 1. LTA Workbench
Perform the following steps to create the correlation engine plug-in using the wizard:
- Select File > New > Example…. This opens the New Example window. Select Hyades Logging from the left list box and Log Correlator Sample from the right list box, and click Next.
Figure 2. New Example wizard
- Enter the new project name and select Finish or just select Finish to use the default project name LogCorrelatorProject.
Figure 3. Project name page of New Example wizard
The new project is created with two classes, MyCorEngine and SimpleParserFilter, along with the necessary entries in plugin.xml and plugin.properties files of the project.
Perform the following steps to create the correlation engine without using the wizard:
- Select File > New > Project. From the New Project window select Plug-in Development from the left list box and Plug-in project from the right list box. Click Next to continue.
Figure 4. New Project wizard
- Enter LogCorrelatorProject as the project name. You can specify the folder where the project is created or click Next to continue with the default location for the project.
Figure 5. Plug-in project name page of the New Project wizard
- Enter the unique ID for the plug-in that is created or click Next to continue with the default values.
Figure 6. Plug-in project structure page of the New Project wizard
- Select Create a blank plug-in project and click Finish.
Figure 7. Plug-in code generators page of the New Project wizard
A new project is created in the LTA workbench.
- Open the project properties window by right clicking the LogCorrelatorProject and selecting Properties from the context menu.
- Add the following libraries to the project. From the libraries tab under the Java Build Path tree option of the project properties window, select Add External Jars and add the following JAR files from the LTA installation folder.
- <install_dir>\plugins\org.eclipse.emf.common\runtime\common.jar
- <install_dir>\plugins\org.eclipse.emf.ecore\runtime\ecore.jar
- <install_dir>\plugins\org.eclipse.hyades.models.cbe\cbe-model.jar
- <install_dir>\plugins\org.eclipse.hyades.models.hierarchy\hmodel.jar
- <install_dir>\plugins\org.eclipse.hyades.logc\logc.jar
<install_dir> is the GLA-LTA bundle installation folder. It points to C:\Program Files\IBM\AutonomicComputingToolkit\GLA\dev\eclipse if you did not choose other folders during the bundle installation.
Figure 8. Properties for LogCorrelatorProject
- To add a manifest file to the project, right click the project and select New > File to open the New File window. Enter the file name as plugin.properties in the File Name field and click Finish. Listing 3 shows the content that must be entered in the plugin.properties file:
Listing 3. Content of plugin.properties file:
# The following is a list of variables used for the 'custom.mycorengine' plugin
CORR_NAME = TimeStamp w/o millis Correlation Engine
CORR_DESC = Correlates log records using the time stamp field and ignoring milli-seconds.
# This variable holds a listing of the log files that this correlation engine
# supports.
# Multiple log files must be separated by a comma.
LOG_TYPES = IBM WebSphere Application Server activity log, IBM DB2 Universal Database diagnostic log
LOG_TYPE_SIMPLE = IBM WebSphere Application Server activity log, IBM DB2 Universal Database diagnostic log
The manifest file lets you specify the name, the description, and the types of logs that the correlation engine supports. These values are shown to the user in the window for selecting a correlation engine. The LOG_TYPES property specifies the types of logs that this correlation engine supports. You can list any number of names; separate the names with commas and keep them all on a single line.
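The single-line requirement can be illustrated with the standard java.util.Properties parser. This is a minimal sketch under the assumption that the file is read with java.util.Properties rules; how the LTA itself loads the file is not shown in this article, and the class and method names below are hypothetical.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;

public class LogTypesProperty {
    /** Loads properties text and splits the comma-separated LOG_TYPES value. */
    static List<String> logTypes(String propertiesText) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(propertiesText));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        List<String> types = new ArrayList<>();
        for (String name : props.getProperty("LOG_TYPES", "").split(",")) {
            if (!name.trim().isEmpty()) {
                types.add(name.trim());
            }
        }
        return types;
    }

    public static void main(String[] args) {
        String oneLine = "LOG_TYPES = IBM WebSphere Application Server activity log, "
                + "IBM DB2 Universal Database diagnostic log\n";
        // Same value, but wrapped onto a second line without a continuation character.
        String wrapped = "LOG_TYPES = IBM WebSphere Application Server activity log,\n"
                + "IBM DB2 Universal Database diagnostic log\n";

        System.out.println("Single line: " + logTypes(oneLine).size() + " log types");
        System.out.println("Wrapped:     " + logTypes(wrapped).size() + " log types");
    }
}
```

With the wrapped form, the parser treats the second line as a separate property, so only the first log type is read. Under java.util.Properties rules, a trailing backslash continues a value onto the next line; whether the LTA honors that is not confirmed here, so a single line is the safer choice.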
- Open the plugin.xml file that already exists in the project by double clicking it, and replace its content in the source tab with the code shown in Listing 4.
Listing 4. Content of plugin.xml file:
<?xml version="1.0" encoding="UTF-8"?>
<plugin
name="TimeStamp w/o millis Correlation Engine"
id="custom.mycorengine"
version="1.3.0"
provider-name="Eclipse.org">
<runtime>
<library name="corengine.jar">
</library>
</runtime>
<requires>
<import plugin="org.eclipse.core.runtime"/>
<import plugin="org.eclipse.emf.common"/>
<import plugin="org.eclipse.hyades.models.hierarchy"/>
<import plugin="org.eclipse.hyades.models.cbe"/>
<import plugin="org.eclipse.hyades.logc"/>
</requires>
<!-- ===================================================== -->
<!-- Contribute to the logInteractionView extension point. -->
<!-- ===================================================== -->
<extension
point="org.eclipse.hyades.logc.logInteractionView">
<view
log_types="%LOG_TYPES"
name="%CORR_NAME"
description="%CORR_DESC">
<LogRecordFilter
log_type="%LOG_TYPE_SIMPLE"
class="SimpleParserFilter">
</LogRecordFilter>
<LogRecordCorrelationEngine
class="MyCorEngine">
</LogRecordCorrelationEngine>
</view>
</extension>
</plugin>
The class attribute of the LogRecordCorrelationEngine tag is used to specify the class that defines the correlation logic by implementing the ILogRecordCorrelationEngine interface. The class attribute of the LogRecordFilter tag specifies the class that defines the filter logic by implementing the ILogRecordFilter interface.
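The class attribute holds only a class name, so the host has to create the object itself at run time. The Eclipse platform does this through its extension registry; the following sketch shows the general idea with plain Java reflection instead. The interface and class names are hypothetical stand-ins, not Hyades types.

```java
public class ClassAttributeSketch {
    /** Hypothetical stand-in for an extension interface such as ILogRecordFilter. */
    public interface RecordFilter {
        String describe();
    }

    /** Hypothetical stand-in for a contributed class named in a class attribute. */
    public static class PassThroughFilter implements RecordFilter {
        public String describe() {
            return "no filtering";
        }
    }

    /** Creates an instance from a class name through its public no-argument constructor. */
    static RecordFilter instantiate(String className) {
        try {
            Class<?> type = Class.forName(className);
            return (RecordFilter) type.getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Cannot create " + className, e);
        }
    }

    public static void main(String[] args) {
        RecordFilter filter = instantiate(PassThroughFilter.class.getName());
        System.out.println(filter.describe());
    }
}
```

The sketch suggests why a contributed class is normally public, has a public constructor without arguments, and implements the interface that the extension point expects.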
- Add the source file for the MyCorEngine class by right clicking the src directory under the LogCorrelatorProject and selecting New > Class from the context menu. Enter MyCorEngine as the class name in the Name field of the New Java Class window and click Finish. Replace the content of the MyCorEngine.java file with the code shown in Listing 5.
Listing 5. Content of MyCorEngine.java file:
import org.eclipse.emf.common.util.BasicEList;
import org.eclipse.emf.common.util.EList;
import org.eclipse.hyades.logc.extensions.ILogRecordCorrelationEngine;
import org.eclipse.hyades.logs.correlators.RecordList;
import org.eclipse.hyades.models.cbe.CBECommonBaseEvent;
import org.eclipse.hyades.models.hierarchy.CorrelationContainer;
import org.eclipse.hyades.models.hierarchy.CorrelationContainerProxy;
import org.eclipse.hyades.models.hierarchy.CorrelationEngine;
/**
* A correlation engine to correlate log records of IBM WebSphere Application
* Server activity log and IBM DB2 Universal Database diagnostic log
*/
public class MyCorEngine implements ILogRecordCorrelationEngine {
private CorrelationEngine correlationEngine = null;
private CorrelationContainer correlationContainer = null;
/* The name and type of the simple correlation engine */
private final String CORRELATION_NAME = "TimeStamp w/o millis Correlation Engine";
private final String CORRELATION_TYPE = "Correlated";
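/*
* This function makes the correlation between the log records.
*
* @param correlationContainerProxy - Gives access to the correlation engine
* and the correlation container
* logFiles - A listing of the processes (i.e. the log files) for which
* correlations are to be made
* mon - The correlation monitor (not used by this engine)
*
* Note: ICorrelationMonitor is not among the imports above; import it from
* the package that defines it in your LTA release.
*/
public void correlate(CorrelationContainerProxy correlationContainerProxy,
EList logFiles, ICorrelationMonitor mon) {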
correlationEngine = correlationContainerProxy.getCorrelationEngine();
correlationContainer = correlationContainerProxy.getCorrelationContainer();
if (correlationEngine != null) {
correlationEngine.setType(CORRELATION_TYPE);
correlationEngine.setName(CORRELATION_NAME);
correlationEngine.setId(CORRELATION_NAME);
}
/* Traverse through each of the log files that a correlation
* needs to be made */
for (int i = 0; i < logFiles.size(); i++) {
/* For each of the existing log file, traverse through its log records
* and make the necessary correlations */
if (logFiles.get(i) != null) {
/* Store the list corresponding to the log records of the i-th
* logFile */
EList recordList = ((RecordList) logFiles.get(i)).getList();
/* Make the necessary correlations */
makeCorrelations(recordList, logFiles, i);
}
} // End of for-loop
} // End of correlator (EList)
/*
* A helper method to 'correlate(EList)' that is used to make the necessary
* correlations between the log records
*
* @param recordList - A list of log records for which correlations are made
* logFiles - 'logRecord' belongs to a log file listed under this parameter
* logFileIndex - The index of 'logFiles' identifying the log file that
* logRecord belongs to
*/
private void makeCorrelations(EList recordList, EList logFiles,
int logFileIndex) {
/* Traverse through each of the log records and make the
* necessary correlations */
for (int j = 0; j < recordList.size(); j++) {
/* Make the correlation for the j-th log record */
setPartners(recordList.get(j), logFiles, logFileIndex);
} // End of for-loop
} // End of makeCorrelations (EList)
/*
* A helper method to 'makeCorrelations (EList, EList, int)' used to set
* the partners
*
* @param logRec - The log record for which partners are set
* logFiles - 'logRecord' belongs to a log file listed under this
* parameter
* logFileIndex - The index of 'logFiles' identifying the log file that
* logRecord belongs to
*/
private void setPartners(Object logRec, EList logFiles, int logFileIndex) {
/* The log records are mapped to a Common Base Event */
CBECommonBaseEvent logRecord = (CBECommonBaseEvent) logRec;
/* The list of records that will be checked */
EList recordList = null;
/* The correlators list and the correlated records of 'logRecord' */
EList correlators = null;
/* Get the creation time of the passed log record, without the milliseconds */
double recordCreationTime = getTimeWithoutMillis(logRecord);
/* Traverse through the logFiles (starting from index 'logFileIndex') and
* make the proper correlations*/
for (int i = logFileIndex; i < logFiles.size(); i++) {
/* The record list of the i-th logFile */
recordList = ((RecordList) logFiles.get(i)).getList();
/* Traverse through the record list of the i-th log file and make the
* proper correlations */
if (recordList != null) {
for (int j = 0; j < recordList.size(); j++) {
if (recordList.get(j) != null) {
if (recordList.get(j) != logRec) {
if(getTimeWithoutMillis(recordList.get(j))
== recordCreationTime) {
addCorrelation((CBECommonBaseEvent)
logRec, (CBECommonBaseEvent)
recordList.get(j));
}
}
}
} // End of the j-th loop
}
} // End of the i-th loop
} // End of setPartners (Object, EList, int)
private EList addCorrelation(CBECommonBaseEvent artifact,
CBECommonBaseEvent associatedEvent) {
EList correlations = (EList)
correlationContainer.getCorrelations().get(artifact);
if (correlations == null) {
correlations = new BasicEList();
correlations.add(associatedEvent);
correlationContainer.getCorrelations().put(artifact, correlations);
}
else {
correlations.add(associatedEvent);
}
return correlations;
}
/* Returns the creation time associated with a log record after removing the
* milli-seconds
*/
private double getTimeWithoutMillis(Object logRecord) {
double time = ((CBECommonBaseEvent) logRecord).getCreationTime();
/* cancel the micro-seconds as the CBE event can store the time up to
* micro-second precision
*/
long millis = ((long)time)/1000;
/*
* cancel the milli-seconds by dividing the milliseconds with 1000 and multiplying
* by 1000
*/
millis /= 1000;
millis *= 1000;
return millis;
}
} // End of MyCorEngine class
The MyCorEngine class implements the mandatory correlate method of the ILogRecordCorrelationEngine interface. Attributes of the correlation engine, such as its name and type, are set through the CorrelationContainerProxy object passed to the correlate method. Each element of the logFiles list is a RecordList whose getList method returns an EList holding the Common Base Events of one log imported into the LTA. The helper methods in the MyCorEngine class compare every record of one log with every record of the other logs. The comparison checks the equality of the time stamps, which are retrieved from the Common Base Events with the helper method getTimeWithoutMillis. The correlation finds the one-to-many relationship between an individual event of one log and the events in all the other logs. This is depicted in Figure 9:
Figure 9. Correlation between the log records
The rectangles in the figure indicate the Common Base Events of the logs, and bi-directional arrows indicate the events that are compared to check if the time stamp is the same. The relationship between the events is also checked within the single log file. In this case, every event of the log file is compared with every other event in the same log file.
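Comparing every event with every other event means the work grows quadratically with the number of records. An alternative technique, not the one used in Listing 5, is to group the events by their truncated time stamp so that events sharing a group are partners. The following is a minimal sketch using plain Java collections instead of the Hyades classes; LogEvent is a hypothetical stand-in for a Common Base Event, and creation times are assumed to be in microseconds, as the comments in Listing 5 suggest.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class SecondBuckets {
    /** Hypothetical stand-in for a Common Base Event. */
    static final class LogEvent {
        final String label;
        final long creationTimeMicros;

        LogEvent(String label, long creationTimeMicros) {
            this.label = label;
            this.creationTimeMicros = creationTimeMicros;
        }
    }

    /** Drops everything below one second from a microsecond time stamp. */
    static long toWholeSeconds(long micros) {
        return micros / 1_000_000L;
    }

    /** Groups events by whole second; events in the same group are partners. */
    static Map<Long, List<LogEvent>> groupBySecond(List<LogEvent> events) {
        Map<Long, List<LogEvent>> buckets = new HashMap<>();
        for (LogEvent event : events) {
            long key = toWholeSeconds(event.creationTimeMicros);
            buckets.computeIfAbsent(key, k -> new ArrayList<>()).add(event);
        }
        return buckets;
    }

    public static void main(String[] args) {
        long base = 1_091_707_673L * 1_000_000L; // an arbitrary whole second, in microseconds
        List<LogEvent> events = new ArrayList<>();
        events.add(new LogEvent("activity log record", base + 567_000L));
        events.add(new LogEvent("db2diag record", base + 133_000L));
        events.add(new LogEvent("later record", base + 1_000_005L));

        for (Map.Entry<Long, List<LogEvent>> entry : groupBySecond(events).entrySet()) {
            System.out.println(entry.getKey() + " -> " + entry.getValue().size() + " event(s)");
        }
    }
}
```

Unlike Listing 5, this sketch returns whole seconds rather than a millisecond value with the last three digits zeroed, and it does not record the partners in a CorrelationContainer; it only shows the grouping step.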
- As you did for MyCorEngine in the previous step, create a new class, SimpleParserFilter, and replace the content of SimpleParserFilter.java with the code shown in Listing 6.
Listing 6. Content of SimpleParserFilter.java file
import org.eclipse.emf.common.util.EList;
import org.eclipse.hyades.logc.extensions.ILogRecordFilter;
/**
* The following class is a filtration for the Simple Parser log file.
*/
public class SimpleParserFilter implements ILogRecordFilter
{
/* Does not perform any filtration. It returns the list 'as is'. */
public EList filter(EList list) {
return list;
}
}
The EList object passed to the filter method of SimpleParserFilter holds the CommonBaseEvent records. Only the records that are returned by this method are considered for correlation. Therefore, you can define the logic for the filtration of the records in this method.
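As an illustration of what such filtering logic might look like, the following sketch keeps only the records whose message ID starts with a given prefix, such as ADMR. It uses java.util.List and a hypothetical LogEntry type in place of EList and the Common Base Event classes, so it shows the shape of the logic rather than code for the LTA.

```java
import java.util.ArrayList;
import java.util.List;

public class PrefixFilterSketch {
    /** Hypothetical stand-in for a log record with a message ID. */
    static final class LogEntry {
        final String msgId;
        final String text;

        LogEntry(String msgId, String text) {
            this.msgId = msgId;
            this.text = text;
        }
    }

    /** Returns a new list holding only the entries whose msgId starts with the prefix. */
    static List<LogEntry> keepByPrefix(List<LogEntry> entries, String prefix) {
        List<LogEntry> kept = new ArrayList<>();
        for (LogEntry entry : entries) {
            if (entry.msgId != null && entry.msgId.startsWith(prefix)) {
                kept.add(entry);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        // Sample message IDs are made up for illustration.
        List<LogEntry> entries = new ArrayList<>();
        entries.add(new LogEntry("ADMR0001I", "first sample message"));
        entries.add(new LogEntry("DSRA8201W", "second sample message"));
        entries.add(new LogEntry("ADMR0002E", "third sample message"));

        System.out.println(keepByPrefix(entries, "ADMR").size() + " of "
                + entries.size() + " entries kept");
    }
}
```

Returning a new list leaves the caller's list untouched; whether the LTA expects the original list to be modified in place is not covered here.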
Running the correlation engine
To run the correlation engine created in the previous section, switch to the Java perspective. To switch to the Java perspective, select Open Perspective > Other…. From the perspectives list, select Java and click OK.
From the Run menu select Run-as > Run-time Workbench to build and execute the projects existing in the workspace. Switch to the Profiling and Logging perspective in the new workbench that is opened and import the activity.log and db2diag.log as given in the following steps:
- Select File > Import to open the import window. Select Log File from the various types of sources that can be imported and click Next.
Figure 10. Import wizard
- Click Add to open the Add Log File window and select the log type IBM WebSphere Application Server activity log from the log files list.
Figure 11. Log files page of Import Log wizard
- From the Details tab, browse for the activity log generated by WebSphere Application Server by clicking the Browse button of the Activity log file path field.
- Similarly, browse for the path of the IBM WebSphere Application Server installation folder for the next field.
Figure 12. Add log file window
- Click OK to add the entry to the Selected log files: list.
- Follow steps 2 and 3 to add the IBM DB2 diagnostic log file to the Selected log files: list.
Figure 13. Log files page of Import Log wizard with logs selected
- Click Finish to start import of the logs.
- Select the Hosts from the Log Navigator bar, as shown in Figure 14.
Figure 14. Enabling hosts in profiling and logging view
- Right click the host name from the tree and select Open With > Log Interactions from the context menu.
Figure 15. Selecting Log Interactions
- Select the TimeStamp w/o millis Correlation Engine from the correlation schemas list and click OK to perform the correlation between all the imported logs.
Figure 16. Correlation schema selection window
Understanding the Sequence diagram
Figure 17 shows the sequence diagram that is generated as a result of the correlation.
Figure 17. Sequence diagram
Every shaded rectangle represents a Common Base Event. You can click a shaded rectangle to see the corresponding message of the event in the Properties view, or hover the cursor over it to see the message as a tool tip. The arrows in the figure indicate that a correlation exists between the records of the logs, as depicted in Figures 18, 19, and 20.
Figure 18. Correlation within log records
Figure 19. Correlation between two logs
Figure 20. Correlation with records out of the current page
Summary
Because a company's infrastructure might host a variety of products, each generating its own log and trace files to record activity and errors, correlating the log files is the basic step in the process of problem determination. Later releases of the LTA are expected to add more features that make problem determination easier and more user friendly.
Download
| Name | Size | Download method |
|---|---|---|
| ac-correlatesource | | HTTP |
Resources
About the author
Sampath K Chilukuri is a Staff Software Engineer at IBM focusing on the Log and Trace Analyzer for Autonomic Computing. He has more than three years of experience in the development of applications that gather information from the log files generated by various products. He holds a bachelor's degree in Electronics and Communication Engineering from JNTU, Hyderabad, India. He can be reached at sampath.kumar@in.ibm.com