Level: Intermediate
Hari H. Krishna (harkrish@in.ibm.com), Software Engineer, IBM
09 Aug 2005
The Generic Log Adapter converts logs from their native,
product-specific log format to the Common Base Event format.
The process of conversion can be done either by using a
rules-based adapter, or by using a static adapter. Learn
how to choose the appropriate approach based on the
characteristics of the log entries. Compare your log format
to the samples here and get tips for creating the adapter
that is best suited to your case.
Introduction
The Generic Log Adapter (GLA) is a tool included in the
IBM Autonomic Computing Toolkit to convert data from
different data sources with many different
product-specific log formats to the IBM Common Base
Event format. The generated Common Base Events can be
consumed by various monitoring tools for further
analysis. (See Resources for a
link to download the Generic Log Adapter.)
The Generic Log Adapter provides different approaches to
perform the conversion to the Common Base Event format.
The GLA takes configuration files known as adapters as
input. There are two different types of adapters defined
in the GLA: rules-based adapters and static adapters.
Rules-based adapters use regular expressions to map
product-specific log formats to corresponding Common
Base Event fields; static adapters use Java™
classes to perform the log conversion.
Often it becomes difficult to decide on the type of
adapter to use. This article explains the different
scenarios to be considered while selecting the type of
adapter. To get the most from this article, you should
have a solid understanding of general autonomic
computing principles and working knowledge of the Log
and Trace Analyzer and the Generic Log Adapter.
Importance of log conversion with respect
to autonomic computing
Autonomic computing is a set of technologies and tools
that enables applications, systems, and entire networks
to become more self-managing. Self-management involves
four characteristics: self-configure, self-heal,
self-optimize, and self-protect, which are often
referred to as "self-CHOP" characteristics.
Developers use the phrase problem determination to
describe the process of finding the root cause of a
problem. This involves communication between different
autonomic components. Autonomic problem determination
is no different: the aim is to keep information on what
went wrong so that the root cause can be detected.
The Common Base Event format is a logging and tracing
format that addresses the complexity of problem
determination in multicomponent systems, in particular the
complexity of communication between autonomic components.
Using the Common Base Event format allows system status
from multiple data sources to be correlated. Further
information on the Common Base Event can be found in Resources.
The GLA adapter file is typically an XML file that
contains a set of rules for extracting specific data.
IBM provides user interface (UI) support for developing
adapters in the Log and Trace Analyzer. An adapter file
contains different components:
- Context component
- The adapter configuration file consists of a number
of contexts. Each context in the adapter
configuration file consists of a series of
components that describe the conversion rules for
the associated log file. It also contains other
information needed to run the Generic Log Adapter:
the outputter class, the sensor class, the input log
file, and output information, such as the file name
and the directory where the file exists. Each context runs
as a separate thread independent of the other
contexts in the same adapter configuration file. A
typical structure of components can be seen in
Figure 1:
Figure 1. Context component view
- Sensor
- Defines the mechanism that reads the log content to
be processed.
- Extractor
- Provides a mechanism to receive the lines from the
sensor and separate the event messages. Simply put,
it defines the rules to recognize the message boundaries.
- Parser
- Defines a set of string mappings to convert the
message received from the extractor to Common Base
Event entries.
- Formatter
- Takes attributes and their values from the parser
and then creates the Common Base Event Java object instance.
- Outputter
- Provides a way to wrap the formatted Java object
provided by the formatter in a form suitable for storing
Different approaches to converting a
product log to a Common Base Event
The main building block of problem determination is the
conversion of a product log from a product-specific log
format to a common logging format, the Common Base
Event format. With the Generic Log Adapter, this
conversion can be done in two different ways: a
rules-based adapter approach and a static adapter
approach. The following sections describe these
approaches in detail using some sample logs generated by
IBM products. For each sample log file, I include the
following information:
- Description of the log format: This
subsection explains the log format and the
fields in the log that can be mapped to Common
Base Event fields. It also gives an overview of
the approaches that can be adopted for the log
conversion, explains the pros and cons of each
approach and, finally, recommends an optimal approach.
- How to convert to a Common Base Event
format: This subsection gives tips on how to
implement the recommended approach by providing
code snippets for converting a few of the
complicated fields to Common Base Event fields.
Rules-based adapter approach
Often, log files can be converted from a specific log
format to the Common Base Event format using a
rules-based approach. For this method, the Generic Log
Adapter uses rules to parse the log messages. These
rules are typical Java regular expressions. Normally, a
specific pattern is written to extract a piece of data
from unstructured, raw log messages and events.
To determine whether this approach would work for a
particular log file, consider these suggested criteria:
- The log has a clear start or end pattern (or
both) for each record so that this can be
provided to the extractor component of the GLA
to extract individual records.
- Each and every field in the record that needs to
be mapped to a Common Base Event should be
either separated by a delimiter or should be in
the form of an easily extractable key value pair.
- If the log entries do not meet the two criteria
above, their fields should at least be
extractable using rules.
Logs generated by most products are in a
simple and readable format. These logs contain direct
information that can be converted into Common Base
Event form without much complication. The sample logs
in Listings 1 and 2 are good examples of
a simple log format.
IBM HTTP Server access log sample
The sample log in Listing 1 is generated by an IBM HTTP
Server access log.
Listing 1. IBM HTTP Server
access log sample
9.26.157.44 - - [13/Jan/2003:11:44:21 -0500] "GET /WSsamples HTTP/1.1" 302 0
9.26.157.44 - - [13/Jan/2003:11:44:21 -0500] "GET /WSsamples/ HTTP/1.1" 302 550
9.26.157.44 - - [13/Jan/2003:11:44:21 -0500] "GET /WSsamples/en/index.html HTTP/1.1" 200 1127
9.26.157.44 - - [13/Jan/2003:11:44:21 -0500] "GET /WSsamples/en/Menu/Title.html HTTP/1.1" 200 1570
9.26.157.44 - - [13/Jan/2003:11:44:21 -0500] "GET /WSsamples/en/Menu/SamplesIntro.html HTTP/1.1" 200 2966
9.26.157.44 - - [13/Jan/2003:11:44:21 -0500] "GET /SamplesGallery/GalleryMenu HTTP/1.1" 200 5258
9.26.157.44 - - [13/Jan/2003:11:44:21 -0500] "GET /WSsamples/en/SamplesMaster.css HTTP/1.1" 200 11600
Description of the log format: The sample log in
Listing 1 is an example of a log with each entry or log
record fully contained on one line. The log records
contain clear record separation, the fields in the log
records have clear demarcation (blank space character),
and they can be extracted using regular expressions. In
addition, each record has the timestamp when the record
was created (that is, the Common Base Event
"creationTime") and it is easily extractable.
In this case, because the log format is simple and all
the log fields have clear demarcation, both the
rules-based adapter approach and the static adapter
approach (explained in detail in later sections) are applicable.
The rules-based adapter would be the best solution for
this type of log. In other words, if the log entries are
simple and can be split using simple rules (regular
expressions), then the rules-based adapter should be
used to convert the logs from a proprietary form to a
Common Base Event form. It is always recommended to
first investigate using the rules-based approach because
it's simpler to understand and maintain. You can easily
see and understand the mapping between product-specific
log fields and Common Base Event properties while
building the rules, and it's simpler to modify according
to your requirements using the GLA Rule editor.
For example, consider a situation where a field in
the log is mapped to a Common Base Event sub-element
called ExtendedDataElement,
and the end user wants to change the mapping of that
field to a ContextDataElement. This type of
alteration is possible in rules-based adapters, whereas
in the case of Java parsers the end user commonly does not
have access to the Java source files and, therefore,
cannot change them according to his requirements.
Another advantage of rules-based adapters is that they
require less language proficiency: a user who is using the
adapter does not need any language-specific
skills to modify the mapping of elements, other than
regular expression skills.
How to convert to a Common Base Event format:
Every log record in the sample is contained on a single
line, and all the fields in the record are
separated by a blank space. Because every record starts
with an IP address, the start pattern could be ^(\d{1,3}.\d{1,3}.\d{1,3}.\d{1,3})|(\s+).
Because all the fields in a record are separated by a
blank space, it would be a good idea to have \s+ as the value for a separator
token. Other fields in the log can now be easily
extracted with simple rules. For example, to extract the
creation time from the record, you can use the rules
shown in Figure 2.
Figure 2. Rule to extract creation
time for IBM HTTP Server access log
Note: More information on the Separator Token can
be found in the article called "High-performance
rule writing for the Generic Log Adapter" (see Resources).
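To illustrate what such an extraction rule does, the following minimal Java sketch pulls the bracketed timestamp out of one record from Listing 1 and parses it. It runs outside the GLA and is not the rule shown in the figure; the class name AccessLogTime, the method creationTime, and the exact pattern (which escapes the dots in the IP address) are assumptions made for this example.

```java
import java.time.OffsetDateTime;
import java.time.format.DateTimeFormatter;
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class AccessLogTime {
    // Hypothetical pattern: an IP address at the start of the record, two
    // blank-separated fields, then the bracketed field that holds the timestamp.
    private static final Pattern RECORD = Pattern.compile(
            "^(\\d{1,3}\\.\\d{1,3}\\.\\d{1,3}\\.\\d{1,3})\\s+\\S+\\s+\\S+\\s+\\[([^\\]]+)\\]");

    // Layout of the timestamp in the sample, for example 13/Jan/2003:11:44:21 -0500.
    private static final DateTimeFormatter ACCESS_TIME =
            DateTimeFormatter.ofPattern("dd/MMM/yyyy:HH:mm:ss Z", Locale.ENGLISH);

    /** Returns the creation time of one record, or null if the line does not match. */
    static OffsetDateTime creationTime(String record) {
        Matcher m = RECORD.matcher(record);
        if (!m.find()) {
            return null;
        }
        return OffsetDateTime.parse(m.group(2), ACCESS_TIME);
    }

    public static void main(String[] args) {
        String line = "9.26.157.44 - - [13/Jan/2003:11:44:21 -0500] "
                + "\"GET /WSsamples HTTP/1.1\" 302 0";
        System.out.println(creationTime(line));
    }
}
```

In a rules-based adapter the same work is expressed as a rule in the adapter file rather than as Java code; the sketch only shows that a single pattern is enough to isolate the field.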
IBM WebSphere Application
Server activity log
The log sample in Listing 2 was taken from an IBM
WebSphere® Application Server log.
Listing 2. IBM WebSphere
Application Server activity log sample
---------------------------------------------------------------
ComponentId: Application Server
ProcessId: 2676
ThreadId: 6a61d29f
SourceId: com.ibm.ws.management.connector.soap.JMXSoapAdapter
ClassName:
MethodName:
Manufacturer: IBM
Product: WebSphere
Version: Platform 5.0 [BASE 5.0.0 s0245.03]
ServerName: ninjazx7\ninjazx7\server1\
TimeStamp: 2003-01-16 10:53:10.445000000
UnitOfWork:
Severity: 3
Category: AUDIT
PrimaryMessage: ADMC0013I: SOAP connector available at port 8880
ExtendedMessage:
---------------------------------------------------------------
ComponentId: Application Server
ProcessId: 2676
ThreadId: 6a61d29f
SourceId: com.ibm.ws.messaging.JMSEmbeddedProviderImpl
ClassName:
MethodName:
Manufacturer: IBM
Product: WebSphere
Version: Platform 5.0 [BASE 5.0.0 s0245.03]
ServerName: ninjazx7\ninjazx7\server1
TimeStamp: 2003-01-16 10:53:11.286000000
UnitOfWork:
Severity: 3
Category: AUDIT
PrimaryMessage: MSGS0050I: Starting the Queue Manager
ExtendedMessage:
--------------------------------------------------------------
Description of the log format: The sample log
in Listing 2 is an example of a simple log format with
multiline records. In this case also, the log
records contain clear record separation. All the fields
in the sample log have clear demarcation and are logged
in the form of key:value pairs. Both the static adapter
approach and the rules-based adapter approach are
applicable. But, considering maintainability and
ease of understanding, it would be better to implement a
rules-based adapter for this log also.
How to convert to a Common Base Event format: The
log displayed in Listing 2 is a multiline record sample.
The record can be separated by using -{15,} as the Start pattern. It
would be a good idea to put \s+ as the value for a separator
token and : as the value for
the designation token. Using a separator token makes it
easier to extract the other fields using the $h('') function. For example, you
can use the rule displayed in Figure 3 to retrieve the
creation time.
Figure 3. Rule to extract creation
time for IBM WebSphere Application Server activity log
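The effect of the start pattern and the designation token can be sketched in plain Java. The example below is a rough illustration, not GLA code: it assumes that each key of a record sits on its own line, splits the log on -{15,}, and treats the first colon on a line as the designation token. The class name ActivityLogRecords and the method parse are hypothetical.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Pattern;

final class ActivityLogRecords {
    // Same start pattern as in the text: a run of 15 or more dashes.
    private static final Pattern START = Pattern.compile("-{15,}");

    /** Splits the log into records and maps every "Key: value" line of a record. */
    static List<Map<String, String>> parse(String log) {
        List<Map<String, String>> records = new ArrayList<>();
        for (String block : START.split(log)) {
            Map<String, String> fields = new LinkedHashMap<>();
            for (String line : block.split("\\R")) {
                int colon = line.indexOf(':'); // first colon acts as designation token
                if (colon > 0) {
                    fields.put(line.substring(0, colon).trim(),
                               line.substring(colon + 1).trim());
                }
            }
            if (!fields.isEmpty()) {
                records.add(fields);
            }
        }
        return records;
    }

    public static void main(String[] args) {
        String log = String.join("\n",
                "---------------------------------------------------------------",
                "ComponentId: Application Server",
                "ProcessId: 2676",
                "TimeStamp: 2003-01-16 10:53:10.445000000",
                "PrimaryMessage: ADMC0013I: SOAP connector available at port 8880",
                "---------------------------------------------------------------");
        System.out.println(parse(log).get(0).get("TimeStamp"));
    }
}
```

Because only the first colon is used, values that themselves contain colons, such as the time stamp or the primary message, stay intact.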
You can also use custom Java class callouts with the
rules-based approach for converting unstructured data
and events to Common Base Events. This feature further
assists with customizing your regular expression rules
using a custom Java class, as described in the next section.
Using custom Java class callouts
When you combine the use of custom Java class callouts
with a rules-based approach, you can use regular
expressions together with Java code for situations where
regular expressions are not sufficient. One scenario is
to dynamically substitute a value for a Common Base
Event property.
Some products generate logs that contain only part of the
information that is needed to convert the log entries
into Common Base Events. Extra data is required to
complete the conversion. Take a look at the sample AIX Syslog.
AIX Syslog sample log
The sample in Listing 3 is from an AIX® Syslog. The
logged data is simple and straightforward. The log
fields are separated by blank spaces.
Listing 3. AIX Syslog sample
May 2 15:51:15 dlfssrv syslogd: restart
May 2 15:51:15 dlfssrv syslogd: restart
May 2 15:51:15 dlfssrv unix: dlfs_mount entered..
May 2 15:51:15 dlfssrv unix: check_stubvp entered .....
May 2 15:51:15 dlfssrv unix: check_stubvp exited .....
May 2 15:51:15 dlfssrv unix: dlfs_change_vfsops entered....
May 2 15:51:15 dlfssrv unix: AMITA:1: dlfs_sync address:1D0FD0
May 2 15:51:15 dlfssrv unix: AMITA:2: dlfs_sync address:1D0FD0
May 2 15:51:15 dlfssrv unix: dlfs_change_vfsops exited....
Description of the log format: The log provided
in Listing 3 is also in a simple format. Log fields are
separated by a blank space, and the records are
contained on a single line. There is a clear demarcation
between each record. The rules-based adapter is
applicable in this case. However, the information
provided for the log creation time does not include a
year field. To supply the missing year, you can use either
the static adapter approach or custom Java class callouts.
With a static adapter, you can add as the default the
current year of the machine that the log files
reside on. The drawback is that you must code the
method of adding the year, and if you decide to change
the method, you must rebuild your entire static adapter
and rerun the entire regression test that goes with it.
If you use custom Java class callouts, you can use
regular expressions together with Java classes. The
optimal solution is to write an extension class that is
capable of substituting the default system year. If you
decide to change the method, you need only to replace
the specific class; you would not have to change your
entire set of parsing rules. Using the custom Java
class callout method is the simpler solution in cases
where regular expressions are sufficient for all but
a field or two.
How to convert to a Common Base Event format: The
AIX Syslog log format is similar to that of the IBM HTTP
Server access log, and the rules-based adapter can be
implemented in a similar pattern to that of the IBM HTTP
Server access log. The only variation would be the way
CreationTime is
extracted. Because the time provided in the AIX Syslog
is not complete, a substitution
extension class can be developed for adding the
default year, as shown in Figure 4. More detailed
information on creating a substitution extension class
can be found in the article "Using Java class
callouts with the Generic Log Adapter" (see Resources for a link).
Figure 4. Rule to extract the creation
time for AIX Syslog
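The core of such a class, the step that fills in the missing year, might look like the following sketch. It deliberately leaves out the GLA callout interface, so it is not a complete substitution extension class; the class name SyslogYearDefault and both method names are hypothetical, and the timestamp layout "MMM d HH:mm:ss" is assumed from Listing 3.

```java
import java.time.LocalDateTime;
import java.time.Year;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeFormatterBuilder;
import java.time.temporal.ChronoField;
import java.util.Locale;

final class SyslogYearDefault {
    /** Parses a "MMM d HH:mm:ss" timestamp and fills in the given year. */
    static LocalDateTime withYear(String timestamp, int year) {
        DateTimeFormatter fmt = new DateTimeFormatterBuilder()
                .appendPattern("MMM d HH:mm:ss")
                .parseDefaulting(ChronoField.YEAR, year) // the log carries no year
                .toFormatter(Locale.ENGLISH);
        // Collapse runs of blanks so that padded days such as "May  2" also parse.
        return LocalDateTime.parse(timestamp.trim().replaceAll("\\s+", " "), fmt);
    }

    /** Variant that substitutes the current year of the local machine. */
    static LocalDateTime withSystemYear(String timestamp) {
        return withYear(timestamp, Year.now().getValue());
    }

    public static void main(String[] args) {
        System.out.println(withYear("May 2 15:51:15", 2005)); // 2005-05-02T15:51:15
    }
}
```

Keeping the year logic in one small class reflects the point made above: if the rule for choosing the year changes (for example, to handle logs that span a year boundary), only this class needs to be replaced.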
The Generic Log Adapter provides built-in functions to
facilitate the parsing process. Every Common Base Event
element can be assigned a default value using the
"use built-in function" option (more information
about this option can be
found in the Eclipse product help documentation). This
approach is similar to using a custom Java class
callout. It allows a combination of multiple rules to
be specified for a Common Base Event property, some of
which may invoke a Java method and the others of which
use a regular expression.
Static adapter approach
The Generic Log Adapter extends its functions by
providing support to create a custom static adapter. It
does this by providing a set of interfaces. Using a
static adapter completely eliminates the use of regular
expressions. In other words, it is not possible to use
regular expressions for some properties and Java code
for other properties. In this approach, all the parsing
and the assigning of values to the properties of Common
Base Events must be handled through Java code.
z/OS component log and syslog
Now, let's take a look at a sample z/OS® component
log and z/OS syslog to see where a rules-based adapter
would not be feasible. The log displayed in Listing 4 is
a sample z/OS component log. Normally, a z/OS component
log starts with a line that names the log format. Listing 4
shows sample output for three different formats: FULL
FORMAT, SHORT FORMAT, and TALLY REPORT. These formats
correspond to the logging level (detailed, summary, tally
report, and so on). The format line is followed by the
system name and the creation time.
Listing 4. z/OS component log sample
COMPONENT TRACE FULL FORMAT
SYSNAME(SY1)
COMP(SYSRSM)
**** 06/11/2004

SYSNAME MNEMONIC ENTRY ID TIME STAMP      DESCRIPTION
------- -------- -------- --------------- -------------

SY1     TRACEB   00000023 06:45:47.435823 Trace Buffer
  FUNC1... TRACE Trace
  JOBN1... *MASTER* ASID1... 0001 PLOCKS.. 00000000 CPU..... 0000
  JOBN2... *MASTER* ASID2... 0001 RLOCKS.. 00000000
  KEY..... 000F ADDR.... 022E0B28 ALET.... 00000000
    07061000 00000076 00000000 00000000 00000000 00000000
    00000000 022E0BF8 03000000 BB5A05F9 E832F44B
    00000000 00000000

SY1     RSEPAG   00000008 06:55:42.761695 Enqueue Pageable Frame
  FUNC1... VSMGTMN VSM Getmain Service
  JOBN1... JES2MON ASID1... 001A PLOCKS.. 88004001 CPU..... 0000
  JOBN2... JES2MON ASID2... 001A RLOCKS.. 88004000
  KEY..... 0036 ADDR.... 02368408 ALET.... 00000000
    1900
  KEY..... 0001 ADDR.... 00183780 ALET.... 01000002
    02274354 00047F80 81000000 07000000 0000001A 00000000
    005E8000 00000000 00000000 00000000 00000000
    00000000 00000000 00000000 00000000 00000000

COMPONENT TRACE SHORT FORMAT
SYSNAME(SY1)
COMP(SYSRSM)
**** 06/11/2004

SYSNAME MNEMONIC ENTRY ID TIME STAMP      DESCRIPTION
------- -------- -------- --------------- -------------

SY1     TRACEB   00000023 06:45:47.435823 Trace Buffer
SY1     RSEPAG   00000008 06:55:42.761695 Enqueue Pageable Frame

COMPONENT TRACE TALLY REPORT
SYSNAME(SY1)
COMP(SYSRSM)
TRACE ENTRY COUNTS AND AVERAGE INTERVALS (IN MICROSECONDS)

FMTID    COUNT       Interval     MNEMONIC DESCRIBE
-------- ----------- ------------ -------- --------------------------------
00000001         371       10,162 XEPENTRY External Entry Point Entry
00000002         372       10,134 XEPEXIT  External Entry Point Exit
00000003          54        1,088 FIX      Page Being Fixed
After the creation time, the next line of the log usually
contains a header line as shown below in Listing 5.
Listing 5. Sample header
structure of a z/OS component log
SYSNAME MNEMONIC ENTRY ID TIME STAMP      DESCRIPTION
The header line is followed by values for respective
header fields as displayed in Listing 6.
Listing 6. Corresponding
value line for header structure shown in Listing 5
SY1     TRACEB   00000023 06:45:47.435823 Trace Buffer
For these types of logs, it is not guaranteed that the
header fields remain constant. The position of each
header field may vary based on different factors. The line
after the header contains the values that correspond to the
header fields. The values line is followed by a set of
name-value pairs; for example, FUNC1... TRACE is a pair
with the name FUNC1 (FunctionName) and the value TRACE.
These pairs can appear in any order.
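Because the pairs can appear in any order, a parser has to recognize them by shape rather than by position. The following Java sketch shows one way to do that for a single line; it assumes that a name is followed by two or more dots, blanks, and a value without embedded blanks, which matches the sample in Listing 4 but is not a documented format. The class name TracePairs is hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class TracePairs {
    // Assumed shape: NAME, two or more dots, blanks, then a value without blanks.
    private static final Pattern PAIR =
            Pattern.compile("([A-Z][A-Z0-9]*)\\.{2,}\\s+(\\S+)");

    /** Collects the name-value pairs of one line in the order they appear. */
    static Map<String, String> parse(String line) {
        Map<String, String> pairs = new LinkedHashMap<>();
        Matcher m = PAIR.matcher(line);
        while (m.find()) {
            pairs.put(m.group(1), m.group(2));
        }
        return pairs;
    }

    public static void main(String[] args) {
        System.out.println(parse(
                "JOBN1... *MASTER* ASID1... 0001 PLOCKS.. 00000000 CPU..... 0000"));
    }
}
```

A name such as KEY can occur more than once in a record, so a real parser would need to keep the pairs per line or per group instead of merging a whole record into one map.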
There are approximately 20 different types of components
in z/OS. Each of the components generates a log in its
own customary form, with minor deviations from the
others. Other component trace records contain similar
types of name-value pairs, but with some different
names. The header line and its value may vary as well.
Description of the log format: The log records
have a clear demarcation. However, the elements in a
record are not guaranteed to appear in a particular
order. The name-value pairs in the sample log, which must
be mapped to Common Base Event fields, are dynamic;
they can also appear in any order. Therefore, this type
of log record format can be categorized as a complicated
log format. The order of the elements in the log is based
on the corresponding header information. One of the header
structures is shown in Listing 5, and its corresponding
values are shown in Listing 6.
It is not feasible to use a rules-based adapter here. The
log record shown in Listing 4 is complicated and
contains different headers. The values to be assigned to
Common Base Event properties depend on the headers. If all
the different header format combinations were known, it
would be possible to use a rules-based adapter
in combination with custom Java class callouts.
But, because the headers and corresponding values are
dynamic, it's better to use a static adapter.
How to convert to a Common Base Event format:
This article does not explain the routine operations
that need to be performed for adapter creation, such as
creating the adapter class (by extending the appropriate
base class) and initializing all the required elements
for a Common Base Event. For basic information on how to
create a static adapter, see the Eclipse product help
documentation. Apart from these routine operations,
there are some complicated tasks, which are explained here.
To identify the type of format, compare the first line of
the record against each known format name in a chain of
conditions. The code snippet for checking whether the
current line is the start of a record, and for identifying
and assigning the record type to the global log_format
variable, is shown in Listing 7.
Listing 7. Code snippet
to check start of the header
/**
 * Checks whether the current line is the start of a record and,
 * if so, identifies the type of record.
 *
 * @return true if the current line is the first line of a header,
 *         false otherwise
 */
public boolean reachedFirstLineOfHeader() {
    // SUMMARY_FORMAT = "COMPONENT TRACE SUMMARY FORMAT"
    if (currentLine.startsWith(SUMMARY_FORMAT)) {
        // Set the global log_format variable to Summary format log
        log_format = iSmmary_Format;
        return true;
    }
    // SHORT_FORMAT = "COMPONENT TRACE SHORT FORMAT"
    else if (currentLine.startsWith(SHORT_FORMAT)) {
        // Set the global log_format variable to Short format log
        return true;
    }
    // TALLY_REPORT = "COMPONENT TRACE TALLY REPORT"
    else if (currentLine.startsWith(TALLY_REPORT)) {
        // Set the global log_format variable to Tally report format
        return true;
    }
    return false;
}
Identify the parsing header. After the record
type is known, the next line contains information
on the system name, the component name, and the creation
time, which can be extracted using normal string
manipulation. The line after that is a header line (a
sample header line is displayed in Listing 5), which is
common to all the formats. The code snippet for parsing
the headers is shown in Listing 8.
Listing 8. Code snippet
to parse headers
/**
 * Parses the header names of a record.
 *
 * @param base   the string that contains the start points
 *               (the row of dashes under the header line)
 * @param hdrStr the header line to be parsed
 * @param sep    the separator token
 * @return 1 in case of success and < 1 in case of failure
 */
public int parseHeaders(String base, String hdrStr, String sep) {
    try {
        // Set the start point for each field; the start points are
        // stored in the variable called m_hStartIndexHash
        updateStartPoints(base);
        int headerCount = 0;
        String headerString;
        while (headerCount < m_hStartIndexHash.size()) {
            headerString = "";
            if (headerCount == (m_hStartIndexHash.size() - 1))
                headerString = hdrStr.substring(getValueAt(headerCount),
                        hdrStr.length());
            else
                headerString = hdrStr.substring(getValueAt(headerCount),
                        getValueAt(headerCount + 1));
            // trimAll is used to trim all the special characters and
            // blank spaces
            headerString = trimAll(headerString, " ");
            // Once the header is extracted, it is stored in the
            // variable called m_hHeaderHash
            if (!(headerString.equals("")))
                m_hHeaderHash.put("" + headerCount, headerString);
            headerCount++;
        }
    } catch (Exception e) {
        return -1;
    }
    return 1;
}
Though every field in the header has a clear separation
(blank space), you must determine the start point for
each field. For example, in the header displayed in
Listing 5, every header field is separated by blank
spaces, but that does not mean that every blank space is
followed by a new header field. For example, ENTRY ID and
TIME STAMP are each a single header field, even though
each name contains a blank space (the field separator).
You can use the code snippet displayed in Listing 9 to
determine the start point for each field. The start
point values are stored in a global variable called
m_hStartIndexHash.
Listing 9. Code snippet
to extract start points
/**
 * Determines the start point for every header field.
 *
 * @param base the string that contains the start points
 * @return 1 in case of success and < 1 in case of failure
 */
public int updateStartPoints(String base) {
    /*
     * 10 20 30 47
     * ------- -------- -------- --------------- -------------
     */
    try {
        int count = 0;
        int indexCount = 0;
        boolean stFlag = true;
        while (count < base.length()) {
            if ((base.charAt(count) == '-') && (stFlag)) {
                // First dash after a blank: a new field starts here
                m_hStartIndexHash.put("" + indexCount, "" + count);
                stFlag = false;
                indexCount++;
            } else if (base.charAt(count) == ' ') {
                stFlag = true;
            }
            count++;
        }
    } catch (Exception e) {
        return -1;
    }
    return 1;
}
After the start points and headers are extracted from the
log record, the next step is to extract the
corresponding values from the value string. A sample
value line is displayed in Listing 6. Code for parsing
the values is displayed in Listing 10.
Listing 10. Code snippet
to parse values
/**
 * Parses the values of a record.
 *
 * @param str the value line to be parsed
 * @return 1 in case of success and < 1 in case of failure
 */
public int parseValues(String str) {
    try {
        // Nothing to do if no headers are known or the line is too
        // short to hold a value for the last header
        if (m_hHeaderHash.size() < 1)
            return 1;
        if (getValueAt(m_hHeaderHash.size() - 1) > str.length())
            return 1;
        int valuesCount = 0;
        String valueString;
        while (valuesCount < m_hStartIndexHash.size()) {
            valueString = "";
            if (valuesCount == (m_hStartIndexHash.size() - 1)) {
                valueString = str.substring(getValueAt(valuesCount),
                        str.length());
            } else {
                valueString = str.substring(getValueAt(valuesCount),
                        getValueAt(valuesCount + 1));
            }
            valueString = trimAll(valueString, " ");
            if (!(valueString.equals(""))) {
                m_hValuesHash.put("" + valuesCount, valueString);
            }
            valuesCount++;
        }
    } catch (Exception e) {
        PrintOnConsole("CTRACE BACE 00003# " + e);
        return -1;
    }
    return 1;
}
You also need code for assigning the creation time and
code for creating a Common Base Event variable with
default values assigned for required elements. In
addition, you need a while
loop for traversing the whole log.
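The interplay of start points, header parsing, and value parsing can be tried out in isolation with the compact sketch below. It is a simplified restatement of the technique under the assumption that the columns are aligned as in Listings 5 and 6; it uses plain lists instead of the m_h... hash variables, and the class name ColumnSlicer and its methods are hypothetical.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class ColumnSlicer {
    /** Returns the start index of every run of dashes in the underline row. */
    static List<Integer> startPoints(String underline) {
        List<Integer> starts = new ArrayList<>();
        boolean inGap = true;
        for (int i = 0; i < underline.length(); i++) {
            char c = underline.charAt(i);
            if (c == '-' && inGap) {
                starts.add(i);
                inGap = false;
            } else if (c == ' ') {
                inGap = true;
            }
        }
        return starts;
    }

    /** Cuts a header or value line at the start points and trims each piece. */
    static List<String> slice(String line, List<Integer> starts) {
        List<String> cells = new ArrayList<>();
        for (int i = 0; i < starts.size(); i++) {
            int from = Math.min(starts.get(i), line.length());
            int to = (i + 1 < starts.size())
                    ? Math.min(starts.get(i + 1), line.length())
                    : line.length();
            cells.add(line.substring(from, to).trim());
        }
        return cells;
    }

    public static void main(String[] args) {
        String header    = "SYSNAME MNEMONIC ENTRY ID TIME STAMP      DESCRIPTION";
        String underline = "------- -------- -------- --------------- -------------";
        String values    = "SY1     TRACEB   00000023 06:45:47.435823 Trace Buffer";

        List<Integer> starts = startPoints(underline);
        List<String> names = slice(header, starts);
        List<String> cells = slice(values, starts);

        // Pair each header name with the value found in the same column.
        Map<String, String> record = new LinkedHashMap<>();
        for (int i = 0; i < names.size(); i++) {
            record.put(names.get(i), cells.get(i));
        }
        System.out.println(record);
    }
}
```

Slicing by column position keeps ENTRY ID and TIME STAMP together as single header names, which splitting on blanks would not.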
In product logs that contain header lines and
corresponding value lines, it's possible to use a
rules-based adapter approach, if all the header element
sequences are known. For some product logs, it's
impossible to develop a rules-based adapter approach to
convert from a specific log format to a Common Base
Event format. The following product log is an example of
where a rules-based adapter approach cannot be used.
z/OS Syslog sample
The log displayed in Listing 11 is a sample log generated
by z/OS Syslog. This sample log illustrates split
messages. Logs for other products, such as the IBM
DB2® diagnostic logs, use the split message form.
The record type in a z/OS Syslog can be identified by
the first character of the record line. If the line
starts with the character "M," it
indicates the start of a multiline record. If the
line starts with the character "D," it is a
continuation of a multiline record.
If the line starts with the character "E," it is
the end of a multiline record. The log
contains a lot of information that is not required in
this context. Only the format information required to
explain the present scenario is explained here. You can
ignore the rest.
Listing 11. z/OS Syslog
sample
M 0000000 SY1 04098 08:39:29.74 00000290 IEA007I STATIC SYSTEM SYMBOL VALUES 060
D 060 00000290 &SYSNAME. = "SY1"
D 060 00000290 &SYSPLEX. = "LOCAL"
D 060 00000290 &DUMPQUAL. = "ALPS"
M 4040000 SY1 04098 08:39:29.94 00000290 ILR032I PAGE DATA SET HAS BEEN USED BY ANOTHER SYSTEM: 062
D 062 00000290 DATA SET NAME - SYS1.PLPA.PAGCOM
D 060 00000290 &SYSNUM. = "1"
E 060 00000290 &SYSR3. = "ZDR16Y"
D 062 00000290 SYSTEM NAME - SY1
D 062 00000290 VM USERID - SBAILEY
E 062 00000290 DATA SET LAST UPDATED AT 06:02:59 ON 04/03/2004 (GMT)
Description of the log format: The log displayed
in Listing 11 is a multiline record sample; a multiline
record can be identified by the start character
"M." A multiline record in z/OS Syslog starts
with a control line that carries the multiline
ID at the end of its text. (For example, the first line
displayed in Listing 11 ends with the multiline ID 060.)
Subsequent lines carry the same connection ID near the
start of the line. The ID is needed to link these lines
with the original message. This type of message format
can be termed a split message format. The lines of a
split message are also marked with a code in the first
column, and their text belongs to the record that was
started earlier with the same ID.
There are two split messages in this example.
"D" stands for "data line" and
indicates that it is part of a multiline message.
"E" stands for "end line" and
indicates that it is the last line of the message. Both
D and E lines contain the connection IDs 060 and
062, which you can use to find out which multiline
message they belong to.
In the case of the z/OS sample log, the complete record
is not logged at once; the log record is available in
parts. While performing the log conversion from a
product-specific form to a Common Base Event form, it is
necessary to hold the record until its end part is
encountered. Because holding the record in memory is not
possible when using a rules-based adapter approach, the
only option is to use a static adapter approach.
How to convert to a Common Base Event format: The
implementation in this case involves some normal tasks,
such as creating the Common Base Event and assigning the
default values of the required elements. The complicated
task is the logic to identify the start of record, to
hold the records in memory, and to transform the
respective log records to a Common Base Event record
when the end of the record is encountered.
The code snippet displayed in Listing 12 depicts the
logic for identifying the type of record based on the
first character of the record. The code includes the
logic for further processing based on the line type. If
the first character of the line is "D," the
connection ID of the respective line is extracted, and
the line is appended to the corresponding line in
multiline hash, which stores all the multiline messages.
A starting character of "E" indicates the end
of the message and, therefore, the complete message with
the respective connection ID is extracted and sent for
further processing. Finally, a starting character
"M" indicates the start of a new multiline
message and, therefore, the respective line is stored in
a multiline hash with the connection ID as the key. The
following code shows how the messages are processed.
Listing 12. Code snippet
for processing messages
/**
 * Processes a log line based on the first character of the line.
 *
 * @param currentLine the line to be parsed
 */
public void processMessage(String currentLine) {
    char firstChar = currentLine.charAt(0);
    switch (firstChar) {
    case 'D':
    case 'E':
        try {
            /*
             * Method used to extract the connection ID from the
             * current line. If the input of the function is the
             * line mentioned below, then the value of the
             * connection ID would be 060.
             * ----------------------------------------------------
             * D 060 00000290 &SYSNAME. = "SY1"
             * ----------------------------------------------------
             */
            m_strconnID = getConnectionId(currentLine);
            /*
             * m_oMultiLine_Hash stores the complete record as value
             * with the corresponding connection ID as key. When a
             * remaining part of the message is encountered, it has
             * to be appended to the existing original message for
             * further processing.
             */
            String mainMessageLine =
                    (String) m_oMultiLine_Hash.get(m_strconnID);
            mainMessageLine += currentLine;
            if (firstChar == 'E') {
                // 'E' indicates the end of the multiline message;
                // hence process the complete message and add it to
                // the Common Base Event array
                CommonBaseEvent event =
                        processCommonBaseEvent(mainMessageLine);
                messages[arrayIndex] = event;
                m_oMultiLine_Hash.remove(m_strconnID);
            } else {
                // Append the current line to the existing multiline
                // message and update m_oMultiLine_Hash with the most
                // recent message
                appendExistingLine(currentLine);
            }
        } catch (Exception ex) {
            // Ignore lines that cannot be linked to a stored message
        }
        break;
    case 'M':
        /*
         * 'M' indicates the start of a multiline message. Hence
         * create a new key-value pair with the connection ID as key
         * and the current line as value, and add it to
         * m_oMultiLine_Hash. The logic for this has to be
         * implemented in the function below.
         */
        createMultiLineMessage(currentLine);
        break;
    default:
        // Not part of a multiline message; return and read the
        // next line
        break;
    }
}
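The buffering idea can also be tried out in isolation. The following self-contained Java sketch holds each split message in a map until its "E" line arrives. It is an illustration only: it collects plain strings instead of creating Common Base Events, and it assumes that the connection ID is the last token of an "M" line and the second token of a "D" or "E" line, as in Listing 11. The class name SplitMessageAssembler and its methods are hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class SplitMessageAssembler {
    // Messages that have started but whose "E" line has not been seen yet.
    private final Map<String, StringBuilder> pending = new HashMap<>();
    private final List<String> completed = new ArrayList<>();

    /** Feeds one log line; a message is completed when its "E" line arrives. */
    void accept(String line) {
        if (line.isEmpty()) {
            return;
        }
        String[] tokens = line.trim().split("\\s+");
        switch (line.charAt(0)) {
            case 'M': {
                // Assumption: the connection ID is the last token of the control line.
                pending.put(tokens[tokens.length - 1], new StringBuilder(line.trim()));
                break;
            }
            case 'D':
            case 'E': {
                // Assumption: the connection ID is the second token of the line.
                if (tokens.length < 2) {
                    return;
                }
                StringBuilder message = pending.get(tokens[1]);
                if (message == null) {
                    return; // no matching "M" line was seen
                }
                message.append('\n').append(line.trim());
                if (line.charAt(0) == 'E') {
                    completed.add(message.toString());
                    pending.remove(tokens[1]);
                }
                break;
            }
            default:
                break; // not part of a multiline message
        }
    }

    /** Returns the completed messages in the order in which they ended. */
    List<String> completed() {
        return completed;
    }
}
```

Keying the buffer by connection ID is what lets the two interleaved messages of Listing 11 (IDs 060 and 062) be rebuilt independently.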
Conclusion
The conversion of a product log to a Common Base Event
format is possible in several different ways. Always
investigate using rules-based adapters first. Choose
static adapters when your log entries are unstructured
and have no simple detectable patterns, eye-catchers, or
delimiters to locate and extract data.
About the author
Hari H. Krishna is a Software Engineer at IBM focusing
on the Log and Trace Analyzer for Autonomic Computing.
He has more than five years of experience in the
development of log analysis and reporting based
applications. He holds a Master's degree in Computer
Science from Osmania University, Hyderabad, India. He
can be reached at harkrish@in.ibm.com.