Create a static adapter for use with the Generic Log Adapter

Choose when to use static versus rules-based adapters

Hari H. Krishna (harkrish@in.ibm.com), Software Engineer, IBM
Hari H. Krishna is a Software Engineer at IBM focusing on the Log and Trace Analyzer for Autonomic Computing. He has more than five years of experience in the development of log analysis and reporting based applications. He holds a Masters degree in Computer Science from Osmania University, Hyderabad, India. He can be reached at harkrish@in.ibm.com.

Summary:  The Generic Log Adapter converts logs from their native, product-specific log format to the Common Base Event format. The process of conversion can be done either by using a rules-based adapter, or by using a static adapter. Learn how to choose the appropriate approach based on the characteristics of the log entries. Compare your log format to the samples here and get tips for creating the adapter that is best suited to your case.

Date:  09 Aug 2005
Level:  Intermediate

Introduction

The Generic Log Adapter (GLA) is a tool included in the IBM Autonomic Computing Toolkit to convert data from different data sources with many different product-specific log formats to the IBM Common Base Event format. The generated Common Base Events can be consumed by various monitoring tools for further analysis. (See Resources for a link to download the Generic Log Adapter.)

The Generic Log Adapter provides different approaches to perform the conversion to the Common Base Event format. The GLA takes configuration files known as adapters as input. There are two types of adapters defined in the GLA: rules-based adapters and static adapters. Rules-based adapters use regular expressions to map product-specific log formats to corresponding Common Base Event fields; static adapters use Java™ classes to perform the log conversion.

Often it becomes difficult to decide on the type of adapter to use. This article explains the different scenarios to be considered while selecting the type of adapter. To get the most from this article, you should have a solid understanding of general autonomic computing principles and working knowledge of the Log and Trace Analyzer and the Generic Log Adapter.


Importance of log conversion with respect to autonomic computing

Autonomic computing is a set of technologies and tools that enables applications, systems, and entire networks to become more self-managing. Self-management involves four characteristics: self-configure, self-heal, self-optimize, and self-protect, which are often referred to as "self-CHOP" characteristics.

Developers use the phrase problem determination to describe the process of finding the root cause of a problem. This involves communication between different autonomic components. Autonomic problem determination is no different: the aim is to keep information on what went wrong so that the root cause can be detected.

The Common Base Event format is a logging and tracing format that addresses the complexity of problem determination in a multicomponent system, in particular the complexity of communication between autonomic components. Using the Common Base Event format makes it possible to correlate system status from multiple data sources. Further information on the Common Base Event can be found in Resources.

The GLA adapter file is typically an XML file that contains a set of rules for extracting specific data. IBM provides user interface (UI) support for developing adapters in the Log and Trace Analyzer. An adapter file contains different components:

Context component
The adapter configuration file consists of a number of contexts. Each context in the adapter configuration file consists of a series of components that describe the conversion rules for the associated log file. It also contains other information needed to run the Generic Log Adapter: the outputter class, the sensor class, the input log file, and output information, such as the file name and the directory where the file exists. Each context runs as a separate thread, independent of the other contexts in the same adapter configuration file. A typical structure of components can be seen in Figure 1:

Figure 1. Context component view
The JavaBeans view
Sensor
Defines the mechanism that reads the log content to be processed.
Extractor
Provides a mechanism to receive the lines from the sensor and separate the event messages. Simply put, it defines the rules to recognize the message boundaries.
Parser
Defines a set of string mappings to convert the message received from the extractor to Common Base Event entries.
Formatter
Takes attributes and their values from the parser and then creates the Common Base Event Java object instance.
Outputter
Provides a way to wrap the formatted Java object provided by the formatter in a form suitable for storing.

Different approaches of converting a product log to a Common Base Event

The main building block of problem determination is the conversion of a product log from a product-specific log format to a common logging format called a Common Base Event format. This process of conversion using the Generic Log Adapter can be done in two different ways: a rules-based adapter approach and a static adapter approach. The following section describes these approaches in detail using some sample logs generated by IBM products. For each sample log file, I include the following information:

  • Description of the log format: This subsection explains the log format and the fields that can be extracted from the log and mapped to Common Base Event fields. It also gives an overview of the approaches that can be adopted for the log conversion, explains the pros and cons of each, and recommends an optimal approach.
  • How to convert to a Common Base Event format: This subsection gives tips on how to implement the recommended approach by providing code snippets for converting a few of the complicated fields to Common Base Event fields.


Rules-based adapter approach

Often, log files can be converted from a specific log format to the Common Base Event format using a rules-based approach. For this method, the Generic Log Adapter uses rules to parse the log messages. These rules are typical Java regular expressions. Normally, a specific pattern is written to extract a piece of data from unstructured, raw log messages and events.

To determine whether this approach would work for a particular log file, consider these suggested criteria:

  1. The log has a clear start or end pattern (or both) for each record so that this can be provided to the extractor component of the GLA to extract individual records.
  2. Each and every field in the record that needs to be mapped to a Common Base Event should be either separated by a delimiter or should be in the form of an easily extractable key value pair.
  3. If the log entries are not formatted according to the "log Information" as mentioned above, they should at least be extractable using rules.

Normally, logs generated by almost all products are in a simple and readable format. These logs contain direct information that can be converted into Common Base Event form without much complication. The sample logs in Listings 1 and 2 are good examples of a simple log format.

IBM HTTP Server access log sample

The sample log in Listing 1 is generated by an IBM HTTP Server access log.


Listing 1. IBM HTTP Server access log sample
    9.26.157.44 - - [13/Jan/2003:11:44:21 -0500] "GET /WSsamples HTTP/1.1" 302 0
    9.26.157.44 - - [13/Jan/2003:11:44:21 -0500] "GET /WSsamples/ HTTP/1.1" 302 550
    9.26.157.44 - - [13/Jan/2003:11:44:21 -0500] "GET /WSsamples/en/index.html HTTP/1.1" 200 1127
    9.26.157.44 - - [13/Jan/2003:11:44:21 -0500] "GET /WSsamples/en/Menu/Title.html HTTP/1.1" 200 1570
    9.26.157.44 - - [13/Jan/2003:11:44:21 -0500] "GET /WSsamples/en/Menu/SamplesIntro.html HTTP/1.1" 200 2966
    9.26.157.44 - - [13/Jan/2003:11:44:21 -0500] "GET /SamplesGallery/GalleryMenu HTTP/1.1" 200 5258
    9.26.157.44 - - [13/Jan/2003:11:44:21 -0500] "GET /WSsamples/en/SamplesMaster.css HTTP/1.1" 200 11600

Description of the log format: The sample log in Listing 1 is an example of a log with each entry or log record fully contained on one line. The log records contain clear record separation, the fields in the log records have clear demarcation (blank space character), and they can be extracted using regular expressions. In addition, each record has the timestamp when the record was created (that is, the Common Base Event "creationTime"), and it is easily extractable. In this case, because the log format is simple and all the log fields have clear demarcation, both the rules-based adapter approach and the static adapter approach (explained in detail in later sections) are applicable.

The rules-based adapter would be the best solution for this type of log. In other words, if the log entries are simple and can be split using simple rules (regular expressions), then the rules-based adapter should be used to convert the logs from a proprietary form to a Common Base Event form. It is always recommended to first investigate using the rules-based approach because it's simpler to understand and maintain. You can easily see and understand the mapping between product-specific log fields and Common Base Event properties while building the rules, and it's simpler to modify according to your requirements using the GLA Rule editor.

For example, consider a situation where a field in the log is mapped to a Common Base Event sub-element called ExtendedDataElement, and the end user wants to change the mapping of that field to a ContextDataElement. This type of alteration is possible in rules-based adapters, whereas, in the case of Java parsers, the end user commonly does not have access to the Java source files and, therefore, cannot change them to suit new requirements. Another advantage of rules-based adapters is the skill level they demand: a user of the adapter does not need any language-specific skills to modify the mapping of elements, other than regular expression skills.

How to convert to a Common Base Event format: Every record in the sample log is contained on a single line, and all the fields in the record are separated by a blank space. Because every record starts with an IP address, the start pattern could be ^(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})|(\s+).

Because all the fields in a record are separated by a blank space, it would be a good idea to have \s+ as the value for a separator token. Other fields in the log can now be easily extracted with simple rules. For example, to extract the creation time from the record, you can use the rules shown in Figure 2.


Figure 2. Rule to extract creation time for IBM HTTP Server access log

Note: More information on the Separator Token can be found in the article called "High-performance rule writing for the Generic Log Adapter" (see Resources).
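The extraction described above can also be tried outside the GLA. The following is a minimal sketch in plain Java, not the GLA rule syntax shown in Figure 2: the class name AccessLogSketch, its method, and the regular expression are assumptions made for illustration. It matches one access log record and converts the bracketed timestamp into a value suitable for creationTime.

```java
import java.time.OffsetDateTime;
import java.time.format.DateTimeFormatter;
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of the Generic Log Adapter.
class AccessLogSketch {
    // host, two placeholder fields, [timestamp], "request", status, byte count
    private static final Pattern RECORD = Pattern.compile(
        "^(\\d{1,3}(?:\\.\\d{1,3}){3})\\s+\\S+\\s+\\S+\\s+"
        + "\\[([^\\]]+)\\]\\s+\"([^\"]*)\"\\s+(\\d{3})\\s+(\\d+|-)$");

    // Matches stamps such as 13/Jan/2003:11:44:21 -0500
    private static final DateTimeFormatter STAMP =
        DateTimeFormatter.ofPattern("dd/MMM/yyyy:HH:mm:ss Z", Locale.ENGLISH);

    /** Returns the creation time of a single access log record. */
    static OffsetDateTime creationTime(String line) {
        Matcher m = RECORD.matcher(line.trim());
        if (!m.matches()) {
            throw new IllegalArgumentException("Not an access log record: " + line);
        }
        return OffsetDateTime.parse(m.group(2), STAMP);
    }
}
```

Calling AccessLogSketch.creationTime on any line of Listing 1 returns the timestamp together with its UTC offset. The same regular expression groups could supply the request string and status code if those fields also need to be mapped.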

IBM WebSphere Application Server activity log

The log sample in Listing 2 was taken from an IBM WebSphere® Application Server log.


Listing 2. IBM WebSphere Application Server activity log sample
    ---------------------------------------------------------------
    ComponentId:  Application Server
    ProcessId:  2676
    ThreadId:  6a61d29f
    SourceId:  com.ibm.ws.management.connector.soap.JMXSoapAdapter
    ClassName:
    MethodName:
    Manufacturer:  IBM
    Product:  WebSphere
    Version:  Platform 5.0 [BASE 5.0.0 s0245.03]
    ServerName:  ninjazx7\ninjazx7\server1\
    TimeStamp:  2003-01-16 10:53:10.445000000
    UnitOfWork:
    Severity:  3
    Category:  AUDIT
    PrimaryMessage:  ADMC0013I: SOAP connector available at port 8880
    ExtendedMessage:
    ---------------------------------------------------------------
    ComponentId:  Application Server
    ProcessId:  2676
    ThreadId:  6a61d29f
    SourceId:  com.ibm.ws.messaging.JMSEmbeddedProviderImpl
    ClassName:
    MethodName:
    Manufacturer:  IBM
    Product:  WebSphere
    Version:  Platform 5.0 [BASE 5.0.0 s0245.03]
    ServerName:  ninjazx7\ninjazx7\server1
    TimeStamp:  2003-01-16 10:53:11.286000000
    UnitOfWork:
    Severity:  3
    Category:  AUDIT
    PrimaryMessage:  MSGS0050I: Starting the Queue Manager
    ExtendedMessage:
    --------------------------------------------------------------

Description of the log format: The sample log mentioned in Listing 2 is an example of a multiline record simple log format. In this case also, the log records contain clear record separation. All the fields in the sample log have clear demarcation and are logged in the form of key:value pairs. Both the static adapter approach and rules-based adapter approach are applicable. But, considering maintainability issues and ease of understanding, it would be better to implement a rules-based adapter approach for this log also.

How to convert to a Common Base Event format: The log displayed in Listing 2 is a multiline record sample. The record can be separated by using -{15,} as the Start pattern. It would be a good idea to put \s+ as the value for a separator token and : as the value for the designation token. Using a separator token makes it easier to extract the other fields using the $h('') function. For example, you can use the rule displayed in Figure 3 to retrieve the creation time.


Figure 3. Rule to extract creation time for IBM WebSphere Application Server activity log
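To see what the start pattern and the designation token accomplish, the following plain Java sketch splits an activity log into records on lines of 15 or more dashes and then splits each line at its first colon. It is an illustration only and does not use the GLA rule language or its $h('') function; the class name ActivityLogSketch and both methods are assumptions, and the sketch assumes each key starts on its own line.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of the Generic Log Adapter.
class ActivityLogSketch {
    // Matches stamps such as 2003-01-16 10:53:10.445000000
    private static final DateTimeFormatter STAMP =
        DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss.SSSSSSSSS");

    /** Splits the log on separator lines and returns one key-value map per record. */
    static List<Map<String, String>> parse(String log) {
        List<Map<String, String>> records = new ArrayList<>();
        for (String block : log.split("(?m)^[ \\t]*-{15,}[ \\t]*$")) {
            Map<String, String> fields = new LinkedHashMap<>();
            for (String line : block.split("\\R")) {
                int colon = line.indexOf(':');
                if (colon <= 0) {
                    continue; // blank line or a line without a key
                }
                // Only the first colon separates key and value, so colons
                // inside the value (times, message IDs) are preserved.
                fields.put(line.substring(0, colon).trim(),
                           line.substring(colon + 1).trim());
            }
            if (!fields.isEmpty()) {
                records.add(fields);
            }
        }
        return records;
    }

    /** Converts the TimeStamp field of a record into a date-time value. */
    static LocalDateTime creationTime(Map<String, String> record) {
        return LocalDateTime.parse(record.get("TimeStamp"), STAMP);
    }
}
```

Keys that have no value in the log, such as ClassName or ExtendedMessage, end up in the map with an empty string, which leaves the decision about default values to the caller.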

You can also use custom Java class callouts with the rules-based approach for converting unstructured data and events to Common Base Events. This feature further assists with customizing your regular expression rules using a custom Java class, as described in the next section.

Using custom Java class callouts

When you combine the use of custom Java class callouts with a rules-based approach, you can use regular expressions together with Java code for situations where regular expressions are not sufficient. One scenario is to dynamically substitute a value for a Common Base Event property.

Some products generate logs that contain only part of the information that is needed to convert the log entries into Common Base Events. Extra data is required to complete the conversion. Take a look at the sample AIX Syslog.

AIX Syslog sample log

The sample in Listing 3 is from an AIX® Syslog. The data logged is simple and straightforward. The log fields are separated by blank spaces.


Listing 3. AIX Syslog sample
    May  2 15:51:15 dlfssrv syslogd: restart
    May  2 15:51:15 dlfssrv syslogd: restart
    May  2 15:51:15 dlfssrv unix: dlfs_mount entered..
    May  2 15:51:15 dlfssrv unix: check_stubvp entered .....
    May  2 15:51:15 dlfssrv unix: check_stubvp exited .....
    May  2 15:51:15 dlfssrv unix: dlfs_change_vfsops entered....
    May  2 15:51:15 dlfssrv unix: AMITA:1: dlfs_sync address:1D0FD0
    May  2 15:51:15 dlfssrv unix: AMITA:2: dlfs_sync address:1D0FD0
    May  2 15:51:15 dlfssrv unix: dlfs_change_vfsops exited....

Description of the log format: The log provided in Listing 3 is also in a simple format. Log fields are separated by a blank space, and each record is contained on a single line. There is a clear demarcation between records. The rules-based adapter is applicable in this case. However, the information provided for the log creation time does not include a year field. To supply the missing year, you can use either the static adapter approach or custom Java class callouts.

With a static adapter, you can default to the current year of the machine on which the log files reside. The drawback is that you must code the method of adding the year yourself, and if you decide to change the method, you must rebuild your entire static adapter and rerun the entire regression test that goes with it.

If you use custom Java class callouts, you can use regular expressions together with Java classes. The optimal solution is to write an extension class that is capable of substituting the default system year. If you decide to change the method, you need only replace the specific class; you do not have to change your entire set of parsing rules. Using the custom Java class callout method is the simpler solution in cases where regular expressions are sufficient for all but one or two fields.
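The year substitution itself is a small piece of logic. The sketch below shows one way to do it in plain Java; it does not implement the GLA substitution extension interface, and the class name SyslogYearSketch and its methods are assumptions made for illustration. The default year is passed in so that the rule for choosing it can change without touching the parsing.

```java
import java.time.LocalDateTime;
import java.time.Year;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeFormatterBuilder;
import java.time.temporal.ChronoField;
import java.util.Locale;

// Hypothetical helper, not part of the Generic Log Adapter.
class SyslogYearSketch {
    /** Parses a stamp such as "May  2 15:51:15" and fills in the given year. */
    static LocalDateTime creationTime(String stamp, int defaultYear) {
        DateTimeFormatter format = new DateTimeFormatterBuilder()
            .appendPattern("MMM d HH:mm:ss")
            // The log carries no year, so supply one for the parser to use.
            .parseDefaulting(ChronoField.YEAR, defaultYear)
            .toFormatter(Locale.ENGLISH);
        // Syslog pads single-digit days with an extra blank; collapse it.
        String normalized = stamp.trim().replaceAll("\\s+", " ");
        return LocalDateTime.parse(normalized, format);
    }

    /** Same as above, using the current year of the local machine. */
    static LocalDateTime creationTime(String stamp) {
        return creationTime(stamp, Year.now().getValue());
    }
}
```

Defaulting to the current system year gives a wrong result for logs that span a year boundary or that are converted in a later year, which is one reason to keep this logic in a single replaceable class.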

How to convert to a Common Base Event format: The AIX Syslog log format is similar to that of the IBM HTTP Server access log, and the rules-based adapter can be implemented in a similar way. The only variation is the way CreationTime is extracted. Because the time provided in the AIX Syslog is not complete, a substitution extension class can be developed for adding the default year, as shown in Figure 4. More detailed information on creating a substitution extension class can be found in the article "Using Java class callouts with the Generic Log Adapter" (see Resources for a link).


Figure 4. Rule to extract the creation time for AIX Syslog

The Generic Log Adapter provides built-in functions to facilitate the parsing process. Every Common Base Event element can be assigned a default value by using the "use built in function" option (more information about this option can be found in the Eclipse product help documentation). This approach is similar to using a custom Java class callout. It allows a combination of multiple rules to be specified for a Common Base Event property, some of which may invoke a Java method while others use a regular expression.


Static adapter approach

The Generic Log Adapter extends its functions by providing support to create a custom static adapter. It does this by providing a set of interfaces. Using a static adapter completely eliminates the use of regular expressions. In other words, it is not possible to use regular expressions for some properties and Java code for other properties. In this approach, all the parsing and the assigning of values to the properties of Common Base Events must be handled through Java code.

z/OS component log and syslog

Now, let's take a look at a sample z/OS® component log and z/OS syslog to see where a rules-based adapter would not be feasible. The log displayed in Listing 4 is a sample z/OS component log. Normally, z/OS component logs start with a log format. In Listing 4, the sample log for three different formats is displayed: FULL FORMAT, SHORT FORMAT, and TALLY REPORT. These formats correspond to the logging level (Detailed/Summary/Tally Report Format and so on). The format is followed by system name and creation time.


Listing 4. z/OS component log sample
    COMPONENT TRACE FULL FORMAT
    SYSNAME(SY1)
    COMP(SYSRSM)
    **** 06/11/2004
    SYSNAME   MNEMONIC  ENTRY ID    TIME STAMP     DESCRIPTION
    -------   --------  --------  ---------------  -------------
    SY1       TRACEB    00000023  06:45:47.435823  Trace Buffer
      FUNC1... TRACE             Trace
      JOBN1... *MASTER* ASID1... 0001     PLOCKS.. 00000000 CPU..... 0000
      JOBN2... *MASTER* ASID2... 0001     RLOCKS.. 00000000
      KEY..... 000F     ADDR.... 022E0B28 ALET.... 00000000
      07061000 00000076 00000000 00000000 00000000 00000000 00000000 022E0BF8
      03000000 BB5A05F9 E832F44B 00000000 00000000
    SY1       RSEPAG    00000008  06:55:42.761695  Enqueue Pageable Frame
      FUNC1... VSMGTMN           VSM Getmain Service
      JOBN1... JES2MON  ASID1... 001A     PLOCKS.. 88004001 CPU..... 0000
      JOBN2... JES2MON  ASID2... 001A     RLOCKS.. 88004000
      KEY..... 0036     ADDR.... 02368408 ALET.... 00000000
      1900
      KEY..... 0001     ADDR.... 00183780 ALET.... 01000002
      02274354 00047F80 81000000 07000000 0000001A 00000000 005E8000 00000000
      00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000

    COMPONENT TRACE SHORT FORMAT
    SYSNAME(SY1)
    COMP(SYSRSM)
    **** 06/11/2004
    SYSNAME   MNEMONIC  ENTRY ID    TIME STAMP     DESCRIPTION
    -------   --------  --------  ---------------  -------------
    SY1       TRACEB    00000023  06:45:47.435823  Trace Buffer
    SY1       RSEPAG    00000008  06:55:42.761695  Enqueue Pageable Frame

    COMPONENT TRACE TALLY REPORT
    SYSNAME(SY1)
    COMP(SYSRSM)
    TRACE ENTRY COUNTS AND AVERAGE INTERVALS (IN MICROSECONDS)
    FMTID    COUNT       Interval     MNEMONIC DESCRIBE
    -------- ----------- ------------ -------- --------------------------------
    00000001         371       10,162 XEPENTRY External Entry Point Entry
    00000002         372       10,134 XEPEXIT  External Entry Point Exit
    00000003          54        1,088 FIX      Page Being Fixed

After the creation time, the next line of the log usually contains a header line as shown below in Listing 5.


Listing 5. Sample header structure of a z/OS component log
    SYSNAME   MNEMONIC  ENTRY ID    TIME STAMP     DESCRIPTION
The header line is followed by values for respective header fields as displayed in Listing 6.


Listing 6. Corresponding value line for header structure shown in Listing 5
    SY1       TRACEB    00000023  06:45:47.435823  Trace Buffer

For these types of logs, it is not guaranteed that the header fields will remain constant. The location of the header may vary based on different factors. The line after the header contains the values corresponding to the header fields. The values line is followed by a set of name-value pairs (for example, FUNC1... TRACE is a pair with the name FUNC1 and the value TRACE). These pairs can appear in any order.

There are approximately 20 different types of components in z/OS. Each of the components generates a log in its own customary form, with minor deviations from the others. Other component trace records contain similar types of name-value pairs, but with some different names. The header line and its value may vary as well.
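Because the pairs can appear in any order and the names differ between components, it helps to collect them into a map rather than to expect them at fixed positions. The following plain Java sketch does this for one line; the class name NameValueSketch and the pattern are assumptions based only on the sample in Listing 4, where each name is padded with dots and followed by a single value token.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of the Generic Log Adapter.
class NameValueSketch {
    // A name (uppercase letters and digits) padded with two or more dots,
    // then blanks, then one value token without embedded blanks.
    private static final Pattern PAIR =
        Pattern.compile("([A-Z][A-Z0-9]*)\\.{2,}\\s+(\\S+)");

    /** Returns the name-value pairs found on one line, in order of appearance. */
    static Map<String, String> parse(String line) {
        Map<String, String> pairs = new LinkedHashMap<>();
        Matcher m = PAIR.matcher(line);
        while (m.find()) {
            // A name repeated on the same line overwrites the earlier value.
            pairs.put(m.group(1), m.group(2));
        }
        return pairs;
    }
}
```

Lines that contain only hexadecimal data produce an empty map, so the caller can tell detail lines from name-value lines. Names that recur within one record, such as KEY in the second record of Listing 4, need extra handling if every occurrence must be kept.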

Description of the log format: The log records have a clear demarcation. But, the elements in the record can exist in any order; in other words, it's not guaranteed that the elements would exist in a particular order, as explained in log information section. Other elements (key-value pairs) in the sample log, which must be mapped to a Common Base Event field, are dynamic; they can exist in any order. Therefore, this type of log record format can be categorized as a complicated log format. The order of elements in the log is based on the corresponding header information. One of the header structures is shown in Listing 5 and its corresponding values are shown in Listing 6.

It is not feasible to use a rules-based approach here. The log record shown in Listing 4 is complicated and contains different headers. The values to be assigned to Common Base Event properties depend on the headers. If all the different header format combinations were known, it would be possible to use a rules-based adapter approach in combination with custom Java class callouts. But, because the headers and corresponding values are dynamic, it's better to use a static adapter.

How to convert to a Common Base Event format: This article does not cover the normal operations that need to be performed for adapter creation, such as creating a class extending from monitoring an adapter and initializing all the required elements for a Common Base Event. For basic information on how to create a static adapter, see the Eclipse product help documentation. Apart from these normal operations, there are some complicated tasks, which are explained here.

To identify the type of format, test the first line of the record against each known format name in a chain of conditional statements. The code snippet that checks whether the current line is the start of a record, and that identifies the record type and assigns it to the global log_format variable, is shown in Listing 7.


Listing 7. Code snippet to check start of the header
    /**
     * This function is used to check if the current line is the start of
     * the record and also to identify the type of record.
     *
     * @return true if current line is the first line of a header, false otherwise
     */
    public boolean reachedFirstLineOfHeader() {
        // SUMMARY_FORMAT = "COMPONENT TRACE SUMMARY FORMAT"
        if (currentLine.startsWith(SUMMARY_FORMAT)) {
            // Set the global log_format variable to Summary format log
            log_format = iSmmary_Format;
            return true;
        }
        // SHORT_FORMAT = "COMPONENT TRACE SHORT FORMAT"
        else if (currentLine.startsWith(SHORT_FORMAT)) {
            // Set the global log_format variable to Short format log
            return true;
        }
        // TALLY_REPORT = "COMPONENT TRACE TALLY REPORT"
        else if (currentLine.startsWith(TALLY_REPORT)) {
            // Set the global log_format variable to Tally report format
            return true;
        }
        return false;
    }

Identify and parse the header. After the record type is known, the lines that follow contain the system name, the component name, and the creation time, which can be extracted using normal string manipulation. The next line is a header line (a sample header line is displayed in Listing 5), which is common to all the formats. The code snippet for parsing the headers is shown in Listing 8.


Listing 8. Code snippet to parse headers
    /**
     * This function is used to parse the headers.
     *
     * @param base   string which contains the start points
     * @param hdrStr this contains the header string to be parsed
     * @param sep    this is the separator token
     * @return 1 in case of success and < 1 in case of failure
     */
    public int parseHaeaders(String base, String hdrStr, String sep) {
        try {
            // Function used to set the start point for each field, which
            // is stored in the variable called m_hStartIndexHash
            updateStartPoints(base);
            int headerCount = 0;
            String headerString;
            while (headerCount < m_hStartIndexHash.size()) {
                headerString = "";
                if (headerCount == (m_hStartIndexHash.size() - 1))
                    headerString = hdrStr.substring(getValueAt(headerCount), hdrStr.length());
                else
                    headerString = hdrStr.substring(getValueAt(headerCount), getValueAt(headerCount + 1));
                // trimAll function is used to trim all the special
                // characters and blank spaces
                headerString = trimAll(headerString, " ");
                // Once the header is extracted, it is stored in a
                // variable called m_hHeaderHash
                if (!(headerString.equals("")))
                    m_hHeaderHash.put("" + headerCount, headerString);
                headerCount++;
            }
        } catch (Exception e) {
            return -1;
        }
        return 1;
    }

Though every field in the header has a clear separation (blank space), you must determine the start point for each field. For example, in the header displayed in Listing 5, every header field is separated by blank spaces, but that does not mean that every blank space is followed by a new header field. For example, ENTRY ID and TIME STAMP are each a single header field, even though each name contains a blank space (the field separator).

You can use the code snippet displayed in Listing 9 to determine the start point for each field. The start point values are stored in a global variable called m_hStartIndexHash.


Listing 9. Code snippet to extract start points
    /**
     * This function is used to determine the start point for every
     * header field.
     *
     * @param base string which contains start points
     * @return 1 in case of success and < 1 in case of failure
     */
    public int updateStartPoints(String base) {
        /*
         *           10        20        30               47
         * -------   --------  --------  ---------------  -------------
         */
        try {
            int count = 0;
            int indexCount = 0;
            boolean stFlag = true;
            while (count < base.length()) {
                if ((base.charAt(count) == '-') && (stFlag)) {
                    m_hStartIndexHash.put("" + indexCount, "" + count);
                    stFlag = false;
                    indexCount++;
                } else if (base.charAt(count) == ' ') {
                    stFlag = true;
                }
                count++;
            }
        } catch (Exception e) {
            return -1;
        }
        return 1;
    }

After the start points and headers are extracted from the log record, the next step is to extract the corresponding values from the value string. A sample value line is displayed in Listing 6. Code for parsing the values is displayed in Listing 10.


Listing 10. Code snippet to parse values
  /**
   * Parses the values in a value line.
   * @param str the string to be parsed
   * @return 1 in case of success and < 1 in case of failure
   */
  public int parseValues(String str) {
    try {
      // Nothing can be parsed without headers, or if the line
      // ends before the last field starts.
      if (m_hHeaderHash.size() < 1)
        return -1;
      if (getValueAt(m_hHeaderHash.size() - 1) > str.length())
        return -1;
      int valuesCount = 0;
      String valueString;
      while (valuesCount < m_hStartIndexHash.size()) {
        if (valuesCount == (m_hStartIndexHash.size() - 1)) {
          // The last field runs to the end of the line.
          valueString = str.substring(getValueAt(valuesCount), str.length());
        } else {
          valueString = str.substring(getValueAt(valuesCount),
                                      getValueAt(valuesCount + 1));
        }
        valueString = trimAll(valueString, " ");
        if (!(valueString.equals(""))) {
          m_hValuesHash.put("" + valuesCount, valueString);
        }
        valuesCount++;
      }
    } catch (Exception e) {
      PrintOnConsole("CTRACE BACE 00003# " + e);
      return -1;
    }
    return 1;
  }
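The slicing step can also be sketched on its own. In the sketch below, the class FixedWidthValues, the method slice, and the sample value line are hypothetical; the start points are passed in as an array rather than read from m_hStartIndexHash.

```java
import java.util.ArrayList;
import java.util.List;

class FixedWidthValues {
    /** Cuts a line into fields; field i runs from starts[i] to starts[i + 1], or to the end of the line. */
    static List<String> slice(String line, int[] starts) {
        List<String> values = new ArrayList<>();
        for (int i = 0; i < starts.length; i++) {
            int from = starts[i];
            int to = (i + 1 < starts.length) ? starts[i + 1] : line.length();
            if (from >= line.length()) {
                // The value line is shorter than the header: keep an empty slot.
                values.add("");
                continue;
            }
            values.add(line.substring(from, Math.min(to, line.length())).trim());
        }
        return values;
    }

    public static void main(String[] args) {
        // Hypothetical value line with fields starting at columns 0, 10, and 20.
        String line = "0001      08:39:29  SY1";
        System.out.println(slice(line, new int[] {0, 10, 20})); // prints [0001, 08:39:29, SY1]
    }
}
```

Unlike Listing 10, which skips empty values, this sketch keeps an empty string for a missing field so that each value stays aligned with its header position. Either choice can work as long as the lookup code is consistent with it.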

You also need code for assigning the creation time and code for creating a Common Base Event variable with default values assigned for required elements. In addition, you need a while loop for traversing the whole log.

In product logs that contain header lines and corresponding value lines, it's possible to use a rules-based adapter approach if all the header element sequences are known. For some product logs, however, a rules-based adapter cannot convert the product-specific log format to the Common Base Event format. The following product log is one such example.

z/OS Syslog sample

The log displayed in Listing 11 is a sample log generated by z/OS Syslog. This sample log illustrates split messages. Logs for other products, such as the IBM DB2® diagnostic logs, also use the split message form. The record type in a z/OS Syslog can be identified by the first character of the record line: "M" indicates the start of a multiline record, "D" indicates a continuation of a multiline record, and "E" indicates the end of a multiline record. The log contains a lot of information that is not required in this context; only the format details needed for the present scenario are explained here, and you can ignore the rest.


Listing 11. z/OS Syslog sample
 M 0000000 SY1      04098 08:39:29.74       00000290  IEA007I STATIC SYSTEM SYMBOL VALUES 060
 D                              060 00000290          &SYSNAME.  = "SY1"
 D                              060 00000290          &SYSPLEX.  = "LOCAL"
 D                              060 00000290          &DUMPQUAL. = "ALPS"
 M 4040000 SY1      04098 08:39:29.94       00000290  ILR032I PAGE DATA SET HAS BEEN USED BY ANOTHER SYSTEM: 062
 D                              062 00000290   DATA SET NAME  - SYS1.PLPA.PAGCOM
 D                              060 00000290          &SYSNUM.   = "1"
 E                              060 00000290          &SYSR3.    = "ZDR16Y"
 D                              062 00000290   SYSTEM NAME    - SY1
 D                              062 00000290   VM USERID      - SBAILEY
 E                              062 00000290   DATA SET LAST UPDATED AT 06:02:59 ON 04/03/2004 (GMT)

Description of the log format: The log displayed in Listing 11 contains multiline records. A multiline record in z/OS Syslog starts with a control line, identified by the start character "M," that carries the multiline ID at the end of its text. (For example, the multiline ID of the first line in Listing 11 is 060.) Subsequent lines carry the same ID, called the connection ID, in their headers; the ID links them to the original message. This type of message format can be termed a split message format. The code in the first column of each subsequent line indicates its role: "D" stands for "data line" and indicates that the line is part of a multiline message, and "E" stands for "end line" and indicates that it is the last line of the message. There are two split messages in this example, with the connection IDs 060 and 062; the ID on each D and E line tells you which multiline message the line belongs to. Note that the lines of the two messages are interleaved.
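As a rough illustration of the format just described, the sketch below pulls the connection ID out of a line. It assumes that the ID is the last blank-separated token of an M line and the first token after the type character on a D or E line, which holds for the sample in Listing 11 but is not a general description of z/OS Syslog columns. The class ConnectionIds and the method connectionId are hypothetical names.

```java
class ConnectionIds {
    /** Returns the connection ID of an M, D, or E line, or null for any other line. */
    static String connectionId(String line) {
        String[] tokens = line.trim().split("\\s+");
        if (tokens.length < 2) {
            return null; // empty or too short to carry an ID
        }
        switch (tokens[0]) {
            case "M":
                return tokens[tokens.length - 1]; // ID closes the control line
            case "D":
            case "E":
                return tokens[1]; // ID follows the type character
            default:
                return null;
        }
    }

    public static void main(String[] args) {
        String start = "M 0000000 SY1      04098 08:39:29.74       00000290"
                + "  IEA007I STATIC SYSTEM SYMBOL VALUES 060";
        String data = "D                              062 00000290   SYSTEM NAME    - SY1";
        System.out.println(connectionId(start)); // prints 060
        System.out.println(connectionId(data));  // prints 062
    }
}
```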

In the case of the z/OS sample log, the complete record is not logged at once; the log record is available in parts. While performing the log conversion from a product-specific form to a Common Base Event form, it is necessary to hold the record until its end part is encountered. Because holding the record in memory is not possible when using a rules-based adapter approach, the only option is to use a static adapter approach.

How to convert to a Common Base Event format: The implementation in this case involves some normal tasks, such as creating the Common Base Event and assigning the default values of the required elements. The complicated task is the logic to identify the start of record, to hold the records in memory, and to transform the respective log records to a Common Base Event record when the end of the record is encountered.

The code snippet displayed in Listing 12 identifies the type of record from the first character of the line and processes the line accordingly. If the first character is "D," the connection ID of the line is extracted, and the line is appended to the corresponding entry in the multiline hash, which stores all the pending multiline messages. A first character of "E" indicates the end of the multiline message; the stored message with that connection ID is completed with the current line and sent for further processing. Finally, a first character of "M" indicates the start of a new multiline message, so the line is stored in the multiline hash with its connection ID as the key.


Listing 12. Code snippet for processing messages
  /**
   * Processes a log line based on the first character of the log record.
   * @param currentLine the line to be processed
   */
  public void processMessage(String currentLine) {
    // An empty line has no record type to inspect.
    if (currentLine == null || currentLine.length() == 0) {
      return;
    }
    char firstChar = currentLine.charAt(0);
    switch (firstChar) {
      case 'D':
      case 'E':
        try {
          /*
           * Extract the connection ID from the current line.
           * For the line shown below, the connection ID is 060.
           * --------------------------------------------------------
           * D             060 00000290          &SYSNAME.  = "SY1"
           * --------------------------------------------------------
           */
          m_strconnID = getConnectionId(currentLine);
          if (firstChar == 'E') {
            /*
             * m_oMultiLine_Hash stores the message collected so far as
             * the value, with the connection ID as the key. 'E' marks the
             * end of the multiline message, so complete the stored message
             * with the current line, convert it, and add the result to the
             * Common Base Event array.
             */
            String mainMessageLine = (String) m_oMultiLine_Hash.get(m_strconnID);
            mainMessageLine += currentLine;
            CommonBaseEvent event = processCommonBaseEvent(mainMessageLine);
            messages[arrayIndex] = event;
            m_oMultiLine_Hash.remove(m_strconnID);
          } else {
            // 'D' is a continuation: append the current line to the
            // existing multiline message and update m_oMultiLine_Hash.
            appendExistingLine(currentLine);
          }
        } catch (Exception ex) {
          // A malformed line is skipped; report the error here as needed.
        }
        break;
      case 'M':
        /*
         * 'M' marks the start of a multiline message. Create a new
         * key-value pair with the connection ID as key and the current
         * line as value, and add it to m_oMultiLine_Hash. That logic is
         * implemented in the function called below.
         */
        createMultiLineMessage(currentLine);
        break;
      default:
        // Any other line is not part of a multiline message.
        break;
    }
  }
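The buffering logic can be exercised without the Common Base Event classes. The following self-contained sketch holds partial records in a map keyed by connection ID and releases a record when its E line arrives. The class MultilineAssembler and its methods are hypothetical, the completed records are kept as plain strings instead of Common Base Events, and the sample lines are shortened from Listing 11. It also assumes that the ID is the last token of an M line and the second token of a D or E line.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class MultilineAssembler {
    // Partial records, keyed by connection ID, waiting for their E line.
    private final Map<String, StringBuilder> pending = new LinkedHashMap<>();
    private final List<String> completed = new ArrayList<>();

    void accept(String line) {
        String trimmed = line.trim();
        String[] tokens = trimmed.split("\\s+");
        if (tokens.length < 2) {
            return; // empty or too short to carry an ID
        }
        String type = tokens[0];
        if (type.equals("M")) {
            // Start of a record: the ID is the last token of the control line.
            pending.put(tokens[tokens.length - 1], new StringBuilder(trimmed));
        } else if (type.equals("D") || type.equals("E")) {
            String id = tokens[1];
            StringBuilder record = pending.get(id);
            if (record == null) {
                return; // no M line was seen for this ID; skip the orphan
            }
            record.append('\n').append(trimmed);
            if (type.equals("E")) {
                completed.add(record.toString()); // record is whole; release it
                pending.remove(id);
            }
        }
    }

    List<String> completed() { return completed; }

    int pendingCount() { return pending.size(); }

    public static void main(String[] args) {
        String[] log = {
            "M 0000000 SY1 04098 08:39:29.74 00000290 IEA007I STATIC SYSTEM SYMBOL VALUES 060",
            "D 060 00000290 &SYSNAME. = \"SY1\"",
            "M 4040000 SY1 04098 08:39:29.94 00000290 ILR032I PAGE DATA SET HAS BEEN USED BY ANOTHER SYSTEM: 062",
            "D 062 00000290 DATA SET NAME - SYS1.PLPA.PAGCOM",
            "E 060 00000290 &SYSR3. = \"ZDR16Y\"",
            "E 062 00000290 DATA SET LAST UPDATED AT 06:02:59 ON 04/03/2004 (GMT)"
        };
        MultilineAssembler assembler = new MultilineAssembler();
        for (String line : log) {
            assembler.accept(line);
        }
        System.out.println(assembler.completed().size() + " records completed"); // prints 2 records completed
    }
}
```

Keying the map by connection ID is what allows the interleaved 060 and 062 lines to be reassembled into separate records.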

Conclusion

A product log can be converted to the Common Base Event format in several different ways. Always investigate using rules-based adapters first. Choose static adapters when your log entries are unstructured and have no simple detectable patterns, eye-catchers, or delimiters to locate and extract data.




publish-date=08092005