Automate data collection for problem determination, Part 6: The IBM Support Assistant Lite tool

Exploring extended analysis

Bob Moore, Advisory Software Engineer, IBM
Bob Moore is an Advisory Software Engineer with the Software Group Advanced Design and Technology team at IBM in Research Triangle Park, North Carolina. He received his Ph.D. in Philosophy from Duke University in 1977. Since joining IBM in 1983, he has worked on numerous architectures and standards related to network and systems management, including SNA/Management Services, CMIP, SNMP, and DMTF CIM. You can contact Bob at remoore@us.ibm.com.

Summary:  Discover a major extension to the IBM® Support Assistant Lite tool: extended analysis. Explore how the extended analysis function works and work through a checklist on how to set up extended analysis for your collections.


Date:  30 Oct 2007
Level:  Intermediate

Introduction

This article introduces the most significant extension to the analysis function since the IBM Support Assistant Lite tool was first released: extended analysis. With this extension, the analysis report is no longer limited to information that is gathered on the target system during the collection activity. That information is still available, but the analysis report now also contains URL hyperlinks to several types of information on the Internet that relate to the conditions reported in the analyzed log files. This article focuses on how the extended analysis function works and provides you with a checklist on how to set up extended analysis for your collections.

This article is the sixth in a series on the IBM Support Assistant Lite tool. Prior to its October 2007 release, the tool was known as the Automated Problem Determination (AutoPD) tool. Other developerWorks articles in this series include:

  • Part 1, which introduces the tool
  • Part 2, which shows you how to extend the tool to address additional products and problem scenarios
  • Part 3, which introduces the topic of automating log file analysis
  • Part 4, which extends the analysis discussion to cover XML-formatted log files such as those that include Common Base Events
  • Part 5, which describes how analysis can be performed at several points in a collection script, with the results combined at the end into a single analysis report

Because extended analysis builds on the earlier basic analysis and incremental analysis functions, you will need to understand these two functions in order to add extended analysis to your collection scripts. So if your goal is to implement extended analysis, you should review Part 3 and Part 5 in this series in addition to reading this article.

With the October 2007 release of the tool, the extended analysis function applies only to the non-XML log files described in Part 3. It may, however, be extended in the future to cover the XML-formatted log files described in Part 4.

To emphasize its commonality with its larger cousin, the Automated Problem Determination (AutoPD) tool has been adopted into the IBM Support Assistant family and renamed IBM Support Assistant Lite. The functions described in this article are available in both product families. In addition to these functions, the full IBM Support Assistant tool offers additional search capabilities, free troubleshooting tools, and much more.

Overview of extended analysis

The overall goal of the extended analysis function in the IBM Support Assistant Lite tool is to change the analysis report from one that presents a list of interesting log records into one that presents a summary of the interesting contents in a log and links to additional information related to this content.

In Figure 1, which was generated by version 1.2.3 of the AutoPD tool, the analysis report simply lists each of the log records that qualify as interesting. For each of these log records, selected fields such as the time stamp and the error or warning code are displayed separately so that each one will stand out. But nothing outside of the log record itself is included. With the current release of the tool, the function that provides a list of log records is still supported and is in fact still very useful for certain logs such as WebSphere® Portal's ConfigTrace.log. But as you will see, extended analysis greatly improves the analysis report for logs such as WebSphere Portal's SystemOut.log and SystemErr.log.


Figure 1. Listing the interesting log records in a log file

Contrast Figure 1 with Figure 2, which was generated by version 1.2.4 of the IBM Support Assistant Lite tool. Rather than listing the interesting log records individually, the analysis report now summarizes all of the interesting records in the log. For each unique record, the report contains URL links constructed using the error code contained in the record. In this article, you will examine in detail how these URLs are constructed and how their formats can be adjusted by editing the XML documents that are included with the tool.


Figure 2. Summarizing the interesting records and providing links to additional information

Figure 2 shows a portion of the analysis results for the same log file that was used for Figure 1, the SystemOut-00.log. Each result contains these fields:

  • An error code that appears in the log file. The tool's matching algorithm always focuses on the first error code that appears in a given log record. The error code contains a hyperlink to online product documentation related to the code. This documentation typically includes, but is not limited to, an online message catalog.
  • A link to the complete log record. These links take the place of the right-hand column in Figure 1, where every log record containing an error code was reproduced in the analysis report.
  • The number of log records with this error code found in the log file and when the earliest and latest ones were created. The time stamps are omitted for log files that do not time stamp their log records.
  • A link to any IBM support technote related to the error code.
  • A link to any IBM authorized program analysis report (APAR) related to the error code.

These latter two links are configurable elements of the extended analysis function that will be reviewed in some detail in this article.

Why WebSphere Portal?

Even though the extended analysis function is fully available to script writers for any product, all of the examples in this article are taken from the WebSphere Portal collections. Why is this? There are two very good reasons. First, the WebSphere Portal collections have the most complete implementation to date of extended analysis. Second, when you go to the download site identified in Resources, the tool you download from this site contains the collection scripts and XML documents developed for WebSphere Portal. So these are the scripts and documents you can use as starting points for developing your own extended analysis content.


Configuring the analysis function

The first step in configuring extended analysis is to configure basic analysis. The outline for doing this remains unchanged from the one described in Part 3; some of the details have changed, however. As Part 3 describes in detail, configuration of basic analysis involves two parts: creating a pattern document containing <analysisProfile> elements for each combination of product version, log file, and problem type your product supports, and then invoking analysis in your collection script with parameters that identify a single <analysisProfile> element within your pattern document to be used on this occasion. These parameters were termed "scoping variables" in Part 3.

What has changed from Part 3 is the custom Ant task you use to invoke analysis, the one where you specify the scoping variables. You can compare Listing 1 from Part 3, which shows how the <infocollect> task was used in the past to specify the scoping variables, with Listing 1 included here, which shows how to accomplish the same thing with the <analyze_files_v2> task that is used to invoke the extended analysis function. There are some new attributes (described in detail in Part 5) related to the incremental analysis function, but the four scoping variables of problem type, product name, product version, and log file sets are specified just as they were for the <infocollect> task.


Listing 1. The <analyze_files_v2> task
<analyze_files_v2
   problem="${collect_wps_information_common_ProblemType}"
   patternFile=
      "${portal.shared.targets.bundle.basedir}/properties/wps/${wpsCommonPatternFile}"
   productname="${wps.product.name}"
   productversion="${wps.product.version}"
   includeType="link"
   fragmentTitle="Basic WebSphere Portal Analysis"
   timeout="${infocollector.timeout}">
      <autopdfileset filesetName="wpslog" filesetDir="${portal.latest.file}"/>
      <autopdfileset filesetName="tracelog" filesetDir="${trace.log.file}"/>
      <autopdfileset filesetName="tracelogtimestamped"
          filesetDir="${trace.log.latest.timestamped.version}"/>
      <autopdfileset filesetName="systemoutlog" filesetDir="${systemout.log.file}"/>
      <autopdfileset filesetName="systemerrlog" filesetDir="${systemerr.log.file}"/>
</analyze_files_v2>

Listing 1 also illustrates a minor change in how the pattern document itself is identified. This change is not mandatory and it doesn't necessarily rise even to the level of a best practice. It may, however, be a practice that you find convenient to follow. Previously, the pattern file to use for an analysis activity was specified explicitly by name in the custom Ant task used to invoke the analysis. For example, the following lines are taken from Part 5's Figure 4:

<analyze_files problem="${collect_wps_information_common_ProblemType}" 
   patternFile=
      "${portal.shared.targets.bundle.basedir}/properties/wps/pattern_template.xml"
   ...

In this case, the pattern file /properties/wps/pattern_template.xml is identified explicitly by name. Contrast this with the following lines from Listing 1:

<analyze_files_v2 problem="${collect_wps_information_common_ProblemType}"
   patternFile=
      "${portal.shared.targets.bundle.basedir}/properties/wps/${wpsCommonPatternFile}"
   ...

Here, the pattern file to use is identified with an Ant property value. The referenced property wpsCommonPatternFile is set in the tool's properties/wps/portal-script-initialization.properties file:

wpsCommonPatternFile=pattern_template_E.xml

The reason for doing it this way is to make it easy to switch among different pattern files. With extended analysis, a pattern file provides much more flexibility than it did previously. Rather than forcing you to edit a pattern file in order to change the way analysis works, you can include several pattern files with your collection scripts and switch among them by modifying the value of this one Ant property.

In this example, WebSphere Portal includes several pattern files with its scripts: pattern_template_E.xml, pattern_template_EW.xml, and pattern_template_EX.xml. The difference between these pattern files lies in the log records that each of them regards as interesting. For pattern_template_E.xml, an interesting log record is one that contains an error message (one with a message ID ending in 'E'). For pattern_template_EW.xml, an interesting log record is one that contains either an error message or a warning message (one with a message ID ending in 'E' or 'W'). For pattern_template_EX.xml, an interesting log record is one that contains an error message or an exception message (one with a message ID ending in 'E' or 'X'). The choice of pattern file thus depends on whether including warnings or exceptions alongside errors makes problem diagnosis easier, because key warnings or exceptions are not missed, or harder, because the extra entries clutter the report and tend to mask the errors. The effects of this choice of pattern files are shown in Figures 1 and 2. The analysis that produced the report in Figure 1 selected both warnings and errors, while only errors were selected for Figure 2.
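The effect of choosing one pattern file over another can be sketched in a few lines of Python. This is only an illustration, not the tool's code: the three regular expressions are the indicatorPattern values this article quotes for the three pattern files, SECJ0369E is the error code from Figure 2, and the other two message IDs are made up for the example.

```python
import re

# indicatorPattern values quoted in this article for each pattern file.
INDICATOR_PATTERNS = {
    "pattern_template_E.xml": r"[A-Z]{4,5}[0-9]{1,4}E",
    "pattern_template_EW.xml": r"[A-Z]{4,5}[0-9]{1,4}[EW]",
    "pattern_template_EX.xml": r"[A-Z]{4,5}[0-9]{1,4}[EX]",
}

# SECJ0369E appears in Figure 2; the W and X IDs are hypothetical.
SAMPLE_IDS = ["SECJ0369E", "ABCD0001W", "ABCD0002X"]


def interesting_ids(pattern_file, message_ids):
    """Return the IDs that the given pattern file would treat as interesting."""
    pattern = re.compile(INDICATOR_PATTERNS[pattern_file])
    return [mid for mid in message_ids if pattern.search(mid)]


for name in INDICATOR_PATTERNS:
    print(name, interesting_ids(name, SAMPLE_IDS))
```

Only the error ID qualifies under pattern_template_E.xml, while the EW and EX variants each admit one additional ID, which is the trade-off between completeness and clutter described above.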


Specifying the extended analysis extension

To enable extended analysis on a log file, you need to specify one or more extensions to provide additional configuration values to use in the analysis. The XML configuration of the extended analysis function involves two parts:

  • A <fileset> element in the pattern document contains one or more <additionalProcessing> child elements, indicating the specific types of additional processing that should be invoked against the log files in that fileset. These <additionalProcessing> elements contain configuration values specific to that particular fileset.
  • Each of these <additionalProcessing> elements contains an attribute file that points to a second XML document with additional details about the extended analysis. The configuration values in this second document are more general, applying to extended analysis for multiple filesets.

As an example, Listing 2 shows the <fileset> element that configured the extended analysis whose results are shown in Figure 2. This element provides instructions for log files whose names begin with SystemOut; therefore, it is the one used for Figure 2's SystemOut-00.log. Other <fileset> elements will provide instructions for log files with other names.


Listing 2. A <fileset> element with an <additionalProcessing> extension

<fileset name="systemoutlog" value="SystemOut.*\.log">
   <delimiterid id="delimiter4"/>
   <additionalProcessing
      timeStampPattern="([0-9]{1,2}/[0-9]{1,2}/[0-9]{1,2}\ [0-9]{1,2}[:.][0-9]{1,2}[:.][0-9]{1,2}:[0-9]{1,3}\ [A-Z]{1,4})"
      type="search"
      indicatorPattern="[A-Z]{4,5}[0-9]{1,4}E"
      searchStringPattern="([A-Z]{4,5}[0-9]{1,4}E)"
      file="properties/wps/analysis/searchUrlSpecification-Portal_v6x.xml"/>
</fileset>
 

Now, let's examine each of the attributes of this <additionalProcessing> element:

  • timeStampPattern

    In order to supply the timestamps for the earliest and latest occurrences of log records having a given error ID, the tool must be told how the log file encodes timestamps. In many cases this will be the same pattern that appears in the <delimiter> element referenced by the fileset's delimiterid attribute if the timestamp serves as the delimiter that identifies the start of a new log record. But the two values need not be the same. This attribute is optional because some log files do not contain timestamps in their log records. In this case, the report will still contain a hyperlink to the most recent log record, but there will be no timestamps for the earliest and latest occurrences.

  • type

    This attribute indicates the specific type of extended analysis to be performed on the log files in the fileset. The tool currently supports only one value for this attribute: search.

  • indicatorPattern

    This is the regular expression pattern that qualifies which log records are interesting. Any log record that contains a match for this pattern proceeds to the next step of processing, controlled by the regular expression in the searchStringPattern attribute. Log records that do not match it are not processed any further. It is here that you see the differences among the three pattern files pattern_template_E.xml, pattern_template_EW.xml, and pattern_template_EX.xml. The value of this attribute in these pattern files is, respectively, [A-Z]{4,5}[0-9]{1,4}E, [A-Z]{4,5}[0-9]{1,4}[EW], and [A-Z]{4,5}[0-9]{1,4}[EX].

  • searchStringPattern

    This pattern is applied to log records that match the indicator pattern. If a log record also returns a match on this pattern, then the first substring in the record that matches the pattern will be used in the analysis report. In the example in Listing 2, the indicatorPattern and the searchStringPattern attributes contain the same regular expression pattern. They differ only syntactically: this attribute surrounds the basic pattern with capturing parentheses, while indicatorPattern does not. In other cases, however, the two patterns may differ. A log record may contain an initial, high-level indication that it is reporting an error, while the details of the error are represented by a different, more granular code that appears later in the log record. In such a case, indicatorPattern would be used to match the high-level indication, and searchStringPattern would then match the granular code that represents the specific condition reported in the log record.

  • file

    This attribute points to the XML document that completes the configuration of the extended analysis for SystemOut.log.

When the tool performs extended analysis on a log file, it combines the configuration values it receives in the <additionalProcessing> element with those it receives in the referenced document. Values that are (or could be) specific to a single fileset appear as attribute values in the <additionalProcessing> element. Those that apply to multiple filesets appear in the referenced document.
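To make the roles of the three pattern attributes concrete, here is a minimal Python sketch that applies the values from Listing 2 to a single log record. It is an approximation under stated assumptions: the tool's own regular expression engine may differ from Python's in details, the function name examine_record is hypothetical, and both sample log records are invented for the example.

```python
import re

# Attribute values copied from Listing 2, with timeStampPattern written
# as one regular expression.
TIME_STAMP_PATTERN = (
    r"([0-9]{1,2}/[0-9]{1,2}/[0-9]{1,2}\ [0-9]{1,2}[:.]"
    r"[0-9]{1,2}[:.][0-9]{1,2}:[0-9]{1,3}\ [A-Z]{1,4})"
)
INDICATOR_PATTERN = r"[A-Z]{4,5}[0-9]{1,4}E"
SEARCH_STRING_PATTERN = r"([A-Z]{4,5}[0-9]{1,4}E)"


def examine_record(record):
    """Return (timestamp, code) for an interesting record, else None."""
    # Step 1: prequalify the record with the indicator pattern.
    if not re.search(INDICATOR_PATTERN, record):
        return None
    # Step 2: take the first substring matching the search string pattern.
    code_match = re.search(SEARCH_STRING_PATTERN, record)
    if code_match is None:
        return None
    # The timestamp is optional; some logs do not time stamp their records.
    ts_match = re.search(TIME_STAMP_PATTERN, record)
    timestamp = ts_match.group(1) if ts_match else None
    return timestamp, code_match.group(1)


# Hypothetical log records, invented for this sketch.
error_record = "[10/30/07 14:05:12:345 EDT] 0000000a ExampleComp E SECJ0369E: sample text"
info_record = "[10/30/07 14:05:13:001 EDT] 0000000a ExampleComp I ABCD0001I: sample text"

print(examine_record(error_record))
print(examine_record(info_record))
```

The informational record fails the indicator test and is dropped before any further matching is attempted; the error record yields the time stamp and the code that would appear in the report.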


Extension-specific configuration documents

Continuing with the example in Figure 2, examine the document referenced by the <additionalProcessing> element's file attribute. This same document is used for several filesets (SystemOut.log, SystemErr.log, wps_<timestamp>.log) because it contains nothing that's specific to a particular log file or log file format.

Listing 3 shows the entire searchUrlSpecification-Portal_v6x.xml document that is referenced by the file attribute in Listing 2.


Listing 3. XML document with additional values for extended analysis
<?xml version="1.0" encoding="UTF-8"?>
<specificationList xmlns="http://www.ibm.com/autopd/SearchUrlSpec"
   xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
      xsi:schemaLocation="http://www.ibm.com/autopd/SearchUrlSpec
         ../../analysis/searchUrlSpecification.xsd"
      productName="IBM WebSphere Portal Server">
   <searchUrlSpecification
      hostName="www.ibm.com"
      hostPage="support/search.wss"
      presentationString="Technotes">
         <trailingTerm termType="rs" termValue="688"/>
         <trailingTerm termType="tc" termValue="SSHRKX"/>
         <trailingTerm termType="word" termValue="aw"/>
         <trailingTerm termType="wfield" termValue="!!SearchString!!"/>
         <trailingTerm termType="nw" termValue=""/>
         <trailingTerm termType="apar" termValue="exclude"/>
         <trailingTerm termType="atrwcs" termValue="on"/>
         <trailingTerm termType="rankprofile" termValue="8"/>
   </searchUrlSpecification>
   <searchUrlSpecification
      hostName="www-111.ibm.com"
      hostPage="search/SupportSearchWeb/SupportSearch"
      presentationString="Known problems (APARs)">
         <trailingTerm termType="action" termValue="search" />
         <trailingTerm termType="pageCode" termValue="MPS" />
         <trailingTerm termType="rsid" termValue="203" />
         <trailingTerm termType="products" termValue="" />
         <trailingTerm termType="sortBy" termValue="3" />
         <trailingTerm termType="pageNumber" termValue="1"/>
         <trailingTerm termType="searchTerms" termValue="!!SearchString!!"/>
         <trailingTerm termType="searchLimitsFilter" termValue="DB550"/>
         <trailingTerm termType="sortByFilter" termValue="3"/>
         <trailingTerm termType="setFilter.x" termValue="13"/>
         <trailingTerm termType="setFilter.y" termValue="13"/>
         <trailingTerm termType="setFilter" termValue="submit"/>
   </searchUrlSpecification>
</specificationList>


If you compare the contents of this document to Figure 2, it is easy to see what's going on. The first <searchUrlSpecification> element provides the instructions for building the line in the report that says Technotes. Similarly, the second element provides instructions for building the line that says Known problems (APARs). In both cases, most of the instructions detail how to build the search URLs that are hyperlinked to the text. The special string !!SearchString!! indicates to the tool where to insert the particular error code for which the search will be performed. You can see how the tool combines the <trailingTerm> elements if you examine the search URL for the first Technotes line in the report, which is shown in Listing 4.


Listing 4. Search URL for the first Technote search
http://www.ibm.com/support/search.wss?
rs=688&
tc=SSHRKX&
word=aw&
wfield=SECJ0369E&
nw=&
apar=exclude&
atrwcs=on&
rankprofile=8

It is easy to modify the scope of the search or the presentation of the search results by modifying some of the trailing terms. It is also possible to direct the search to a completely different search engine by varying the host name or host page. An XML schema file /properties/analysis/searchUrlSpecification.xsd is provided with the tool so you can verify that your XML specification document is valid.
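The way the trailing terms combine into a search URL can be sketched as follows. This is an illustration only: the function name build_search_urls is hypothetical, and the http:// scheme, the plain in-order joining with "&", and the absence of any URL encoding are assumptions inferred from Listing 4 rather than documented behavior of the tool. The XML is the first <searchUrlSpecification> from Listing 3.

```python
import xml.etree.ElementTree as ET

# First <searchUrlSpecification> from Listing 3 (schema attributes omitted).
SPEC_XML = """<specificationList xmlns="http://www.ibm.com/autopd/SearchUrlSpec"
      productName="IBM WebSphere Portal Server">
   <searchUrlSpecification
      hostName="www.ibm.com"
      hostPage="support/search.wss"
      presentationString="Technotes">
         <trailingTerm termType="rs" termValue="688"/>
         <trailingTerm termType="tc" termValue="SSHRKX"/>
         <trailingTerm termType="word" termValue="aw"/>
         <trailingTerm termType="wfield" termValue="!!SearchString!!"/>
         <trailingTerm termType="nw" termValue=""/>
         <trailingTerm termType="apar" termValue="exclude"/>
         <trailingTerm termType="atrwcs" termValue="on"/>
         <trailingTerm termType="rankprofile" termValue="8"/>
   </searchUrlSpecification>
</specificationList>"""

NS = {"s": "http://www.ibm.com/autopd/SearchUrlSpec"}
PLACEHOLDER = "!!SearchString!!"


def build_search_urls(spec_xml, search_string):
    """Map each presentationString to the search URL built for search_string."""
    root = ET.fromstring(spec_xml)
    urls = {}
    for spec in root.findall("s:searchUrlSpecification", NS):
        terms = []
        for term in spec.findall("s:trailingTerm", NS):
            # Substitute the error code wherever the placeholder appears.
            value = term.get("termValue").replace(PLACEHOLDER, search_string)
            terms.append(f"{term.get('termType')}={value}")
        # Assumption: terms are joined in document order, without encoding.
        url = f"http://{spec.get('hostName')}/{spec.get('hostPage')}?" + "&".join(terms)
        urls[spec.get("presentationString")] = url
    return urls


print(build_search_urls(SPEC_XML, "SECJ0369E")["Technotes"])
```

For SECJ0369E this produces the URL shown in Listing 4, written on a single line. Changing a termValue or adding a <trailingTerm> element changes the query string accordingly, which is how the scope of the search can be adjusted.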


Message catalog lookup

The first line in each entry in the analysis report shown in Figure 2 contains an error code extracted from the log records. Clicking the hyperlink takes you to IBM product documentation that includes, but is not limited to, the message catalog containing all of the individual entries for the code's message prefix. The form of this hyperlink is currently hard-coded in the tool, so it does not require any explicit XML configuration. For example, the hyperlink on the first error code shown in Figure 2 (SECJ0369E) is http://www.google.com/search?q=+SECJ0369E+site%3Aboulder.ibm.com&btnG=Search. In other words, it is a Google search for the error code restricted to the IBM product site in Boulder, CO (boulder.ibm.com).
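Because the link format is fixed, it can be reproduced outside the tool. The following Python sketch rebuilds the example URL for any error code; the function name message_catalog_url is hypothetical, and the sketch mirrors only the one URL quoted above, not the tool's internal code.

```python
from urllib.parse import quote_plus


def message_catalog_url(error_code):
    """Build a Google search URL limited to boulder.ibm.com for the code."""
    # The leading space reproduces the "+" that precedes the code in the
    # example URL quoted in this article.
    query = f" {error_code} site:boulder.ibm.com"
    return f"http://www.google.com/search?q={quote_plus(query)}&btnG=Search"


print(message_catalog_url("SECJ0369E"))
```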

This message catalog lookup replaces the more limited lookup function that was present in earlier versions of the tool. The previous function provided local links to message entries extracted from copies of selected message catalogs that were included in the tool's directory structure when it was extracted on a target system, but the function was limited to two specific versions of WebSphere Portal. The new message catalog lookup function is better than the original one in a number of ways, but it also has some disadvantages.

Advantages of the new message catalog lookup function are:

  1. It works for all error codes for all IBM software products, not just for selected error codes for two versions of WebSphere Portal.
  2. It links to other product documentation related to an error code rather than only to the message catalog.
  3. It links to current documentation for an error code on the IBM Boulder Web site rather than to a snapshot of the documentation taken at the time the tool was last released.

Disadvantages of the new message catalog lookup function are:

  1. It requires that you have an active Internet connection when viewing the analysis report.
  2. Because of a limitation in the online message catalogs, it always positions you at the top of the catalog containing the message code in question rather than on that specific code within the catalog.

Scalability of the extended analysis function

In designing the extended analysis function for the IBM Support Assistant Lite tool, one of the issues that the designers had to keep in mind was that of scalability. Because of scalability concerns, the designers were forced to reject what in some ways was the most natural way of approaching the problem of finding additional information for the problems reported in a log file: constructing a separate regular expression pattern to identify each problem type, and then, when a particular problem type was detected in the current log file, providing links to information for that specific type of problem.

The flaw in this approach is that it requires a separate regular expression pattern for each type of problem: 10 problem types require 10 patterns, 100 problem types require 100 patterns, and so on. This linear growth in the number of patterns compounds with three other factors: the number of log records that can be present in a log file, the fact that any of these log records might match any of the patterns, and the performance overhead inherent in any regular expression processing. The designers quickly reached the conclusion that this idea would not perform well.

What the designers implemented instead is an approach that uses a single regular expression pattern to identify all problems that might be reported in a log. With this approach, there is only one regular-expression operation per log record (to "prequalify" the log record as interesting based on the indicatorPattern value), and then a second operation applied to the much smaller set of prequalified log records to see if they match the searchStringPattern value. Using this approach, the extended analysis of a log file takes only a little longer to complete than the basic analysis that existed before.
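The two-stage approach, together with the per-code summary that appears in the report (count, earliest and latest time stamps), can be sketched in Python as follows. This is an illustration of the idea rather than the tool's implementation: the function name summarize and the sample records are hypothetical, the patterns are the ones from Listing 2, and the sketch assumes that records arrive in chronological order.

```python
import re

INDICATOR = re.compile(r"[A-Z]{4,5}[0-9]{1,4}E")
SEARCH_STRING = re.compile(r"([A-Z]{4,5}[0-9]{1,4}E)")
TIME_STAMP = re.compile(
    r"([0-9]{1,2}/[0-9]{1,2}/[0-9]{1,2}\ [0-9]{1,2}[:.]"
    r"[0-9]{1,2}[:.][0-9]{1,2}:[0-9]{1,3}\ [A-Z]{1,4})"
)


def summarize(records):
    """Group interesting records by error code with count and time range."""
    summary = {}
    for record in records:
        # Stage 1: a single test per record, however many codes exist.
        if not INDICATOR.search(record):
            continue
        # Stage 2: runs only on the smaller set of prequalified records.
        match = SEARCH_STRING.search(record)
        if match is None:
            continue
        code = match.group(1)
        ts = TIME_STAMP.search(record)
        stamp = ts.group(1) if ts else None
        entry = summary.setdefault(
            code, {"count": 0, "earliest": stamp, "latest": stamp}
        )
        entry["count"] += 1
        entry["latest"] = stamp  # assumes chronological record order
    return summary


# Hypothetical log records, invented for this sketch.
records = [
    "[10/30/07 14:05:12:345 EDT] 0000000a ExampleComp E SECJ0369E: sample text",
    "[10/30/07 14:05:13:001 EDT] 0000000a ExampleComp I ABCD0001I: sample text",
    "[10/30/07 14:06:00:000 EDT] 0000000b ExampleComp E ABCD0002E: sample text",
    "[10/30/07 14:07:30:500 EDT] 0000000a ExampleComp E SECJ0369E: sample text",
]

for code, entry in summarize(records).items():
    print(code, entry)
```

The number of regular expression operations grows with the number of log records only; adding a new error code to a product requires no new pattern.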


Checklist for using the extended analysis function in your collections

This article concludes with a checklist of the steps you must take to add extended analysis to your own collections. As you review this list, it will be helpful to have in front of you the XML documents, collection scripts, and other files that configure the extended analysis function for the WebSphere Portal collections. In the IBM Support Assistant Lite for WebSphere Portal tool, these files are available in the following three subdirectories:

  • /properties/wps
  • /properties/wps/analysis
  • /scripts/wps

In the IBM Support Assistant tool these subdirectories are all available in the plug-in for WebSphere Portal. You won't be modifying these specific files (because that would change the behavior of the WebSphere Portal collections). Instead, you'll copy them to your own plug-in to use as starting points for your own configuration choices.

  1. Following the detailed instructions here and in Parts 3 and 5 of this series, get your collection scripts and your pattern file set up to do basic analysis. Remember that your scripts will need to invoke the latest version of the analysis task, <analyze_files_v2>.
  2. In each of the analysis profiles where you want to do extended analysis, add a child element <additionalProcessing> under each <fileset> element, telling the tool how to do extended analysis for that fileset. The type attribute in this <additionalProcessing> element must be set to search, and the three attributes with the regular expression patterns must have values that correspond to the contents of the log files in the fileset. Finally, the file attribute must point to a document that resides in the same IBM Support Assistant plug-in that contains your pattern file. You'll probably want to point to the same extension document from all of your <additionalProcessing> elements, but it's possible to point to different extension documents if you need to.
  3. Create your extension documents using as your model one of the documents for WebSphere Portal: /properties/wps/analysis/searchUrlSpecification-Portal_v5x.xml or /properties/wps/analysis/searchUrlSpecification-Portal_v6x.xml. You can validate your document using the schema included with the tool at /properties/analysis/searchUrlSpecification.xsd. In the IBM Support Assistant environment, this schema document is included in the shared elements plug-in. You will probably need to experiment a bit with the contents of your extension document to see which combination of fields in the search URL gives you the desired effect.
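Before running a collection, it can help to sanity-check the <additionalProcessing> elements described in step 2. The following Python sketch is a hypothetical helper, not part of the tool: the function name check_fileset is invented, the sample XML is the <fileset> element from Listing 2, and compiling the patterns with Python's re module is only an approximation of how the tool's own regular expression engine will treat them. The timeStampPattern attribute is treated as optional, as described earlier in this article.

```python
import re
import xml.etree.ElementTree as ET

# The <fileset> element from Listing 2.
FILESET_XML = r"""<fileset name="systemoutlog" value="SystemOut.*\.log">
   <delimiterid id="delimiter4"/>
   <additionalProcessing
      timeStampPattern="([0-9]{1,2}/[0-9]{1,2}/[0-9]{1,2}\ [0-9]{1,2}[:.][0-9]{1,2}[:.][0-9]{1,2}:[0-9]{1,3}\ [A-Z]{1,4})"
      type="search"
      indicatorPattern="[A-Z]{4,5}[0-9]{1,4}E"
      searchStringPattern="([A-Z]{4,5}[0-9]{1,4}E)"
      file="properties/wps/analysis/searchUrlSpecification-Portal_v6x.xml"/>
</fileset>"""

REGEX_ATTRIBUTES = ("timeStampPattern", "indicatorPattern", "searchStringPattern")


def check_fileset(fileset_xml):
    """Return a list of problems found in the <additionalProcessing> children."""
    problems = []
    fileset = ET.fromstring(fileset_xml)
    extensions = fileset.findall("additionalProcessing")
    if not extensions:
        problems.append("no <additionalProcessing> element")
    for ext in extensions:
        if ext.get("type") != "search":
            problems.append("type must be 'search'")
        if not ext.get("file"):
            problems.append("file attribute is missing")
        for name in REGEX_ATTRIBUTES:
            value = ext.get(name)
            if value is None:
                # timeStampPattern may be omitted for logs without timestamps.
                if name != "timeStampPattern":
                    problems.append(f"{name} is missing")
                continue
            try:
                re.compile(value)
            except re.error as exc:
                problems.append(f"{name} does not compile: {exc}")
    return problems


print(check_fileset(FILESET_XML))
```

An empty list means that no problems were found. The check does not replace validating the extension document against searchUrlSpecification.xsd.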

Conclusion

With the release of the extended analysis function in IBM Support Assistant Lite v1.2.4, the tool now provides you with more useful information than ever before. The hyperlinks to IBM product and support pages for an error that has occurred on the target system give you more information for finding the solution to the current problem, whether you are a customer administrator or an IBM Support engineer.

The design and implementation of the extended analysis function follows the direction set in the tool's original analysis implementation: very general capabilities that are easily configurable simply by editing XML documents. The effect is to make it easy for a collection script developer to take advantage of it.


Resources
