Transforming and analyzing log files
To allow ongoing monitoring, a complex application -- and certainly an application expected to run continuously -- is typically instrumented during development to emit a log file, which is a record of application activity. Some of the logged activity is detailed internal diagnostics: information crucial for isolating a bug or untangling interactions with other system and software components. Some is initiated by the application itself -- say, reading a configuration file or opening a port for listening. Other activity is generated by requests for service.
Depending on the application's purpose, a systems administrator might review the program's corresponding log file from time to time -- when an error occurs or even in real time to react to emergent events. Logs are often full of valuable historical information, too. Think of the traffic and usage patterns found only in Apache HTTP Server logs, for instance.
It would be ideal if all log files captured at least a minimum of information. It would be even better -- certainly from a systems administrator's point of view -- if the format of all log files were uniform. Consistency would make reading logs far easier, and homogeneity would certainly facilitate (not to mention reduce the cost of) the development of automated tools that separate vital events from the merely informational.
But uniformity is not reality. Applications differ greatly (as do underlying operating system facilities and programming language libraries). Some applications are entrenched ("legacy applications") and cannot be revised to bring them into line. And it's an ugly truth that expensive and scarce developer cycles are usually spent on new features, not retrofits.
Short of the ideal and realizing that one solution cannot ever fit all, it is far more practical to transform log file data to meet evolving standards, de facto or otherwise, and to apply state-of-the-art analysis tools. For example, the Common Base Event (CBE) format is part of an effort to define a broad standard for recording, tracking, and analyzing events, which are occurrences and situations that take place in computing systems. CBE data is based on XML, and many tools exist to process and analyze it.
But while transformation from an arbitrary log file to CBE may be practical, the process may not be easy or inexpensive. Given the variety of applications and the sheer number of log file formats, writing so many transforms can be a Herculean task in itself.
The Eclipse Test and Performance Tools Platform (TPTP) Generic Log Adapter (GLA) and Adapter Configuration Editor simplify the creation of transforms, thereby easing the migration to CBE. The GLA applies an adapter created by the Adapter Configuration Editor to a log file and yields CBE data. The GLA can run a handmade Java class if need be -- a static adapter -- or it can run a series of rules to divide the log file into records, fields, and values and reassemble them as CBE data. The latter form of adapter is a rules-based adapter and requires no coding. Better yet, the Adapter Configuration Editor runs in Eclipse and provides a rich adapter development environment in which you can incrementally define and test your adapter. Finally, you can choose to integrate the GLA with your own code or use third-party tools, such as the IBM Log and Trace Analyzer, to probe and investigate the resulting CBE event files.
This tutorial shows how to use the capabilities of the Eclipse TPTP GLA and Adapter Configuration Editor to convert a typical Linux application log file to CBE events. With a log file in hand and a little regular expression know-how, you can transform the log into a unified, structured CBE format.
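To make the rules-based idea concrete before turning to the tools, here is a minimal Python sketch of the same pipeline: split a log into records, apply a regular expression to pull out fields, and reassemble the values as XML. It is not how the GLA is implemented, and it is not a rules-based adapter file. The sample log lines, the function names to_event and transform, and the syslog-like layout are all assumptions made for illustration. The element and attribute names are only loosely modeled on CBE, so the output is not schema-valid CBE (a real event would need, among other things, a properly formatted creation time).

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical sample in a syslog-like layout; not taken from a real application.
SAMPLE_LOG = """\
Oct 11 22:14:15 web01 myapp[1234]: listening on port 8080
Oct 11 22:14:17 web01 myapp[1234]: error: cannot read /etc/myapp.conf
"""

# One "rule": a regular expression that splits a record into named fields.
RECORD_RULE = re.compile(
    r"^(?P<timestamp>\w{3}\s+\d{1,2} \d{2}:\d{2}:\d{2}) "
    r"(?P<host>\S+) "
    r"(?P<component>[^\[\s]+)\[(?P<pid>\d+)\]: "
    r"(?P<message>.*)$"
)


def to_event(line):
    """Turn one log record into a simplified, CBE-like XML element.

    Returns None when the record does not match the rule.
    """
    match = RECORD_RULE.match(line)
    if match is None:
        return None
    fields = match.groupdict()
    # Simplification: the raw timestamp is copied as-is, not normalized.
    event = ET.Element("CommonBaseEvent", {
        "creationTime": fields["timestamp"],
        "msg": fields["message"],
    })
    ET.SubElement(event, "sourceComponentId", {
        "location": fields["host"],
        "component": fields["component"],
        "processId": fields["pid"],
    })
    return event


def transform(log_text):
    """Treat each line as one record and keep only records that match."""
    events = []
    for line in log_text.splitlines():
        event = to_event(line)
        if event is not None:
            events.append(event)
    return events


if __name__ == "__main__":
    for event in transform(SAMPLE_LOG):
        print(ET.tostring(event, encoding="unicode"))
```

Even in this toy form, the division of labor is visible: the regular expression carries all knowledge of the log format, while the code that builds the XML never changes. That separation is what lets a rules-based adapter handle a new log format without new code.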