Editing the extractor
The role of the sensor is to collect input. The role of the extractor is to divide the incoming input stream into individual records. (The next component in the chain -- the parser -- divides each record into fields.)
Configure the extractor properties
To edit the extractor, click Extractor. Its properties are shown in Figure 6. The properties of the extractor specify the delimiters of each record and control whether those delimiters should be included in the record passed on to the parser.
Figure 6. The extractor properties
In the example log file, daemon.log, each line of the log is a separate event. This makes the extractor particularly easy to configure. (Figure 6 is the appropriate configuration for daemon.log.)
- The Contains line breaks check box is cleared, because each line in daemon.log is a record. However, if an entry were to span many lines, as is the case with MySQL or IBM DB2® database logs, you'd select this check box.
- The Replace line breaks check box is also cleared in this example. If the log file contained line breaks, though, you could select this check box to either delete each line break or replace each one with a special marker -- useful for parsing. To delete line breaks, simply select the check box; to replace each line break with a token, select the check box and provide the delimiter in the Line break symbol field. It's best to choose a symbol that doesn't appear in the log file.
- The Start pattern and End pattern are regular expressions that describe the start and end of each record. Here, where each line is a record, the beginning of the line, or ^ (caret), marks the start of the record. The end of the line, or $ (dollar sign), marks the end of each record. Because ^ and $ do not capture any content, neither need be included in the record itself.
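To make the effect of the ^ and $ patterns concrete, here is a minimal Python sketch of the same idea outside the tool. It is only an illustration, not the adapter's implementation: the function name `extract_line_records` and the sample input are made up, and Python's `re` module stands in for whatever regular expression engine the adapter uses.

```python
import re

# Assumed equivalents of the Start pattern and End pattern fields.
START_PATTERN = r"^"
END_PATTERN = r"$"


def extract_line_records(text):
    """Split text into one record per line using zero-width anchors."""
    record_re = re.compile(START_PATTERN + r"(.*?)" + END_PATTERN, re.MULTILINE)
    # Skip empty matches so blank lines do not become records.
    return [m.group(1) for m in record_re.finditer(text) if m.group(1)]


sample = "first event\nsecond event\n"  # made-up input, not from daemon.log
records = extract_line_records(sample)
print(records)
```

Because both anchors match positions rather than characters, the line breaks never end up inside a record, which mirrors the point that neither delimiter needs to be included.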
Save your work before continuing.
For comparison, create another example extractor for MySQL's slow query log, a special log used to capture suboptimal queries. Each entry in the slow query log spans at least three lines (see Listing 13).
Listing 13. A snippet of MySQL's slow query log
# Time: 030207 15:03:33
# Query_time: 13 Lock_time: 0 Rows_sent: 0 Rows_examined: 0
SELECT l FROM un WHERE ip='209.xx.xxx.xx';
# Time: 030207 15:03:42
# Query_time: 17 Lock_time: 1 Rows_sent: 0 Rows_examined: 0
SELECT l FROM un WHERE ip='214.xx.xxx.xx';
# Time: 030207 15:03:43
# Query_time: 57 Lock_time: 0 Rows_sent: 2117 Rows_examined: 4234
SELECT c,cn,ct FROM cr,l,un WHERE ci=lt AND lf='MP' AND ui=cu;
An extractor for the slow query log might look something like Figure 7.
Figure 7. A sample extractor for the MySQL slow query log
Figure 8 shows the second of the three records, each successfully processed by the extractor.
Figure 8. An extracted record from the slow query log
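As a rough illustration of multi-line extraction, the following Python sketch splits the first two entries of Listing 13 into records. It assumes that each record begins at a line starting with `# Time:`; the patterns actually configured in Figure 7 may differ. The function `extract_multiline_records` and its `line_break_symbol` parameter are hypothetical names that loosely mimic the Replace line breaks option.

```python
import re

# Assumption: a record starts at each "# Time:" line.
START_PATTERN = re.compile(r"^# Time:", re.MULTILINE)


def extract_multiline_records(text, line_break_symbol=None):
    """Split text into records that each begin at a START_PATTERN match.

    line_break_symbol=None keeps line breaks, "" deletes them, and any
    other string replaces each line break inside a record.
    """
    starts = [m.start() for m in START_PATTERN.finditer(text)]
    records = []
    # Each record runs from its own start to the next start (or end of input).
    for begin, end in zip(starts, starts[1:] + [len(text)]):
        record = text[begin:end].rstrip("\n")
        if line_break_symbol is not None:
            record = record.replace("\n", line_break_symbol)
        records.append(record)
    return records


slow_log = """\
# Time: 030207 15:03:33
# Query_time: 13 Lock_time: 0 Rows_sent: 0 Rows_examined: 0
SELECT l FROM un WHERE ip='209.xx.xxx.xx';
# Time: 030207 15:03:42
# Query_time: 17 Lock_time: 1 Rows_sent: 0 Rows_examined: 0
SELECT l FROM un WHERE ip='214.xx.xxx.xx';
"""

records = extract_multiline_records(slow_log, line_break_symbol=" @@ ")
for record in records:
    print(record)
```

The marker ` @@ ` is an arbitrary choice; as with the Line break symbol field, any token works as long as it does not occur in the log itself.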
Returning to the daemon.log adapter, you can now test the sensor and extractor components to verify that data is being acquired and divided into records.
Glance at the two panes at the bottom of the Generic Log Adapter perspective. You should see something resembling Figure 9. At left is the Extractor Result pane; at right, layered, are the Formatter Result pane, the Sensor Result pane, and the Problems pane. A series of buttons that control the adapter appears within the Extractor Result pane. Figure 10 labels the buttons (or you can hover over each button to see a tool tip).
Figure 9. Context components display panes
Figure 10. The adapter control buttons
Click Rerun adapter to restart processing from the beginning of the log file template. Then click Next event to process the first event.
- The Sensor Result pane should show the first 10-20 lines of the log file.
- The Extractor Result pane should show the first line of the log file, Mar 2 06:27:35 db popa3d[7964]: Session from 71.65.224.25.
- The Problems pane should be empty. However, pay close attention to this pane whenever you run your adapter. If you've omitted required CBE properties, specified an illegal regular expression, or used an unsupported value, this pane should point those out.
- The Formatter Result pane is irrelevant because a parser has yet to be defined. However, it does show an initial XML CBE for the current record:
Listing 14. Initial XML CBE for current record
<CommonBaseEvent creationTime="replace with our message text" globalInstanceId="A1DAABE6C7876D20E8E9E8C475042F1B" version="1.0.1">
</CommonBaseEvent>
As you define your parser, additional elements and attributes are automatically added to the XML.
To have the extractor produce the next record, click Next event again. To fast-forward to the last record (in the input the sensor has collected so far), click Show last event.
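The stepping behavior of these buttons can be pictured with a small Python model. This is purely a hypothetical sketch: `RecordStepper` and its methods are invented names, not part of the Generic Log Adapter, and the real tool may behave differently at the edges (for example, when no further records are available).

```python
class RecordStepper:
    """Hypothetical model of Rerun adapter, Next event, and Show last event."""

    def __init__(self, records):
        self._records = list(records)
        self._index = -1  # no event has been processed yet

    def rerun(self):
        """Restart processing from the beginning of the input."""
        self._index = -1

    def next_event(self):
        """Advance to the next record, or return None if none remain."""
        if self._index + 1 >= len(self._records):
            return None
        self._index += 1
        return self._records[self._index]

    def show_last_event(self):
        """Jump to the last record collected so far."""
        if not self._records:
            return None
        self._index = len(self._records) - 1
        return self._records[self._index]


stepper = RecordStepper(["record 1", "record 2", "record 3"])
first = stepper.next_event()
second = stepper.next_event()
last = stepper.show_last_event()
after_last = stepper.next_event()
stepper.rerun()
again = stepper.next_event()
print(first, second, last, after_last, again)
```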



