Building regular expression patterns

To create a log source extension, you use regular expressions (regex) to match strings of text from the unsupported log source.

About this task

The following example shows three log entries that are referenced in the steps.

May 20 17:24:59 kernel: DROP MAC=<MAC_address> SRC=<Source_IP_address> DST=<Destination_IP_address> LEN=351 TOS=0x00 PREC=0x00 TTL=64 ID=9582 PROTO=UDP SPT=67 DPT=68 LEN=331
May 20 17:24:59 kernel: PASS MAC=<MAC_address> SRC=<Source_IP_address> DST=<Destination_IP_address> LEN=351 TOS=0x00 PREC=0x00 TTL=64 ID=9583 PROTO=TCP SPT=1057 DPT=80 LEN=331
May 20 17:24:59 kernel: REJECT MAC=<MAC_address> SRC=<Source_IP_address> DST=<Destination_IP_address> LEN=351 TOS=0x00 PREC=0x00 TTL=64 ID=9584 PROTO=TCP SPT=25212 DPT=6881 LEN=331

Procedure

  1. Visually analyze the unsupported log source to identify unique patterns.

    These patterns are later translated into regular expressions.

  2. Find the text strings to match.
    Tip: To provide basic error checking, include characters before and after the values to prevent similar values from being unintentionally matched. You can later isolate the actual value from the extra characters.
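The effect of the tip can be sketched in Python. This is only an illustration: QRadar evaluates the patterns itself, and the MAC and IP values in the sample line are made up to stand in for the placeholders in the example log entry.

```python
import re

# Hypothetical log line: the MAC and IP values are invented for illustration.
line = (
    "May 20 17:24:59 kernel: DROP MAC=00:11:22:33:44:55 SRC=192.0.2.10 "
    "DST=198.51.100.7 LEN=351 TOS=0x00 PREC=0x00 TTL=64 ID=9582 "
    "PROTO=UDP SPT=67 DPT=68 LEN=331"
)

# A bare digit pattern matches the first digits it finds (the day of the month).
loose = re.search(r"\d{1,5}", line)

# Adding the surrounding characters pins the match to the source port.
anchored = re.search(r"\sSPT\=\d{1,5}\s", line)

print(repr(loose.group(0)))     # '20'
print(repr(anchored.group(0)))  # ' SPT=67 '
```

The anchored match still carries the extra characters, which is why the value is isolated in a later step.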
  3. Develop pseudo-code for matching patterns and include the space character to denote the beginning and end of a pattern.

    The quotation marks only make the leading and trailing spaces visible, so you can ignore them. In the example log entry, the event names are DROP, PASS, and REJECT. The following list shows the usable event fields.

    • EventName: " kernel: VALUE "
    • SourceMAC: " MAC=VALUE "
    • SourceIp: " SRC=VALUE "
    • DestinationIp: " DST=VALUE "
    • Protocol: " PROTO=VALUE "
    • SourcePort: " SPT=VALUE "
    • DestinationPort: " DPT=VALUE "
  4. Substitute a space with the \s regular expression.

    You must escape every character that is neither a digit nor a letter by placing a backslash in front of it. For example, = becomes \= and : becomes \:.
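The substitution and escaping rules in this step are mechanical, so they can be sketched as a small Python helper. The function name pseudo_to_regex and the idea of passing the value expression as a separate argument are assumptions made for this illustration; they are not part of QRadar.

```python
def pseudo_to_regex(pseudo: str, value_regex: str) -> str:
    """Hypothetical helper: turn pseudo-code such as ' SRC=VALUE ' into a regex."""

    def translate(text: str) -> str:
        parts = []
        for ch in text:
            if ch == " ":
                parts.append(r"\s")       # a space becomes \s
            elif ch.isalnum():
                parts.append(ch)          # letters and digits stay as they are
            else:
                parts.append("\\" + ch)   # anything else is escaped
        return "".join(parts)

    # Split around the VALUE placeholder and translate the literal text only.
    prefix, suffix = pseudo.split("VALUE")
    return translate(prefix) + value_regex + translate(suffix)


print(pseudo_to_regex(" kernel: VALUE ", r".*?"))
print(pseudo_to_regex(" SPT=VALUE ", r"\d{1,5}"))
```

The two printed lines are \skernel\:\s.*?\s and \sSPT\=\d{1,5}\s.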

  5. Translate the pseudo-code to a regular expression.
    Table 1. Translating pseudo-code to regular expressions

    Field            Pseudo-code          Regular expression
    EventName        " kernel: VALUE "    \skernel\:\s.*?\s
    SourceMAC        " MAC=VALUE "        \sMAC\=(?:[0-9a-fA-F]{2}\:){5}[0-9a-fA-F]{2}\s
    SourceIp         " SRC=VALUE "        \sSRC\=\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}\s
    DestinationIp    " DST=VALUE "        \sDST\=\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}\s
    Protocol         " PROTO=VALUE "      \sPROTO\=(TCP|UDP|ICMP|GRE)\s
    SourcePort       " SPT=VALUE "        \sSPT\=\d{1,5}\s
    DestinationPort  " DPT=VALUE "        \sDPT\=\d{1,5}\s
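Before you continue, it can help to confirm that each expression finds a match in every sample entry. The following Python sketch does that check. It assumes Python's re module behaves like the QRadar regex engine for these simple patterns, and it fills the placeholders in the example log entries with made-up MAC and IP values.

```python
import re

# Hypothetical template: the MAC and IP values are invented for illustration.
TEMPLATE = (
    "May 20 17:24:59 kernel: {event} MAC=00:11:22:33:44:55 SRC=192.0.2.10 "
    "DST=198.51.100.7 LEN=351 TOS=0x00 PREC=0x00 TTL=64 ID={ident} "
    "PROTO={proto} SPT={spt} DPT={dpt} LEN=331"
)
entries = [
    TEMPLATE.format(event="DROP", ident=9582, proto="UDP", spt=67, dpt=68),
    TEMPLATE.format(event="PASS", ident=9583, proto="TCP", spt=1057, dpt=80),
    TEMPLATE.format(event="REJECT", ident=9584, proto="TCP", spt=25212, dpt=6881),
]

# The regular expressions for each field, without capture groups.
patterns = {
    "EventName": r"\skernel\:\s.*?\s",
    "SourceMAC": r"\sMAC\=(?:[0-9a-fA-F]{2}\:){5}[0-9a-fA-F]{2}\s",
    "SourceIp": r"\sSRC\=\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}\s",
    "DestinationIp": r"\sDST\=\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}\s",
    "Protocol": r"\sPROTO\=(TCP|UDP|ICMP|GRE)\s",
    "SourcePort": r"\sSPT\=\d{1,5}\s",
    "DestinationPort": r"\sDPT\=\d{1,5}\s",
}

# For each entry, collect the fields whose pattern found no match.
unmatched = [
    [name for name, rx in patterns.items() if re.search(rx, entry) is None]
    for entry in entries
]
for fields in unmatched:
    print(fields if fields else "all fields matched")
```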

  6. Specify capture groups.

    A capture group isolates a certain value in the regular expression.

    For example, you can't pass the entire match of the SourcePort pattern in the previous table because it includes the spaces and the SPT= prefix. Instead, you specify only the port number by using a capture group. The value in the capture group is what is passed to the relevant field in IBM® QRadar®.

    Insert parentheses around the values that you want to capture:

    Table 2. Mapping regular expressions to capture groups for event fields

    Field            Regular expression                              Capture group
    EventName        \skernel\:\s.*?\s                               \skernel\:\s(.*?)\s
    SourceMAC        \sMAC\=(?:[0-9a-fA-F]{2}\:){5}[0-9a-fA-F]{2}\s  \sMAC\=((?:[0-9a-fA-F]{2}\:){5}[0-9a-fA-F]{2})\s
    SourceIp         \sSRC\=\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}\s     \sSRC\=(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})\s
    DestinationIp    \sDST\=\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}\s     \sDST\=(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})\s
    Protocol         \sPROTO\=(TCP|UDP|ICMP|GRE)\s                   \sPROTO\=((TCP|UDP|ICMP|GRE))\s
    SourcePort       \sSPT\=\d{1,5}\s                                \sSPT\=(\d{1,5})\s
    DestinationPort  \sDPT\=\d{1,5}\s                                \sDPT\=(\d{1,5})\s
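As a rough check of the capture groups, the following Python sketch reads group 1 of each pattern from one sample entry. It is an illustration rather than a description of how QRadar parses events: Python's re module stands in for the QRadar regex engine, and the MAC and IP values in the log line are made up.

```python
import re

# Hypothetical log line: the MAC and IP values are invented for illustration.
line = (
    "May 20 17:24:59 kernel: DROP MAC=00:11:22:33:44:55 SRC=192.0.2.10 "
    "DST=198.51.100.7 LEN=351 TOS=0x00 PREC=0x00 TTL=64 ID=9582 "
    "PROTO=UDP SPT=67 DPT=68 LEN=331"
)

# The patterns with capture groups; group 1 holds the value to pass on.
capture_patterns = {
    "EventName": r"\skernel\:\s(.*?)\s",
    "SourceMAC": r"\sMAC\=((?:[0-9a-fA-F]{2}\:){5}[0-9a-fA-F]{2})\s",
    "SourceIp": r"\sSRC\=(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})\s",
    "DestinationIp": r"\sDST\=(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})\s",
    "Protocol": r"\sPROTO\=((TCP|UDP|ICMP|GRE))\s",
    "SourcePort": r"\sSPT\=(\d{1,5})\s",
    "DestinationPort": r"\sDPT\=(\d{1,5})\s",
}

fields = {}
for name, rx in capture_patterns.items():
    match = re.search(rx, line)
    # Keep only capture group 1; the surrounding spaces and prefix are dropped.
    fields[name] = match.group(1) if match else None

for name, value in fields.items():
    print(f"{name}: {value}")
```

Each value comes back without the leading space, the field prefix, or the trailing space, for example 67 for SourcePort instead of " SPT=67 ".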

  7. Migrate the patterns and capture groups into the log source extensions document.

    The following code snippet shows part of the document that you use.

    
    <device-extension xmlns="event_parsing/device_extension"> 
    <pattern id="EventNameFWSM_Pattern" xmlns=""><![CDATA[%FWSM[a-zA-Z\-]*\d-(\d{1,6})]]></pattern>
    <pattern id="SourceIp_Pattern" xmlns=""><![CDATA[gaddr (\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})/([\d]{1,5})]]></pattern> 
    <pattern id="SourceIpPreNAT_Pattern" xmlns=""><![CDATA[gaddr (\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})/([\d]{1,5})]]></pattern>
    <pattern id="SourceIpPostNAT_Pattern" xmlns=""><![CDATA[laddr (\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})/([\d]{1,5})]]></pattern>
    <pattern id="DestinationIp_Pattern" xmlns=""><![CDATA[faddr (\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})/([\d]{1,5})]]></pattern>
    <pattern id="Protocol_Pattern" case-insensitive="true" xmlns=""><![CDATA[(TCP|UDP|ICMP|GRE)]]></pattern>
    <pattern id="Protocol_6_Pattern" case-insensitive="true" xmlns=""><![CDATA[protocol=6]]></pattern> 
    <pattern id="EventNameId_Pattern" xmlns=""><![CDATA[(\d{1,6})]]></pattern>