Use InfoSphere Guardium Universal Feed to create a customized data activity monitoring solution, Part 2: Creating a feed for a user-defined data source

New databases and applications are continually being created and adopted to meet specific organizational needs. Data protection and auditing capabilities are required by mandate and are more critical than ever. The InfoSphere® Guardium® data protection solution is extensible to enable the integration of a variety of new databases and sources into its platform, providing a consistent enterprise-wide monitoring solution. In this series, you learn how to integrate event logs from any software into InfoSphere Guardium using Guardium Universal Feed. Part 1 describes how to create a feed for a relational data source. In this article, you learn how to create a feed for any arbitrary data source, how to upload event descriptions into the Guardium repository, and how to use the reporting capability for event data. This article includes a sample event description file and a sample program.


Indrani Ghatare (indrani@us.ibm.com), Software Engineer, InfoSphere Guardium, IBM

Indrani Ghatare has been a software developer at IBM for more than 12 years. Currently, Indrani is a member of the Research and Development team for InfoSphere Guardium Collector. Indrani has worked on the development of the MongoDB parser and the logger component of Guardium Collector.



Joe DiPietro (Joe_DiPietro@us.ibm.com), IBM InfoSphere Data Governance Center of Excellence Leader, IBM

Joe DiPietro is the IBM InfoSphere Data Governance Center of Excellence Leader. Joe has 25+ years of experience in security and network design and implementation. Prior to IBM and Guardium, he worked at security pioneer Check Point Software for 8+ years. Previously, DiPietro was corporate systems engineer for SynOptics Communications and a member of the company's World Wide Technical Counsel (WWTC). Joe holds a Master's degree in Computer Science, a Master of Arts, and a Bachelor's degree in Mechanical Engineering.



Ury Segal (usegal@ca.ibm.com), Senior Technical Staff Member, InfoSphere Guardium, IBM

Ury Segal has been an IBM senior technical staff member since Guardium was acquired by IBM in 2009. He has been with Guardium since 2003. Currently, Ury focuses on universal event collection and other advanced Guardium capabilities.



Kathryn Zeidenstein (krzeide@us.ibm.com), InfoSphere Guardium Evangelist, IBM

Kathy Zeidenstein has worked at IBM for a bazillion years. Currently, she is working as a technology evangelist for InfoSphere Guardium data activity monitoring, based out of the Silicon Valley Lab. Previously, she was an Information Development Manager for InfoSphere Optim data lifecycle tools. She has had roles in technical enablement, product management and product marketing within the Information Management and ECM organizations at IBM.



08 November 2012


Introduction

InfoSphere Guardium provides a comprehensive data activity monitoring and protection solution, and includes support for a wide range of databases, file shares, and other systems, such as Hadoop and Microsoft® Sharepoint. In most cases, the solution relies on lightweight software probes (S-TAPs) to monitor transactions, including those of privileged users. The monitored activity is sent to the InfoSphere Guardium appliance and stored in its internal database. The information can be used for audit reporting, real-time alerts, and much more.

The key for separation of duties is for security and event logs to be stored externally from the originating database or application system. With InfoSphere Guardium, event data is sent to a secure appliance, known as a collector, and stored in an internal database there for reporting and alerting. With the Universal Feed capability included with InfoSphere Guardium, you can integrate auditable event data into your current Data Activity Monitoring (DAM) environment by sending it to the collector and storing it there.

The Universal Feed has a couple of options for supporting different sources of activity monitoring:

  • The first option, described in Part 1, is targeted for activity that can easily integrate into the existing internal InfoSphere Guardium tables. You may see this referred to as a Type 1 feed. This would typically mean some kind of database source, since InfoSphere Guardium specializes in support for database activity. For more information about the entities and attributes within the InfoSphere Guardium system, see the product help book Appendix on that subject.
  • The other option, described here, enables you to integrate any arbitrary data source activity, by enabling you to create your own table structure in the Guardium database for storing feed messages. You may see this referred to as a Type 3 feed. The sample code in this article is a Universal Feed agent for an SSH log.

There are a number of benefits of using the Universal Feed to store the audit data off of the actual device that is monitored:

  • Audit and log information cannot be erased to cover nefarious breaches to the device.
  • Separation of duties can be maintained to ensure that correct audit information is captured.
  • Privileged users cannot access the audit logs, so they cannot tamper with or alter this information.

The Universal Feed agent, running on the host, will send information to the InfoSphere Guardium appliance, as shown in Figure 1. This article and the included sample code should help you develop your own agent.

Figure 1. Universal Feed overview
Image shows agent developer builds agent, which sends messages to Guardium collector

In this article, you will:

  • Learn how to develop a Universal Feed agent using the InfoSphere Guardium message protocol and the dynamic event description file format in protocol buffers. This article is written for Java™ programmers whose skills and experience are at a beginning to intermediate level. You need to have some familiarity with InfoSphere Guardium and Google's Protocol Buffers.
  • Learn how to configure the InfoSphere Guardium collector to create a table structure by uploading your message protocol buffer using its web UI.
  • Learn how to create custom reports on the InfoSphere Guardium UI to view your audit data.

This article contains step-by-step instructions to develop and configure your agent. To run the sample agents, you can download the universal_3.zip file (see Downloads). The README file includes directions for compiling and running the code. The sample program is developed and tested on a Linux® environment. You may need to modify some scripts if you run a different OS.


Agent requirements

For testing your code and creating reports, you need an InfoSphere Guardium appliance configured as a collector, at Version 9.0 or higher.

The Universal Feed agent you create runs on the system where your audit data resides. Your agent connects to port 16016 (the default InfoSphere Guardium collector port on UNIX® systems) and can persist the connection to send a large number of messages. Any firewall between the agent system and the collector must allow traffic on this port so that the agent can connect to the collector.
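
Before you run an agent, it can be useful to confirm that the collector port is reachable from the agent system. The following is a minimal sketch of such a check using only the standard Java socket API. The class name, method name, and the localhost default are hypothetical and are not part of the sample code.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

/** Hypothetical helper: checks whether a TCP connection to the collector can be opened. */
final class CollectorPortCheck {
    /** Default collector port named in this article. */
    static final int DEFAULT_PORT = 16016;

    /** Returns true if a TCP connection to host:port opens within timeoutMillis. */
    static boolean canConnect(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            // Refused, timed out, or unresolvable host: all are treated as "not reachable".
            return false;
        }
    }

    public static void main(String[] args) {
        // "localhost" is only a placeholder; pass your collector's host name or IP.
        String host = args.length > 0 ? args[0] : "localhost";
        System.out.println(host + ":" + DEFAULT_PORT + " reachable: "
                + canConnect(host, DEFAULT_PORT, 3000));
    }
}
```

A false result does not say why the connection failed, so treat it as a prompt to check firewall rules and the collector address rather than as a diagnosis.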

Your agent will need to use Handshake, Ping, and Event messages. Event messages are how your agent sends the actual monitoring data and have the following format:

  • A 60-byte header plus data of length as specified in the header.
  • Message type must be g. The lowercase g distinguishes this from a Type 1 (database-type) feed message. Note that Handshake and Ping messages use uppercase G, described below.
  • Vendor ID must be > 10000. (Handshake and Ping messages use a vendor ID of 0, as described below.)
  • Data must be a serialized protobuf message.
  • The protobuf message can be any message named Event, which can contain anything that you want.
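
The message type and vendor ID rules above can be captured in a small check that an agent runs before it sends a message. This is a minimal sketch; the class, constant, and method names are hypothetical and do not come from the sample code.

```java
/** Hypothetical helper: validates the message type and vendor ID combination. */
final class MessageRules {
    /** Lowercase g marks an Event message for a user-defined (Type 3) feed. */
    static final byte EVENT_TYPE = 'g';
    /** Uppercase G marks Handshake and Ping messages. */
    static final byte CONTROL_TYPE = 'G';

    /** Returns true when the pair follows the rules stated in this article. */
    static boolean isValid(byte msgType, int vendorId) {
        if (msgType == EVENT_TYPE) {
            return vendorId > 10000;   // Event messages need a vendor ID above 10000
        }
        if (msgType == CONTROL_TYPE) {
            return vendorId == 0;      // Handshake and Ping use vendor ID 0
        }
        return false;                  // Any other type is outside this article's scope
    }

    public static void main(String[] args) {
        System.out.println(isValid(EVENT_TYPE, 10002));   // true
        System.out.println(isValid(EVENT_TYPE, 10000));   // false
        System.out.println(isValid(CONTROL_TYPE, 0));     // true
    }
}
```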

The code snippet from the sample program in Listing 1 shows the message protocol header and data.

Listing 1. Message protocol header and data
public ByteBuffer toByteBuffer() {
    byte[]  bytes = message.toByteArray();
    ByteBuffer bb = ByteBuffer.allocate(Pos.Body.pos +  bytes.length);
                
                                        // Offset (bytes)
    bb.put(msgType);                    // 0  Must be 'g'
    bb.put((byte) 0x00);                // 1  padding, must be 0
    bb.putShort((short)  bytes.length); // 2-3 dataLen
    bb.putInt(0x01000000);              // 4-7 mark
    bb.putInt(getUnixTime(date));       // 8-11 time stamp in seconds
    bb.putInt(0x00000007);              // 12-15 protocol_version
    bb.putInt(vendor);                  // 16-19 vendor must be 
                                        // greater than 10000
    bb.put(new byte[40]);               // 20-59 legacy
    bb.put(bytes);                      // Put the serialized 
                                        // protobuf message here
    bb.rewind();
                
    return bb;
}
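
When you debug an agent, for example with tcpdump, it can help to decode a captured header back into its fields. The sketch below rebuilds a header with the offsets from Listing 1 and reads the fields back. The class and method names are hypothetical, a plain byte array stands in for the serialized protobuf message, and reading dataLen as an unsigned 16-bit value is an assumption rather than documented collector behavior.

```java
import java.nio.ByteBuffer;

/** Hypothetical helper that builds and decodes the 60-byte header from Listing 1. */
final class HeaderSketch {
    static final int HEADER_LEN = 60;

    /** Builds header plus payload. ByteBuffer writes multi-byte values big-endian by default. */
    static ByteBuffer build(byte msgType, int vendor, int unixTime, byte[] payload) {
        if (payload.length > 0xFFFF) {
            // dataLen is a 2-byte field, so a longer payload cannot be described by it.
            throw new IllegalArgumentException("payload too long: " + payload.length);
        }
        ByteBuffer bb = ByteBuffer.allocate(HEADER_LEN + payload.length);
        bb.put(msgType);                      // 0     message type
        bb.put((byte) 0x00);                  // 1     padding
        bb.putShort((short) payload.length);  // 2-3   dataLen
        bb.putInt(0x01000000);                // 4-7   mark
        bb.putInt(unixTime);                  // 8-11  time stamp in seconds
        bb.putInt(0x00000007);                // 12-15 protocol_version
        bb.putInt(vendor);                    // 16-19 vendor
        bb.put(new byte[40]);                 // 20-59 legacy
        bb.put(payload);                      // 60+   message body
        bb.rewind();
        return bb;
    }

    /** Reads dataLen with an absolute get; assumes the field is unsigned. */
    static int dataLen(ByteBuffer bb) {
        return bb.getShort(2) & 0xFFFF;
    }

    /** Formats the header fields for logging; does not move the buffer position. */
    static String describe(ByteBuffer bb) {
        return "type=" + (char) bb.get(0)
                + " dataLen=" + dataLen(bb)
                + " time=" + bb.getInt(8)
                + " version=" + bb.getInt(12)
                + " vendor=" + bb.getInt(16);
    }

    public static void main(String[] args) {
        int now = (int) (System.currentTimeMillis() / 1000L);
        ByteBuffer bb = build((byte) 'g', 10002, now, new byte[] {1, 2, 3});
        System.out.println(describe(bb));
    }
}
```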

The Handshake message:

  • Must have a 60-byte header plus data of length specified in the header.
  • Must use a message type of G (uppercase G) and a vendor ID of 0.
  • Allows for the collector to register the name of the Universal Feed agent.
  • Turns the agent "green" in the InfoSphere Guardium UI system view so that you know it is operational. To get to the status monitor as an admin, go to the System View tab and select S-TAP Status Monitor, as shown in Figure 2:
    Figure 2. Row in System View report showing the operational Universal Feed agent
    The System View report shows green and the name of the agent, timestamp, etc.

The InfoSphere Guardium Ping message helps keep the connection persistent between the agent and the collector. The Ping message:

  • Must have a 60-byte header plus data of length as specified in the header.
  • Must use a message type of G (uppercase G) and Vendor ID of 0.
  • Must be sent every 30-60 seconds.
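
One way to meet the Ping interval is to schedule the send on a background thread. The following is a minimal sketch using the standard ScheduledExecutorService. The class name is hypothetical, and the sendPing argument stands in for whatever code in your agent builds and writes the Ping message.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;

/** Hypothetical helper: runs a ping task at a fixed interval on a daemon thread. */
final class PingScheduler implements AutoCloseable {
    private final ScheduledExecutorService executor =
            Executors.newSingleThreadScheduledExecutor(r -> {
                Thread t = new Thread(r, "guardium-ping");
                t.setDaemon(true);   // do not keep the JVM alive just for pings
                return t;
            });

    /** Runs sendPing immediately and then once per interval. */
    ScheduledFuture<?> start(Runnable sendPing, long interval, TimeUnit unit) {
        return executor.scheduleAtFixedRate(() -> {
            try {
                sendPing.run();
            } catch (RuntimeException e) {
                // An uncaught exception would cancel all later runs, so report and continue.
                System.err.println("Ping failed: " + e.getMessage());
            }
        }, 0, interval, unit);
    }

    @Override
    public void close() {
        executor.shutdownNow();
    }

    public static void main(String[] args) throws InterruptedException {
        try (PingScheduler scheduler = new PingScheduler()) {
            // The lambda is a placeholder for the real Ping send.
            scheduler.start(() -> System.out.println("ping"), 30, TimeUnit.SECONDS);
            Thread.sleep(1000);
        }
    }
}
```

A 30-second interval sits at the low end of the 30-60 second range, which leaves some margin if a send is delayed.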

NOTE: Your agent must read everything the InfoSphere Guardium appliance sends to the agent. For example, after the handshake message, the appliance will send the current audit policy on the appliance to the Universal Feed agent. The agent can then ignore those messages or optionally (for more advanced agents) process this information to identify relevant details on how to configure the Universal Feed agent's behavior.
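
For a simple agent that ignores what the appliance sends, a background thread can read and discard the incoming bytes. The sketch below shows one way to do that with a plain InputStream, such as the one returned by a socket's getInputStream(). The class and method names are hypothetical and are not taken from the sample code.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

/** Hypothetical helper: reads and discards everything the appliance sends. */
final class ResponseDrainer {
    /** Reads until end of stream and returns the number of bytes discarded. */
    static long drain(InputStream in) throws IOException {
        byte[] buf = new byte[4096];
        long total = 0;
        int n;
        while ((n = in.read(buf)) != -1) {
            total += n;
        }
        return total;
    }

    /** Starts a daemon thread that drains the stream in the background. */
    static Thread startDraining(InputStream in) {
        Thread t = new Thread(() -> {
            try {
                drain(in);
            } catch (IOException e) {
                // Socket closed or connection lost; there is nothing more to read.
            }
        }, "guardium-drain");
        t.setDaemon(true);
        t.start();
        return t;
    }

    public static void main(String[] args) throws IOException {
        // A byte array stands in for the socket input stream in this demonstration.
        System.out.println(drain(new ByteArrayInputStream(new byte[10000])));
    }
}
```

A more advanced agent would parse these bytes instead of discarding them, as the note above describes.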

Figure 3 is a diagram of the message flow.

Figure 3. Universal Feed message processing overview
Image shows flow of TCP socket open, to appliance, Handshake to appliance, Ping message to appliance, and Event message to appliance

Step 1: Identify your audit information for event notification

Identify what you want to audit. The audit information can be anything in your system.

This article's sample SSHD log agent sends events from SSHD log /var/log/secure on a Linux system. The number of events you want to send is configurable; you can send events from the last n entries from /var/log/secure.
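
Selecting the last n entries of a log file can be done with a sliding window over the lines. This is a minimal sketch and not the sample agent's implementation; the class and method names are hypothetical, and the ISO-8859-1 charset is an assumption chosen so that unexpected bytes in the log do not cause a decoding error.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

/** Hypothetical helper: returns the last n lines of a log file. */
final class LogTail {
    static List<String> lastLines(Path log, int n) throws IOException {
        Deque<String> window = new ArrayDeque<>();
        if (n > 0) {
            try (BufferedReader reader =
                         Files.newBufferedReader(log, StandardCharsets.ISO_8859_1)) {
                String line;
                while ((line = reader.readLine()) != null) {
                    if (window.size() == n) {
                        window.removeFirst();   // drop the oldest line once the window is full
                    }
                    window.addLast(line);
                }
            }
        }
        return new ArrayList<>(window);
    }

    public static void main(String[] args) throws IOException {
        // Reading /var/log/secure usually requires elevated privileges.
        for (String line : lastLines(Paths.get("/var/log/secure"), 10)) {
            System.out.println(line);
        }
    }
}
```

This reads the whole file once, which is simple but may be slow for very large logs.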


Step 2: Prepare your Event message

Use Google Protocol Buffers for implementing your agent. Prepare your Event message in a .proto file following the requirements described here:

  • The .proto file includes a message with the name Event, which is a collection of possible messages, as shown here:
    Listing 2. Event message definition in .proto file
    message Event {
        optional message_A m1 = 1;
        optional message_B m2 = 2;
    }
    message message_A {
        optional string name = 1;
    }
    message message_B {
        optional string name = 1;
    }
  • The Event message fields must be defined as optional and not as required or repeated.
  • Only one interface can be defined per .proto file name or Vendor ID.
  • The message name and enumerator name are unique for each .proto file.
  • All messages related to an interface must be defined in one .proto file.
  • NOTE: Each .proto file must be in a separate Java package.

Listing 3 shows the .proto file defined for the SSHD log agent.

Listing 3. Sample .proto file for an SSHD log event
package com.guardium.proto.datasource.test.type3.events10002;

message Event {
    optional SSHDLogEvent sshde = 1;  // SSHD log message
}

message SSHDLogEvent {
    optional string timestamp = 1;    // Timestamp. Example: 07/30/2012 16:24
    optional string hostName = 2;     // Hostname of computer where event
                                      // occurred.
    optional string processName = 3;  // Process name. Example: sshd
    optional int32 processID = 4;     // Process ID. Example: 32573
    optional string messageFromSSHDaemon = 5;
                                      // Message from SSH daemon.
                                      // Example: Accepted password for root
                                      // from 9.30.150.211 port 56394 ssh2
}
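
To fill the SSHDLogEvent fields, an agent has to split each log line into its parts. The sketch below parses a typical syslog-style sshd line with a regular expression. It is not the sample agent's parser: the class name, the pattern, and the choice to keep the timestamp as the raw syslog string are assumptions, and lines that do not match the pattern (for example, "last message repeated" lines) are simply reported as not parsed.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical holder for the fields of one sshd log line. */
final class SshdLogLine {
    // Assumed layout: "Mon dd hh:mm:ss host process[pid]: message"
    private static final Pattern SYSLOG = Pattern.compile(
            "^(\\w{3}\\s+\\d{1,2}\\s\\d{2}:\\d{2}:\\d{2})\\s(\\S+)\\s"
            + "([^\\[\\s:]+)(?:\\[(\\d{1,9})\\])?:\\s(.*)$");

    final String timestamp;
    final String hostName;
    final String processName;
    final int processId;
    final String message;

    private SshdLogLine(String timestamp, String hostName, String processName,
                        int processId, String message) {
        this.timestamp = timestamp;
        this.hostName = hostName;
        this.processName = processName;
        this.processId = processId;
        this.message = message;
    }

    /** Returns the parsed line, or empty if the line does not match the assumed layout. */
    static Optional<SshdLogLine> parse(String line) {
        Matcher m = SYSLOG.matcher(line);
        if (!m.matches()) {
            return Optional.empty();
        }
        // The PID group is optional; use 0 when the line carries no PID.
        int pid = m.group(4) == null ? 0 : Integer.parseInt(m.group(4));
        return Optional.of(new SshdLogLine(m.group(1), m.group(2), m.group(3), pid, m.group(5)));
    }

    public static void main(String[] args) {
        String line = "Aug  2 23:05:17 indrani-stap sshd[8674]: "
                + "Failed password for root from 9.65.35.38 port 52327 ssh2";
        parse(line).ifPresent(e ->
                System.out.println(e.processName + " " + e.processId + " " + e.message));
    }
}
```

The parsed values would then be passed to the setters of the builder that protoc generates for SSHDLogEvent.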

Step 3: Compile your .proto file

Compile your .proto file with protoc, the Protocol Buffer compiler developed by Google. The sample agent program uses protoc to produce Java code; you can also produce C++ code if you prefer to implement your agent in C++. To produce Java code, run the following protoc command:

protoc -I=. --java_out=. protoFileName

For the SSHD log agent, we used the name events10002.proto to produce the Java code by running the command below. This will produce a Java file called Events10002.java in the directory based on the package name used in events10002.proto. For example, if the Java package name used in the .proto file is com.guardium.proto.datasource.test.type3.events10002, then the following command will create com/guardium/proto/datasource/test/type3/events10002/Events10002.java.

protoc -I=. --java_out=. events10002.proto

Step 4: Use the created Java file to develop your Universal Feed agent

You can use the sample code we provide to help you develop your own agent. The sample code includes generic infrastructure/utility code and specific code for the SSHD log agent.

Infrastructure and utility code

The generic code consists of the following classes:

  • src/com/guardium/proto/datasource/DatasourceMessageUtil.java— This class is used to assist in building Guardium messages to send to the appliance from the Universal Feed agent. You can use your other Java classes to call the methods in this class to easily build your Guardium messages.
  • src/com/guardium/proto/datasource/Socket.java— This class is used to open a TCP socket to the Guardium appliance from the Universal Feed agent.
  • src/com/guardium/proto/datasource/Wrapper.java— This class is used to "wrap" the data into the proper message format.

Agent application-specific code

The code specific to the SSHD log agent is in the following files:

  • src/com/guardium/proto/datasource/test/type3/protofiles/events10002.proto contains the message definitions.
  • src/com/guardium/proto/datasource/test/type3/events10002/Events10002.java is generated from events10002.proto.
  • src/com/guardium/proto/datasource/test/type3/AddEvent2.java contains the code that sends events for the SSHD log agent.
  • README includes information about how to set up the environment and run the log agent.
  • RunAddEvent2.sh is the script to run the SSHD log agent.
  • src/com/guardium/proto/datasource/test/type3/SendPingMessage.java can send a Guardium Ping message every 30 seconds. If you keep this program running, it keeps the NewUniversalFeedLogger entry green in the Guardium web console.

Step 5: Upload the .proto file into Guardium collector

Guardium stores event information from the Type 3 feed in an internal database called DIST_INT (distributed interface). To configure the DIST_INT database for your messages, you must upload the .proto file you defined previously. Follow these instructions to upload the .proto file into the database. Figure 4 shows the fields you need to enter.

  1. Log on to the InfoSphere Guardium Web Console as an admin.
  2. Go to Administration Console > Guardium Definitions > Distributed Interface > New.
  3. Enter the correct information for your agent as shown below:
    Figure 4. Guardium collector web interface details for uploading .proto file into DIST_INT database

    Vendor ID— This must be greater than 10000 and must be the same value that your agent application uses for Vendor ID in the message protocol when it sends events to the collector. For the sample SSHD log event, the value used is 10002. You can change the value by passing it through the command-line option of AddEvent2, but remember to use the same value when you upload the file through the web interface.

    Domain name— You can provide any name for this custom domain. This domain will be available for creating custom reports for your audit data. The domain contains all the entities for creating reports.

    File name— This is the name of the .proto file in which you defined your event message, the same file from which you generated the Java code used in the agent implementation.

    Database name— This is Guardium's internal database where your event information is stored. You must always enter DIST_INT.

  4. Click OK to save the file. After it has been processed successfully, click Apply.

Step 6: Restart inspection engines

An Inspection Engine in Guardium defines what traffic is to be monitored by Guardium and what traffic is to be ignored. To ensure that the inspection engines are aware of the new source (that is, the Universal Feed agent), you need to restart those inspection engines. You can do it from Guardium Web Console or using a Guardium command-line interface command.

From the Guardium Web Console, restart inspection engines by navigating to the Administration Console and clicking Inspection Engines and then Restart Inspection Engines, as shown in Figure 5.

Figure 5. Restart Inspection Engine from Web Console
Image shows restarting inspection engines from web console

Alternatively, you can log into the appliance as a CLI user and run the following CLI command: restart inspection-core.


Step 7: Run your agent

Now you are ready to run the agent application you developed to send events to the collector. When you send events to the collector, they are processed, and the event information is stored in the Guardium database (DIST_INT). That information is then accessible to be used in creating audit reports, alerts, and more.

To send events, run the following command:

java com/guardium/proto/datasource/test/type3/AddEvent2 -host 9.70.145.62 \
    -vendorID 10002 -log sampleSecureSSHDLog.txt

Alternatively:

./RunAddEvent2.sh

Make sure to replace the host IP with your Guardium collector's IP when you run the agent. Refer to the README file for the exact command to run for the SSHD agent.

A sample run of the SSHD log agent is shown in Listing 4, which shows a Handshake message, a Ping message, and sshde (SSHDLogEvent) messages. The event .proto file is shown in Listing 3.

Listing 4. Sample run output of the SSHD log agent
Processing protobuf file eventinfo
*** WRITE ***
Type1, vendor=0
type: HANDSHAKE
handshake {
  timestamp {
    unix_time: 1348939711
  }
  client_identifier: "NewUniversalFeedLogger"
  current_master: "NewUniversalFeedCollector"
  current_master_ip: 16843009
  product: "event"
  transient: false
}
                
*** WRITE ***
Type1, vendor=0
type: PING
ping {
  timestamp {
    unix_time: 1348939711
  }
  client_identifier: "NewUniversalFeedLogger"
  current_master: "NewUniversalFeedCollector"
  current_master_ip: 4027124234
}                
                
*** WRITE ***
Type3, vendor=10002
sshde {
timestamp: "Aug  2 23:04:55"
hostName: "indrani-stap"
processName: "sshd"
processID: 8668
messageFromSSHDaemon: "Invalid user guardium from 9.65.35.38 Aug 2 23:04:55 
                      indrani-stap sshd[8669]: input_userauth_request: 
                      invalid user guardium"
}
                
*** WRITE ***
Type3, vendor=10002
sshde {
  timestamp: "Aug  2 23:04:57"
  hostName: "indrani-stap"
  processName: "sshd"
  processID: 8668
  messageFromSSHDaemon: "Failed none for invalid user guardium from 
                        9.65.35.38 port 52326 ssh2"
}

*** WRITE ***
Type3, vendor=10002
sshde {
  timestamp: "Aug  2 23:04:58"
  hostName: "indrani-stap"
  processName: "sshd"
  processID: 8668
  messageFromSSHDaemon: "Failed password for invalid user guardium from 
                        9.65.35.38 port 52326 ssh2"
}

*** WRITE ***
Type3, vendor=10002
sshde {
  timestamp: "Aug  2 23:05:00"
  hostName: "indrani-stap"
  processName: "last"
  processID: 0
  messageFromSSHDaemon: "message repeated 4 times Aug 2 23:05:00 indrani-stap 
                        sshd[8669]: Disconnecting: Too many authentication 
                        failures for guardium"
}

*** WRITE ***
Type3, vendor=10002
sshde {
  timestamp: "Aug  2 23:05:17"
  hostName: "indrani-stap"
  processName: "sshd"
  processID: 8674
  messageFromSSHDaemon: "Failed password for root from 9.65.35.38 port 52327 ssh2"
}

*** WRITE ***
Type3, vendor=10002
sshde {
  timestamp: "Aug  2 23:05:19"
  hostName: "indrani-stap"
  processName: "sshd"
  processID: 8674
  messageFromSSHDaemon: "pam_unix(sshd:auth): authentication failure; 
                        logname= uid=0 euid=0 tty=ssh 
                        ruser= rhost=sig-9-65-35-38.mts.ibm.com user=root"
}

*** WRITE ***
Type3, vendor=10002
sshde {
  timestamp: "Aug  2 23:05:21"
  hostName: "indrani-stap"
  processName: "sshd"
  processID: 8674
  messageFromSSHDaemon: "Failed password for root from 9.65.35.38 port 52327 ssh2"
}

*** WRITE ***
Type3, vendor=10002
sshde {
  timestamp: "Aug  2 23:05:32"
  hostName: "indrani-stap"
  processName: "last"
  processID: 0
  messageFromSSHDaemon: "message repeated 3 times"
}

*** WRITE ***
Type3, vendor=10002
sshde {
  timestamp: "Aug  2 23:05:32"
  hostName: "indrani-stap"
  processName: "sshd"
  processID: 8675
  messageFromSSHDaemon: "Disconnecting: Too many authentication failures for root"
}

*** WRITE ***
Type3, vendor=10002
sshde {
  timestamp: "Aug  2 23:05:32"
  hostName: "indrani-stap"
  processName: "sshd"
  processID: 8674
  messageFromSSHDaemon: "PAM 3 more authentication failures; 
                        logname= uid=0 euid=0 tty=ssh 
                        ruser= rhost=sig-9-65-35-38.mts.ibm.com user=root"
}

*** WRITE ***
Type3, vendor=10002
sshde {
  timestamp: "Aug  2 23:05:32"
  hostName: "indrani-stap"
  processName: "sshd"
  processID: 8674
  messageFromSSHDaemon: "PAM service(sshd) ignoring max retries; 4 > 3"
}

*** WRITE ***
Type3, vendor=10002
sshde {
  timestamp: "Aug  2 23:05:52"
  hostName: "indrani-stap"
  processName: "sshd"
  processID: 8680
  messageFromSSHDaemon: "Accepted password for root from 9.65.35.38 port 52328 ssh2"
}

*** WRITE ***
Type3, vendor=10002
sshde {
  timestamp: "Aug  2 23:05:52"
  hostName: "indrani-stap"
  processName: "sshd"
  processID: 8680
  messageFromSSHDaemon: "pam_unix(sshd:session): session opened for 
                        user root by (uid=0)"
}

Step 8: Create and view audit reports

You can create custom reports to view the audit information being sent to the collector. Report building in Guardium is based on domains and entities. Domains are views of a certain set of information, and entities are fields within that domain. Because you are not using the regular Guardium tables to store your event data, your report will be created based on the custom domain you created in a previous step. The entities are the fields from your event message. Here are the steps:

  1. Go to Tools > Custom Query Builder > Domain Finder and select your custom domain, as shown in Figure 6.
    Figure 6. Domain selection for creating report
    Image shows SSHD log event domain is selected
  2. In the Custom Query Builder, select Query Finder > Main Entity and click New, as shown in Figure 7.
    Figure 7. Custom Query Builder — Query Finder
    Image shows Query Finder
  3. On the New Query — Overall Details screen, enter information for Query Name and Main Entity. For the sample log event, you can see in Figure 8 we've selected the main entity as DI_10002_SSHDLogEvent and given it a query name of sshd log. Guardium generates the entities based on the message definition you have in the .proto file. In the sample, the events10002.proto file has a message called SSHDLogEvent; in Guardium, that is given an entity name of DI_10002_SSHDLogEvent. Click Next.
    Figure 8. Custom Query Builder — New query overall details
    Image shows new query details
  4. From the Custom Query Builder, create a custom report as shown in Figure 9 by dragging entity fields into the Query Fields pane and clicking Save.
    Figure 9. Custom Query Builder — Create Report
    Image shows Create Report
  5. Finally, to add this report to a pane, click Add to Pane and select Guardium Monitor, or whatever other pane you want to add it to.

Note that in Figure 9 and Figure 10, you can see the metadata of the entities that were generated in the DIST_INT database from the uploaded events10002.proto file shown in Listing 3.

Figure 10. Generated metadata
Image shows each field in the message shown in the report builder interface as entities

To view the report, go to the Guardium Monitor tab and click on the report you created. Figure 11 shows a report created for SSHD log event.

Figure 11. SSHD log report
The report shows sshd logevent ID, two timestamp columns, log ID, host name, process name, process ID, and messagefrom SSHDaemon



Troubleshooting

You can follow these tips for troubleshooting in case you can't view the data in the reports:

  • Make sure any firewall ports between the agent and collector systems are opened.
  • Make sure you have restarted the inspection engines after uploading the .proto file through Administration Console > Guardium Definitions > Distributed Interface.
  • Make sure you have uploaded the same .proto file that you used to generate the Java code for the agent.
  • Make sure you have Start Date (QUERY_FROM_DATE) and End Date (QUERY_TO_DATE) set to the right value in the report you created.
  • Debug using tcpdump.
  • Debug using the slon utility. (See the InfoSphere Guardium help book for more information about this diagnostic utility.)

Summary

This article and sample code should help you address the challenges of meeting audit and compliance requirements over an ever-increasing range of data events. The InfoSphere Guardium data protection and compliance solution is extensible to enable the integration of a variety of new databases and sources into its platform, thereby providing a consistent enterprise-wide monitoring solution.


Download

Description                    Name               Size
Sample Universal Feed agent    universal_3.zip    184KB
