Analyze the analysts with IBM i2 Analyze

Use i2 Analyze auditing for an analysis-ready view of audit information

When the data they hold might be sensitive, companies often find it necessary to keep an audit trail of who is accessing that data and what they are doing with it. Because companies and senior staff can be held legally accountable for fraudulent activity by their employees, the need for an audit trail is more pressing than ever.

The latest release of IBM i2® Analyze features an auditing mechanism for user interactions with the IBM i2 Analyze Information Store. With this mechanism, you can log user activities such as searches and expand operations. This tutorial provides an overview of the Information Store logging framework and describes how you can use it to provide an analyst-ready picture of the audit log data.

What you'll need

  • IBM i2 Analyze
  • IBM i2 Analyst's Notebook
  • DB2®
  • You will also need to be familiar with:
    • ELP (entities, links, and properties) data model
    • Java™
1. Overview of auditing user activity in the Information Store

The Information Store does not store audit logs automatically. To enable logging, you need to write a class that implements the IAuditLogger interface to capture events and store them. This approach gives you the flexibility to store audit logging information in any format and location.

Audit logging is performed by providing implementations of the methods available in the IAuditLogger interface, such as logQuickSearch and logExpand. You can use various methods on the IQuickSearchAuditEvent class, such as getUser() and getClientIPAddress(), to obtain details about a given audit event.
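
For orientation, here is a skeleton of such a class. This is only a sketch: the expand event type is shown here as IExpandAuditEvent, and IAuditLogger declares more methods than the two shown, all of which a real implementation (such as the one in the sample jar) must provide.

	public final class DVWorksAuditLoggerSample implements IAuditLogger
	{
		@Override
		public void logQuickSearch(final IQuickSearchAuditEvent event)
		{
			// Record the details of a Quick Search operation (see below)
		}

		@Override
		public void logExpand(final IExpandAuditEvent event)
		{
			// Record the details of an expand operation in the same way
		}

		// ... implementations of the remaining IAuditLogger methods ...
	}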

With i2, you can see the most active IP addresses, the most active users, and the most common search terms. This provides a direct insight into who is doing what with the Information Store data and where the operations are coming from.

In the following example, the auditing information about a search operation appears in the console.log file of the Liberty server where the i2 Analyze application is deployed.

	@Override
	public void logQuickSearch(final IQuickSearchAuditEvent event)
	{
		// Build a single comma-separated line that describes the event
		final String logDetail =
			"User:" + event.getUser() +
			",IP:" + event.getClientIPAddress() +
			",Method:" + event.getServiceMethodName() +
			",Expression:" + event.getExpression() +
			",Timestamp:" + event.getTimestamp();
		System.out.println(logDetail);
	}

When a user performs a Quick Search on the Information Store, the audit log information is generated, and log entries similar to those shown below appear in the console.log file.

User:Jenny,IP:213.54.35.213,Method:search,Expression:Brian Venn,Timestamp:2016-08-05T14:14:15.379Z
User:Jenny,IP:213.54.35.213,Method:search,Expression:Louise Anderson,Timestamp:2016-08-05T14:15:15.430Z
User:Jenny,IP:213.54.35.213,Method:search,Expression:Ethan Anderson,Timestamp:2016-08-05T14:18:17.382Z
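
The logExpand method can follow the same pattern. Here is a minimal sketch, assuming that the expand event (shown as IExpandAuditEvent) exposes the same base accessors as the quick search event; an expand operation has no search expression to record:

	@Override
	public void logExpand(final IExpandAuditEvent event)
	{
		// Same pattern as logQuickSearch; only the event type differs
		final String logDetail =
			"User:" + event.getUser() +
			",IP:" + event.getClientIPAddress() +
			",Method:" + event.getServiceMethodName() +
			",Timestamp:" + event.getTimestamp();
		System.out.println(logDetail);
	}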

The developer determines where the audit logging information goes and what information is logged. For example, you can specify that the information is logged to a file, a database, or a WebSphere® MQ queue. For samples of different logging mechanisms, see the logging examples in the IBM i2 Analyze Developer Essentials (under "Related Topics" below).
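
For instance, here is a minimal sketch of a file-based destination that appends each formatted audit line to a flat file. The file path and the error handling are illustrative choices, not taken from the sample code.

	// Requires: java.io.IOException, java.nio.charset.StandardCharsets,
	// java.nio.file.{Files, Path, Paths, StandardOpenOption}

	private static final Path AUDIT_FILE = Paths.get("C:/i2Audit/audit.log");

	private void appendToAuditFile(final String logDetail)
	{
		try
		{
			Files.write(AUDIT_FILE,
				(logDetail + System.lineSeparator()).getBytes(StandardCharsets.UTF_8),
				StandardOpenOption.CREATE, StandardOpenOption.APPEND);
		}
		catch (IOException e)
		{
			// A production logger would apply its own error-handling policy here
			e.printStackTrace();
		}
	}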

2. Form an ELP data model from the audit logs

Logging audit data to files or databases is useful, but spotting patterns or potentially fraudulent activity within that data can be difficult, especially when the logs contain hundreds or even thousands of entries.

A more useful approach is to turn the audit information into an ELP model that can be ingested into another Information Store, where it can be analyzed. (You use a separate Information Store because different people, such as supervisors or auditors, will work with the audit data.)

In the paragraphs that follow, I provide a walkthrough and a sample of an Information Store audit logger, and I explain how you can implement the logger to generate information for analysis in IBM i2 Analyst's Notebook.

First, I develop a new schema using the Analysis Repository Schema Designer. The schema contains five entity types and a single link type, as shown below.

Chart showing schema

For a copy of this sample schema, you can download the AuditSample.zip file. (See "Get the code" above.)

3. Create the database connection

As mentioned earlier, what the audit logger does with the audit information is up to the developer. Here, I take the audit log events and use them to populate staging tables that I will create later. The staging tables hold the data before it is ingested into the Information Store.

To begin, I define the connection to the database where the staging tables are in the Liberty server configuration files, as seen in the following steps:

  1. In a text editor, open the server.datasources.xml file found in the <LibertyHome>/servers/awc folder of the i2 Analyze deployment.
  2. Within the <server> element, add the following lines by changing the database connection details as required for the machine where the staging tables will be located:
    <dataSource jdbcDriverRef="db2_db2jcc4_jar" id="logger" jndiName="ds/Logger">
        <properties.db2.jcc password="XXX" databaseName="ISTORE" serverName="YYY" user="ZZZ" portNumber="50000"/>
    </dataSource>
  3. Save and close the file.
  4. In order to use this connection in the logger, add the following piece of code to the logger implementation:
                // Look up the data source by its JNDI name and open a connection.
                // (Requires javax.naming.InitialContext and javax.sql.DataSource.)
                InitialContext ctx = new InitialContext();
                DataSource ds = (DataSource) ctx.lookup("ds/Logger");
                con = ds.getConnection();  // con is a java.sql.Connection field on the logger

    This will use the database connection details specified in server.datasources.xml. See the code in the DVWorksAuditLoggerSample.jar file, contained in AuditSample.zip.

Now that the database connection is configured, you can write audit log records to the staging tables.
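
In the sample, the connection is held on the logger object. An alternative sketch is to open a pooled connection per audit event with try-with-resources, so that it is always returned to Liberty's connection pool (the writeAuditRows helper name is illustrative):

	// Requires: javax.naming.{InitialContext, NamingException},
	// javax.sql.DataSource, java.sql.{Connection, SQLException}

	private void writeAuditRows(final IQuickSearchAuditEvent event)
		throws NamingException, SQLException
	{
		DataSource ds = (DataSource) new InitialContext().lookup("ds/Logger");
		try (Connection con = ds.getConnection())
		{
			// ... populate the staging tables, as shown in the next steps ...
		}
	}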

4. Create the staging tables

You use staging tables to hold the data before it is ingested into the Information Store, where it becomes analyst ready. You run the createInformationStoreStagingTable command from the i2 Analyze toolkit folder, located in <i2InstallationDir>\toolkit\scripts.

Here is an example of the command:

call setup -t createInformationStoreStagingTable -p schemaTypeId=ET2 -p databaseSchemaName=AUDIT -p tableName=SEARCH

This command creates a staging table called AUDIT.SEARCH. This table will form one of the targets of the audit logger. When a search audit event is logged, an entry is placed in this table.

The AuditSample.zip file contains a script you can use to generate a set of staging tables associated with the supplied schema for this tutorial.

5. Log to the staging tables

Audit log events are generated when users perform search or expand operations against the Information Store. In this tutorial, we want to log this data in an ELP model.

When a search or expand operation is performed, the following needs to be created in the ELP model:

  • User entity: The user that performed the operation
  • IPAddress entity: IP address where the operation originated from
  • Search/Expand entity: The operation itself (search or expand)
  • Datastore entity: The datastore that the operation was performed against
  • Links: Links between the entities

As seen in IBM i2 Analyst's Notebook, a single search operation when expanded looks like this:

Diagram showing a single search operation
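
Putting these pieces together, here is a hedged sketch of the overall flow inside logQuickSearch. Apart from insertSearchLinkToUserIntoDatabase, which appears later in this tutorial, the insert* helper names are illustrative, not taken from the sample:

	@Override
	public void logQuickSearch(final IQuickSearchAuditEvent event)
	{
		try
		{
			// One Search entity per operation, identified by a new UUID
			final String searchId = insertSearchIntoDatabase(event);

			// The entities that the operation connects to
			insertUserIntoDatabase(event);
			insertIPAddressIntoDatabase(event);
			insertDatastoreIntoDatabase(event);

			// Links from the Search entity to the other entities
			insertSearchLinkToUserIntoDatabase(event, searchId);
			// ... links to IPAddress and Datastore follow the same pattern
		}
		catch (SQLException e)
		{
			// Decide locally how a failed audit write should be handled
			e.printStackTrace();
		}
	}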

To populate a staging table with the details for an entity, the logger uses the event's accessor methods to extract details about the audit event and then inserts them with a PreparedStatement.

In the following example, a user entity is created in the AUDIT.USER staging table:

	// The username serves as both the entity's source identifier and a property
	String user = auditEvent.getUser();

	PreparedStatement insertUser = con.prepareStatement(
		"INSERT INTO AUDIT.USER (SOURCE_ID,P_USERNAME,P_USER_SECURITY_GROUPS) values(?,?,?)");
	insertUser.setString(1, user);
	insertUser.setString(2, user);
	insertUser.setString(3, auditEvent.getUserSecurityGroups().toString());

	insertUser.executeUpdate();

You use the same process to populate the staging tables for other entities. See the supplied example for further details on how the other entities are populated.
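
For example, a sketch for the IPAddress entity might look like the following. The AUDIT.IPADDRESS table and its P_IP_ADDRESS column are assumptions patterned on the AUDIT.USER table; check the staging tables generated from the sample schema for the real names.

	// Assumed table and column names -- verify against the generated staging tables
	String ipAddress = auditEvent.getClientIPAddress();

	PreparedStatement insertIp = con.prepareStatement(
		"INSERT INTO AUDIT.IPADDRESS (SOURCE_ID,P_IP_ADDRESS) values(?,?)");
	insertIp.setString(1, ipAddress);  // the IP address is its own identifier
	insertIp.setString(2, ipAddress);
	insertIp.executeUpdate();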

In the ELP model, an identifier is required to uniquely identify an entity or a link. In this example, we use the following data to identify an entity:

  • User: Username
  • IPAddress: IP address
  • Datastore: Datastore name
  • Search/Expand: UUID string

When the search/expand entity is created in the staging table, a UUID is created and used as the unique identifier. This identifier is then used to link the search/expand entity to the other entities, as shown in the following example:

	uniqueId = UUID.randomUUID().toString();

	PreparedStatement insertSearch = con.prepareStatement("INSERT INTO AUDIT.EXPAND (SOURCE_ID,P_USER,P_CLIENT_IP_ADDRESS...
	...
	insertSearch.setString(1, uniqueId);
	...

The logger then uses this string to create the links between the search/expand entity and the other entities. An example is shown below, where the link between a search entity and a user entity is inserted into a staging table.

	private void insertSearchLinkToUserIntoDatabase(IQuickSearchAuditEvent auditEvent, String searchId)
		throws SQLException
	{
		// A link row needs its own identifier plus the identifiers of the
		// entities at each end of the link
		String linkUniqueId = UUID.randomUUID().toString();
		String username = auditEvent.getUser();
		String linkTimeStamp = auditEvent.getTimestamp().toString();

		PreparedStatement insertLink = con.prepareStatement(
			"INSERT INTO AUDIT.USERTOSEARCHLINK (SOURCE_ID,FROM_SOURCE_ID,TO_SOURCE_ID,P_TIMESTAMP)" +
			" values(?,?,?,?)");

		insertLink.setString(1, linkUniqueId);
		insertLink.setString(2, username);
		insertLink.setString(3, searchId);
		insertLink.setString(4, linkTimeStamp);
		insertLink.executeUpdate();
	}

See DVWorksAuditLoggerSample.jar in the AuditSample.zip file for the code for creating the links to the other entities.

6. Ingest the data

Once the raw data is in the staging tables, it must be ingested into the Information Store. You perform the ingestion by running the ingestInformationStoreRecords command from the <i2InstallationDir>\toolkit\scripts folder.

The ingestion process requires an ingestion mapping file. This file contains the details that the ingestion pipeline needs to ingest the data into the Information Store. For more information regarding ingestion mapping files, see the IBM i2 Analyze Information Store Data Ingestion Guide (under "Related topics" below).

You can find a set of JSON files for use with this example in AuditSample.zip. The example below shows the ingestion pipeline being used to ingest the user entities.

setup -t ingestInformationStoreRecords -p importMappingsFile=C:\IngestionMapping\user.json -p importMappingId=USER

You need to run the ingestion for each entity and link type. AuditSample.zip contains a sample script to perform the ingestion for all the entities and links in this sample. (See "Get the code" above.)

7. Analyze the logging data

Once the data from the logger has been ingested from the staging tables into the Information Store, it is ready to be analyzed.

In i2 Analyst's Notebook, a typical example of a user who has performed a few operations looks like this:

Diagram showing user's operations

From this data, you can determine that a user named Jenny has performed operations as follows:

  • Three search operations and one expand operation.
  • All operations were performed against the Infostore database.
  • All operations came from the same IP address.

This shows the potential of using the information generated by the logger in this way. You can see clearly from this picture exactly what operations a user has performed on the data in the Information Store.

8. Use the supplied example and sample data

This tutorial so far has outlined the auditing process. It has shown how you can build a logger that populates an ELP data model and how you can then analyze that model to see what users are doing. I have provided a sample setup for test purposes (see the AuditSample.zip file). To run the sample:

  1. Download and extract AuditSample.zip. (See "Get the code" above.)
  2. Deploy the i2 Analyze setup with an Information Store and Analysis Repository using any schema you choose. The sample setup also allows you to ingest some sample data into the source system (optional).

    This setup will be the source for the audit logger. This is where audit log records will be generated and used to populate the staging tables on the target system.

    See the i2 Analyze deployment documentation for details on deployment (see "Related topics" below).

  3. Take the DVWorksAuditLoggerSample.jar from the AuditSample.zip file and place it in the <i2DeployLocation>\wlp\usr\servers\awc\apps\awc.war\WEB-INF\lib folder in the source i2 Analyze deployment.
  4. In the <i2DeployLocation>\wlp\usr\servers\awc\apps\awc.war\WEB-INF\lib folder of the source i2 Analyze deployment, open the DiscoServerSettingsMandatory.properties file and update the last line to read:
    # The full class name of the audit logger. If no auditing is required, specify
    # com.i2group.disco.audit.NoOpAuditLogger
    AuditLogger=com.example.audit.DVWorksAuditLoggerSample
  5. Open the <DeployLocation>\wlp\usr\servers\awc\server.datasources.xml file. Within the <server> element, add the lines:
    <dataSource jdbcDriverRef="db2_db2jcc4_jar" id="logger" jndiName="ds/Logger">
        <properties.db2.jcc password="XXX" databaseName="ISTORE" serverName="YYY" user="ZZZ" portNumber="50000"/>
    </dataSource>
  6. Update the database parameters with the details for the Information Store on the audit target setup.

Set up the audit target

  1. Deploy a separate i2 Analyze setup with an Information Store and Analysis Repository using the schema files supplied in the schemas folder of the AuditSample.zip file. This setup will be the target for the audit logger. Refer to the i2 Analyze deployment guide for details.
  2. Take the generateStagingTables.cmd script from AuditSample.zip and place it in the <i2AnalyzeInstallDir>/toolkit/scripts folder.
  3. Run the script. This will generate the staging tables that are the target for the logger on the source i2 Analyze setup.
  4. In either a DB2 Command window or Data Studio, connect to the ISTORE database and run the following command:
    insert into IS_DATA.INGESTION_SOURCES (SOURCE_ID,SOURCE_NAME,SOURCE_DESCRIPTION,RETRIEVAL_BLOCK_TEMPLATE) values ('1','BRIAN','TEST DATA SOURCE','null');

    The ingestion pipeline requires this ingestion sources table entry. Refer to the IBM i2 Analyze Information Store Data Ingestion Guide for further details (under "Related topics" below).
  5. Locate the JSON folder in the AuditSample.zip file. This contains the mapping files required by the ingestion pipeline.
  6. Open the ingest.cmd file found in the AuditSample.zip file. Update <MAPPINGFILESLOCATION> with the full path to the JSON folder in the location where AuditSample.zip was extracted. For example:
    call setup -t ingestInformationStoreRecords -p importMappingsFile=C:\AuditDemo\JSON\user.json -p importMappingId=USER -p verbose=TRUE

    Copy the ingest.cmd file to the <i2AnalyzeInstallDir>/toolkit/scripts folder.

Run the sample

  1. Start the Liberty servers on both the source and target deployments. Check the console.log file on startup for any errors.
  2. On the source deployment, start i2 Analyst's Notebook and connect to the Information Store.
  3. Perform a few search/expand operations against the Information Store.
  4. On the target deployment, in DB2, examine the AUDIT.USER and AUDIT.COMPUTER staging tables. They should now contain some entries. If not, check the console.log file on the source i2 Analyze deployment for errors.
  5. Run the ingest.cmd file updated in the previous section, and check the output for errors.
  6. On the target system, start i2 Analyst's Notebook and connect to the Information Store.
  7. Perform a few search/expand operations against the Information Store with search terms based on the terms entered in step 3.
  8. The searches will return results like those shown in the following figure.
    Screen capture showing results
  9. Try adding these to the chart and performing expand operations.

Ingest sample data (optional)

The AuditSample.zip file contains some example data that can be ingested into the target setup to give a pre-populated sample of audit data. To use this data, complete the following steps on the target system:

  1. Open the loadDemoData.cmd script from the AuditSample.zip file and update <DATALOCATION> with the location where AuditSample.zip was extracted:
    db2 CALL SYSPROC.ADMIN_CMD( 'IMPORT FROM "<DATALOCATION>\computer.csv" OF DEL MESSAGES ON SERVER INSERT INTO AUDIT.COMPUTER' );

    For example:
    db2 CALL SYSPROC.ADMIN_CMD( 'IMPORT FROM "C:\AuditDemo\computer.csv" OF DEL MESSAGES ON SERVER INSERT INTO AUDIT.COMPUTER' );
  2. Copy the loadDemoData.cmd file to <i2AnalyzeInstallDir>/toolkit/scripts.
  3. In a DB2 command window, run the loadDemoData.cmd script.
  4. Check the output for errors. This should load all the CSV data into the staging tables, and each DB2 import operation should produce output like the following code snippet.
    Result set 1
    --------------

    ROWS_READ   ROWS_SKIPPED   ROWS_INSERTED   ROWS_UPDATED   ROWS_REJECTED   ROWS_COMMITTED
    ---------   ------------   -------------   ------------   -------------   --------------
          135              0             135              0               0              135

    1 record(s) selected.
    Return Status = 0
  5. Run ingest.cmd as detailed in the previous section.

The sample data is now ready to be analyzed. You can do this via i2 Analyst's Notebook. For example, searching for the user Edward, adding him to a chart, and expanding him shows this:

Chart showing user Edward's searches

You can see that Edward has been performing numerous searches, some at very late hours and from various IP addresses. This might indicate that Edward is up to no good.

Expanding this whole sample data set on a chart looks like this.

Spotting useful information may be difficult here due to the large number of entities and links, so running some i2 Analyst's Notebook social network analysis might yield some interesting results.

An example is shown below, where a Betweenness calculation is performed on the audit data.

Chart showing betweenness calculation

From this Betweenness calculation, you see the most active IP addresses, the most active users, and the most common search terms; this provides direct insight into who is doing what with the Information Store data and where the operations are coming from.

Conclusion

This tutorial has provided an overview of the audit logging capabilities of i2 Analyze with the Information Store. I've shown how you can use these capabilities to provide analysis-ready audit log information. By using this tutorial and the sample resources available at the IBM i2 Analyze Developer Essentials site, you can implement a tailored audit logging system for any i2 Analyze Information Store.


Downloadable resources


Related topics

