Interpreting data outliers in the investigation dashboard
Guardium provides a graphical interface for identifying and responding to outliers detected by the outlier detection algorithm.
Quick Search must be enabled (grdapi enable_quick_search) to see outlier detection data in the investigation dashboard.
Let's say you have an outlier that shows an exceptional number of errors for user X. Some of the points you want to investigate are:
- Look at the history: is this the first time the user has had outliers? Is it the first time the user has had this type of outlier?
- Compare this user to other users: is this error type unique to this user?
- Check the error types.
- Is the number of errors standard for this user?
- Look at the sensitivity of the tables accessed by the user.
- Compare the actions to other databases.
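Two of these checks (whether the error count is standard for this user, and whether the behavior is unique to this user) can be sketched outside the dashboard. The following is a minimal illustration, not Guardium's detection logic: the user names, the hourly counts, the `threshold` value, and both function names are hypothetical, and the data is assumed to have been exported or collected by some other means.

```python
from statistics import mean, pstdev

# Hypothetical hourly error counts per database user (illustrative data only).
# The last value in each list is the hour under investigation.
error_history = {
    "user_x": [2, 3, 1, 2, 40],
    "user_y": [1, 0, 2, 1, 1],
    "user_z": [5, 4, 6, 5, 6],
}

def is_unusual_for_user(counts, threshold=3.0):
    """Compare the latest count with the user's own earlier history."""
    *history, latest = counts
    spread = pstdev(history) or 1.0  # fall back to 1.0 when the history is flat
    return (latest - mean(history)) / spread > threshold

def users_with_unusual_errors(history_by_user):
    """Return the users whose latest error count is unusual for them."""
    return sorted(
        user for user, counts in history_by_user.items()
        if is_unusual_for_user(counts)
    )

flagged = users_with_unusual_errors(error_history)
print(flagged)
```

If only one user is flagged, the error volume is both non-standard for that user and unique among the users compared; if many users are flagged in the same hour, the cause is more likely to be shared, such as an application or database problem.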
Let's walk through a flow of investigating an outlier.
- Open the investigation dashboard by selecting Data from the User Interface drop-down and clicking Enter, or by entering quick search in the search field and clicking Search for Data Activity. Then add the Activity Chart. (You can change the time interval of the charts at the top of the window.) Red indicators reflect highly anomalous events that require immediate attention. Yellow indicators represent less extreme anomalies that warrant attention as part of other or related investigations.
- Hover over the outlier icon to view further details in a popup. Here you can click Show details to filter the Results Table to activities or outliers that occurred during the same time period.
- Click the outlier to open the Summary tab of the Outlier View, which shows the number of sources that had outliers during the selected time period, and the high and medium outliers.
- Filter the data in the table by using the facets list, an individual search result, or the right-click menu.
- Use the right-click menu in the outliers table to show related activity, related exceptions, or related violations.
- Try monitoring only privileged users to reduce the amount of data and improve focus.
  - You can get good insights into the patterns and usage of privileged user activity. You might see:
    - Users that should NOT be accessing certain data.
    - SQL activity that looks abnormal, which could indicate privileged users who are disguising their activity with SQL attacks. See also Characteristics of an SQL injection attack.
  - Look for Time Of Day Outliers.
- Try monitoring only sensitive objects to reduce the amount of data and improve focus.
  - You can get good insights into the patterns and usage of the users that access these sensitive objects. You might see unusual patterns of access to these objects.
  - Look for Time Of Day Outliers.
  - Look at which utilities (source programs) accessed these objects.
The Outliers tab in the Results Table has two views:
- Summary has one row per source per hour in which an outlier was found, with an anomaly score and one or more reasons. Note that not every outlier presented in the Summary tab has further details in the Details tab.
- Details is a sample of the events that occurred, with one row per event. Each row has a reason (except for diverse events, see the table) and other details (source program, object, verb, and so on). For example, for high volume, the sampling presents the events with the highest score. You can configure the number of samples (rows) that appear in the Details tab for each outlier in the Summary tab.
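The sampling behavior described for high volume outliers (keep the highest-scoring events, up to a configured number of rows per outlier) can be pictured with a short sketch. This is only an illustration under stated assumptions: the record fields, the `sample_details` function, and the `samples_per_outlier` parameter are hypothetical names, not Guardium's schema or configuration keys.

```python
from collections import defaultdict
from heapq import nlargest

# Hypothetical event records; field names are illustrative, not Guardium's schema.
events = [
    {"outlier_id": "A", "verb": "SELECT", "object": "orders", "score": 0.91},
    {"outlier_id": "A", "verb": "UPDATE", "object": "orders", "score": 0.40},
    {"outlier_id": "A", "verb": "DELETE", "object": "orders", "score": 0.75},
    {"outlier_id": "B", "verb": "SELECT", "object": "payroll", "score": 0.66},
]

def sample_details(rows, samples_per_outlier):
    """Keep the highest-scoring events for each outlier, up to the given limit."""
    grouped = defaultdict(list)
    for row in rows:
        grouped[row["outlier_id"]].append(row)
    return {
        outlier_id: nlargest(samples_per_outlier, group, key=lambda r: r["score"])
        for outlier_id, group in grouped.items()
    }

details = sample_details(events, samples_per_outlier=2)
for outlier_id, rows in details.items():
    print(outlier_id, [(r["verb"], r["score"]) for r in rows])
```

With a limit of two, the lowest-scoring event for outlier A is dropped from the sample, which is why the Details tab can show fewer events than actually occurred in that hour.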
This table describes the columns in both the Summary and Details views:
Column name | Description | Further Action |
---|---|---|
Anomaly Score | Summary tab: A calculated aggregate value based on the volume of outliers, the severity of individual events, the predicted volume of outliers for a given time of day, and other factors. For example, on a system that typically identifies 0 outliers at 1am and 5-10 outliers at 1pm on weekdays, the presence of two additional outliers (2 outliers at 1am, or 12 outliers at 1pm) is more significant, and weighted more heavily, than the hourly total itself. Details tab: The anomaly score is relevant only for a high volume event. | Right-click the score to open a menu with additional actions you can perform. In the Details tab the score can be 0, indicating that the individual events are not suspicious on their own, but the accumulated events in that hour are suspicious. |
Textual Description | Description of the outlier activity; may include, for example, database name, user name, object. | |
High volume Outlier | True or False. High volume of activities of some type, for example on an object, of a DB user. | |
Vulnerable obj. Outlier | High volume anomaly in the activities on objects that are members of the “Temporary objects” group. | |
New Outlier | True or False. Unusual volume of object/verb activities that are not normal when compared to previous activity. For example, an admin uncharacteristically creates a high number of new tables, or a user selects a number of objects and performs updates, having never performed updates before. | |
Diverse Outlier | Summary view only. True or False. High volume of different types of activities, for example a DB user performs many more activities than usual, or performs them at an unusual time. A sample of the diverse events does appear in the Details tab, where they can be identified by the database user. Although Diverse is not a column in the Details tab, these events may have other reasons assigned to them; otherwise they appear without a reason. | See the Activity table for more details. |
Error Outlier | True or False. High volume of errors | |
Ongoing Outlier | Summary view only. True or False. Event in the last few hours that was not high enough to create an outlier, but does raise suspicions. | There are no specific events to view. See the Activity table, filter by the database in the facet list, at the time of the suspicious behavior. |
Sensitive Object | | |
Number of Instances | Details view only. Number of times this particular event has been seen in the hour. | |
Records affected | Number of records affected by the particular event. Appears as a negative number if the event does not, by definition, have affected records. | |
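
The Anomaly Score example in the table (two additional outliers matter more than the hourly total) can be made concrete with a small sketch. This is not Guardium's scoring formula, which also weighs event severity and other factors; the `typical_range` table and the `excess_outliers` function are hypothetical and only show why 2 outliers at 1am and 12 outliers at 1pm can be treated alike.

```python
# Hypothetical typical range (low, high) of outliers per hour of day,
# on a 24-hour clock, matching the example in the table:
# 0 outliers at 1am, 5-10 outliers at 1pm.
typical_range = {1: (0, 0), 13: (5, 10)}

def excess_outliers(hour, observed):
    """Return how many outliers exceed the typical upper bound for that hour."""
    _low, high = typical_range[hour]
    return max(0, observed - high)

for hour, observed in [(1, 2), (13, 12), (13, 8)]:
    print(f"hour={hour:02d} observed={observed} excess={excess_outliers(hour, observed)}")
```

Both 2 outliers at 1am and 12 outliers at 1pm exceed the typical level by two, while 8 outliers at 1pm fall within the usual range even though that hourly total is higher than the 1am total.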