Data Mart
A Data Mart is a subset of a Data Warehouse. A Data Warehouse aggregates and organizes data in a generic fashion that can be used later for analysis and reports. A Data Mart begins with user-defined data analysis and emphasizes meeting the specific demands of the user in terms of content, presentation, and ease of use.
Use this feature to:
Define and generate a Data Mart.
Aggregate summarized and analyzed data from all units to enable a high-level, corporate view within a reasonable response time.
Improve performance of online reports on Guardium Aggregators.
Provide interactive analysis capabilities for finding patterns, trends, and outliers.
Enable collapsing and expanding levels of data.
A Data Mart is practical and efficient for all of the Guardium predefined reports. It prepares the data in advance to avoid overload, full scans, and poor performance.
The Data Mart Configuration icon is available from any Predefined Report.
Highlights of benefits:
Provide Guardium Analytic capability that supports the full lifecycle of data analysis.
The analytic process starts from the Query Builder and Pivot Table Builder, where users define their data analysis needs and then select “Set As Data Mart”.
The Data Mart extraction program runs as a batch job according to the specified schedule. It summarizes the data to hours, days, weeks, or months according to the requested granularity, and then saves the results in a new table in the Guardium Analytic database.
The data is then accessible to users via the standard Reports and Audit Process utilities, like any other Domain/Entity. The Data Mart extraction data is available under the DM domain, and the Entity name is set according to the new table name specified for the data mart data. Using the standard Query Builder and Report Builder, users can clone the default query, edit the query and report, generate a portlet, and add it to a pane.
The summarization shrinks the data volume significantly. It eliminates joins of many tables by storing the data analysis in a denormalized, pre-calculated table.
The corporate view is supported by using the standard Aggregation utility for the new Guardium Analytic tables. If there is a huge amount of detailed row data at the higher levels of the aggregation hierarchy, the Selective Aggregation feature, which enables aggregation of specific modules, can be configured to aggregate analytic data only.
The Data Mart builder is accessible via Query builder, Report Results, and Pivot-Table view.
Select the Set As Data Mart icon. The button is available only after the query, report, or pivot table has been saved.
Access to the screen is enabled for users with Data Mart Building permission (User Role Permission). The Set As Data Mart button is displayed only for users with the appropriate permission.
Data Mart persistency: changes to the original Query, Report, or Pivot Table do not affect the Data Mart; a snapshot of the originating analysis definition is saved together with the Data Mart upon creation.
If the Data Mart is based on a Pivot Table, the extraction process does not calculate the Total line (sum of columns), and Percent Of Column is not supported.
In addition to the Data Mart definition, the following are created by the Data Mart Definition process:
New Domain and Entity
Default Query
Default Report and portlet
New Data Mart table in the new “DATAMART” database to store the extracted data
- Data Mart – Query and Report Builder
The Data Mart definition process creates a new Domain, Entity, default Query, and Report. The default Query and Report are accessible via the Report Building menu.
Clicking Data Mart opens the Query Finder GUI; the Query, Report, and Entity fields filter only Data Mart domains (domain names start with DatamartDefinition.DOMAIN_PREFIX).
Report Builder GUI: the default Data Mart reports and all other reports related to Data Mart domains are available in the Report Builder GUI.
Follow these steps:
As an Admin user, select the Data Mart icon.
Select New to create a new Data Mart or select from the list of previously created Data Marts.
Complete the fields for the Data Mart name and Table name (the default is DM). Specify a time granularity and select an initial start time from the calendar icon. A description is optional.
Use the Scheduler to schedule when to run this feature, or use Run Once Now.
Use the Roles section to restrict the Data Mart to users with the appropriate permission.
Save the configuration.
Note: Changes to the originating query/report do not affect the existing Data Mart.
Note: When a data mart extraction runs (scheduled or Run Once Now) for the first time, it extracts data from the initial start date to the current time, based on the time granularity. It saves the next period start (“next period from”) in the DM_EXTRACTION_STATE table. On the next run, it extracts data starting from that “next period from” value. If a data mart extraction is requested for a time earlier than “next period from”, the data mart extraction shows as empty, because the extraction has already processed that time period. To extract data earlier than “next period from”, restore the old data and then run the data mart again.
- Central Management and Data Mart
In a Central Management environment, the configuration is distributed automatically to the managed units.
The extraction schedule can be overridden on a Managed Unit.
In case of multiple Central Managers, the Data Mart definition can be cloned by using the Export/Import capability.
Add Data Mart Extraction schedule to the Central Manager Distribution screen.
Datamart extraction
Data extracted:
- Export of: Exception Log - Details the exceptions/errors captured by Guardium. The log includes the exception/error description, user name, source address, DB protocol, and more.
- Export of: Session Log - Includes details about datasource sessions (login to logout). The log includes session start and end timestamps, the OS and DB user of the session, source program, and more.
- Export of: Session Log Ended - Sessions may extend over a long period. The extraction runs hourly; this log sends the sessions that ended after the hour in which they started.
- Export of: Access Log - Includes details of the connection information and the activity summary per hour. The log includes the OS and DB user, successful and failed SQLs, client and server IP, and more.
- Export of: Full SQL - This log includes the executed SQL details. The log includes the full SQL, records affected, session ID, and more.
- Export of: Outliers List - This log includes the outliers. The log includes server IP, DB user, outlier type, DB, and more.
- Export of: Outliers Summary - This log includes an hourly summary of outliers. The log includes server IP, DB user, DB, and more.
- Export of: Group Members - Includes a log of all group members. The log includes group type, group description, group member, and tuple flag.
- Export of: Export Extraction Log - Includes a log of data relevant to all export or copy files whose names start with “Export:”.
- Export of: Policy Violations - A policy violation is logged each time a policy rule is triggered. This log includes details about the logged violations, such as DB User, Source Program, Access Rule Description, Full SQL String, and more.
- Export of: Buff Usage Monitor - Provides an extensive set of sniffer buffer usage statistics.
- Export of: VA Results - Provides VA results.
- Export of: Policy Violations - Detailed - The same as Policy Violations, but also includes Object/Verb tuples. It is recommended that only one of them be used.
- Export of: Access Log - Detailed - The same as Access Log, but also includes the following fields from the Application Event entity: Event User Name, Event Type, Event Value Str, Event Value Num, Event Date. It is recommended to use either Access Log or Access Log - Detailed, but not both.
- Export of: Discovered Instances - Provides the results of the S-TAP Discovery application, which discovers database instances.
- Export of: Databases Discovered
- Export of: Classifier Results
- Export of: Datasources
- Export of: S-TAP status
- Export of: Installed Patches
- Export of: System Info
- Export of: User – Role
- Export of: Classification Process Log
- Export of: Outliers List - enhanced
- Export of: Outliers Summary by hour - enhanced
Datamart Name | Report Title | Unit Type | Datamart ID |
---|---|---|---|
Export:Access Log | Export: Access Log | Collector | 22 |
Export:Session Log | Export: Session Log | Collector | 23 |
Export:Session Log Ended | Export: Session Log | Collector | 24 |
Export:Exception Log | Export: Exception Log | Any | 25 |
Export:Full SQL | Export: Full SQL | Collector | 26 |
Export:Outliers List | Analytic Outliers List | Any | 27 |
Export:Outliers Summary by hour | Analytic Outliers Summary By Date | Any | 28 |
Export:Export Extraction Log | User Defined Extraction Log | Any | 31 |
Export:Group Members | Export:Group Members | Any | 29 |
Export:Policy Violations | Export:Policy Violations | Collector | 32 |
Export:Buff Usage Monitor | Buff Usage Monitor | Any | 33 |
Export:VA Results | Security Assessment Export | Any | 34 |
Export:Policy Violations - Detailed | Export:Policy Violations | Collector | 38 |
Export:Access Log - Detailed | Export: Access Log | Collector | 39 |
Export:Discovered Instances | Discovered Instances | Any | 40 |
Export:Databases Discovered | Databases Discovered | Any | 41 |
Export:Classifier Results | Classifier Results | Any | 42 |
Export:Datasources | Data-Sources | Central Manager, Standalone | 43 |
Export:STAP Status | S-TAP Status Monitor | Collector | 44 |
Export:Installed Patches | Installed Patches | Any | 45 |
Export:System Info | Installed Patches | Any | 46 |
Export:User - Role | User - Role | Central Manager, Standalone | 47 |
Export:Classification Process Log | Classification Process Log | Any | 48 |
Export:Outliers List - enhanced | Analytic Outliers List - enhanced | Any | 49 |
Export:Outliers Summary by hour - enhanced | Analytic Outliers Summary by Date - enhanced | Any | 50 |
- Issue Summary
The DataMart mechanism exports Guardium sniffer data periodically, based on the defined query.
Output files may be written to an external machine on demand (configurable).
Extracted files are compressed.
Extraction may be scheduled (the default is hourly).
Extracted file names are prefixed with the Global ID and the short host name of the source machine.
Extracted files may include column headings (attribute descriptions).
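For illustration, based on the COMPLETE-file example later in this document, a transferred Session Log file name would follow this pattern (the Global ID 1762144738 and host name gibm32 are taken from that example):
[Global_ID]_[short host name]_[file name]_[period timestamp].gz
1762144738_gibm32_EXP_SESSION_LOG_20151028230000.gz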
- How to Use
All the examples shown below are for the “Export:Exception Log” DataMart; for other extractions, change the name to one of the following:
"Export:Access Log"
"Export:Session Log"
"Export:Session Log Ended"
"Export:Exception Log"
"Export:Full SQL"
"Export:Outliers List"
"Export:Outliers Summary by hour"
"Export:Group Members"
"Export:Export Extraction Log"
"Export:Policy Violations"
"Export:Buff Usage Monitor"
"Export:VA Results"
"Export:Classifier Results"
"Export:Databases Discovered"
"Export:Access Log - Detailed"
"Export:Discovered Instances"
"Export:Datasources"
"Export:STAP Status"
"Export:Installed Patches"
"Export:System Info"
"Export:User - Role"
"Export:Classification Process Log"
"Export:Outliers List - enhanced"
"Export:Outliers Summary by hour - enhanced"
The export extractions are pre-defined in the system (via the DataMarts mechanism) and are disabled by default. To enable the export extractions (all or specific ones), schedule the DataMarts via grdapi as shown below. You can also use the GUI for this.
Schedule Job for DataMart extraction:
grdapi schedule_job jobType=dataMartExtraction cronString="0 1 0/1 ? * 1,2,3,4,5,6,7" objectName="Export:Exception Log" startTime="YYYY-MM-DD HH:MM:SS"
Note that startTime is used to set a future start time if needed; it can be removed if you want to start the DataMart immediately.
To delete specific export extractions, run the following:
Delete scheduled Job:
grdapi delete_schedule deleteJob="true" jobGroup="DataMartExtractionJobGroup" jobname="DataMartExtractionJob_25"
jobname | job description/ objectName |
---|---|
DataMartExtractionJob_22 | Export:Access Log |
DataMartExtractionJob_23 | Export:Session Log |
DataMartExtractionJob_24 | Export:Session Log Ended |
DataMartExtractionJob_25 | Export:Exception Log |
DataMartExtractionJob_26 | Export:Full SQL |
DataMartExtractionJob_27 | Export:Outliers List |
DataMartExtractionJob_28 | Export:Outliers Summary by hour |
DataMartExtractionJob_29 | Export:Group Members |
DataMartExtractionJob_31 | Export:Export Extraction Log |
DataMartExtractionJob_32 | Export:Policy Violations |
DataMartExtractionJob_33 | Export:Buff Usage Monitor |
DataMartExtractionJob_34 | Export:VA Results |
DataMartExtractionJob_38 | Export:Policy Violations – Detailed |
DataMartExtractionJob_39 | Export:Access Log – Detailed |
DataMartExtractionJob_40 | Export:Discovered Instances |
DataMartExtractionJob_41 | Export:Databases Discovered |
DataMartExtractionJob_42 | Export:Classifier Results |
DataMartExtractionJob_43 | Export:Datasources |
DataMartExtractionJob_44 | Export:STAP Status |
DataMartExtractionJob_45 | Export:Installed Patches |
DataMartExtractionJob_46 | Export:System Info |
DataMartExtractionJob_47 | Export:User - Role |
DataMartExtractionJob_48 | Export:Classification Process Log |
DataMartExtractionJob_49 | Export:Outliers List - enhanced |
DataMartExtractionJob_50 | Export:Outliers Summary by hour - enhanced |
You may enable or disable an extraction by using the following commands:
Set DataMart active:
grdapi datamart_set_active Name="Export:Exception Log"
Set DataMart inactive:
grdapi datamart_set_inactive Name="Export:Exception Log"
Include header in DataMart extraction:
You can determine whether to include the header line (column names) in the output CSV file via the following grdapi:
grdapi datamart_include_file_header Name="Export:Exception Log" includeFileHeader="Yes"
Set target host details:
To set the target host for the export extraction, set the machine host, path, and credentials via the following grdapi:
grdapi datamart_update_copy_file_info destinationHost="Machine_Host" destinationPassword="********" destinationPath="/where/to/store/" destinationUser="user" Name="Export:Exception Log" transferMethod="SCP" withCOMPLETEfile=false
The withCOMPLETEfile parameter is optional. The default value is true. If set to true, a COMPLETE file is sent after the data file is successfully transferred. See the “COMPLETE file” section for details.
During the execution of this command, a dummy file is sent to the target machine to validate the connection details. You can also use the datamart_validate_copy_file_info grdapi for that.
Validate a connection to a target host:
To validate a connection to a target host, use the following grdapi:
grdapi datamart_validate_copy_file_info destinationHost="Machine_Host" destinationPassword="********" destinationPath="/where/to/store/" destinationUser="user" transferMethod="SCP"
You can track the extraction log via the pre-defined “Datamart Extraction Log” report. The report is available via the Report Builder screen; you can add it to a pane.
Open the customize option in the “Datamart Extraction Log” report and define the following:
- Enter Value for Name LIKE : %
- Enter Period From >= : YYYY-MM-DD HH:MM:SS (input a date in the past)
- Enter Value for Status Like : %
Click Update; the Datamart Extraction Log report becomes active and shows the latest extractions.
Scheduler recommended start time
Outlier DataMarts should be scheduled around 10 minutes past the hour, because the data is not ready before then; outlier processing starts at the top of each hour.
It makes sense to schedule Access Log, Exception Log, Full SQL, and Session Log/Ended with some time gaps. To get consistent data on each run, Session Log/Ended must be scheduled last (see the example command after the table below).
Our recommendation
Job description | Recommended cronString | Every hour on: |
---|---|---|
Export:Access Log | 0 40 0/1 ? * 1,2,3,4,5,6,7 | 00:40 |
Export:Session Log | 0 45 0/1 ? * 1,2,3,4,5,6,7 | 00:45 |
Export:Session Log Ended | 0 46 0/1 ? * 1,2,3,4,5,6,7 | 00:46 |
Export:Exception Log | 0 25 0/1 ? * 1,2,3,4,5,6,7 | 00:25 |
Export:Full SQL | 0 30 0/1 ? * 1,2,3,4,5,6,7 | 00:30 |
Export:Outliers List | 0 10 0/1 ? * 1,2,3,4,5,6,7 | 00:10 |
Export:Outliers Summary by hour | 0 10 0/1 ? * 1,2,3,4,5,6,7 | 00:10 |
Export:Export Extraction Log | 0 50 0/1 ? * 1,2,3,4,5,6,7 | 00:50 |
Export:Group Members | 0 15 0/1 ? * 1,2,3,4,5,6,7 | 00:15 |
Export:Policy Violations | 0 5 0/1 ? * 1,2,3,4,5,6,7 | 00:05 |
Export:Buff Usage Monitor | 0 12 0/1 ? * 1,2,3,4,5,6,7 | 00:12 |
Export:VA Results | 0 0 2 ? * 1,2,3,4,5,6,7 | Daily at 2 AM |
Export:Policy Violations - Detailed | 0 5 0/1 ? * 1,2,3,4,5,6,7 | 00:05 |
Export:Access Log - Detailed | 0 40 0/1 ? * 1,2,3,4,5,6,7 | 00:40 |
Export:Discovered Instances | 0 20 0/1 ? * 1,2,3,4,5,6,7 | 00:20 |
Export:Databases Discovered | 0 20 0/1 ? * 1,2,3,4,5,6,7 | 00:20 |
Export:Classifier Results | 0 20 0/1 ? * 1,2,3,4,5,6,7 | 00:20 |
Export:Datasources | 0 0 7 ? * 1,2,3,4,5,6,7 | Daily at 7 AM |
Export:STAP Status | 0 0/5 0/1 ? * 1,2,3,4,5,6,7 | Every 5 minutes |
Export:Installed Patches | 0 0 5 ? * 1,2,3,4,5,6,7 | Daily at 5 AM |
Export:System Info | 0 0 5 ? * 1,2,3,4,5,6,7 | Daily at 5 AM |
Export:User - Role | 0 5 0/1 ? * 1,2,3,4,5,6,7 | 00:05 |
Export:Classification Process Log | 0 25 0/1 ? * 1,2,3,4,5,6,7 | 00:25 |
Export:Outliers List - enhanced | 0 10 0/1 ? * 1,2,3,4,5,6,7 | 00:10 |
Export:Outliers Summary by hour - enhanced | 0 10 0/1 ? * 1,2,3,4,5,6,7 | 00:10 |
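For example, to apply the recommended hourly schedule for the Outliers List extraction, the schedule_job command shown earlier can be combined with the cron string from the table above:
grdapi schedule_job jobType=dataMartExtraction cronString="0 10 0/1 ? * 1,2,3,4,5,6,7" objectName="Export:Outliers List"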
Purge /var/exportdir
If a file transfer fails for any reason (for example, the target machine is down), the transfer is retried on the next run. The backlog is kept in the /var/exportdir directory. The purge process cleans up backlog older than 1 day.
COMPLETE file
The empty COMPLETE file is sent to notify an external system that a file is ready.
- For each file, in addition to the file a COMPLETE file is also sent. The COMPLETE file name is [file name]_COMPLETE.gz
1762144738_gibm32_EXP_SESSION_LOG_20151028230000_COMPLETE.gz
- The process is synchronous: first the file (for example, the SESSION LOG file) is generated, then it is copied, and only when copying has finished is the COMPLETE file generated and copied.
- A COMPLETE file is sent even if there is no data to send.
Change datamart initial start:
To change the datamart initial start time, use the update_datamart grdapi.
grdapi update_datamart Name="Export:Session Log" initial_start=[initial start value]
For example,
Set initial start to current time
grdapi update_datamart Name="Export:Session Log" initial_start=< >
Set initial start to Oct 1, 2016
grdapi update_datamart Name="Export:Session Log" initial_start="2016-10-01 00:00:00"
Copy file bundle
It is possible to bundle a number of CSV export datamarts together. A bundle has a main datamart. Each datamart included in the bundle extracts data based on its own scheduling. After the main datamart has extracted its data, it puts the data files from all datamarts included in the bundle into the same tar file and sends it to the destination server. The main datamart must have the latest schedule.
For example, a bundle might include “Export:Full SQL”, “Export:Exception Log”, and “Export:Session Log”, with “Export:Session Log Ended” as the main datamart.
The recommended scheduling for this bundle:
Job description | Recommended cronString | Every hour on: |
---|---|---|
Export:Session Log | 0 45 0/1 ? * 1,2,3,4,5,6,7 | 00:45 |
Export:Session Log Ended | 0 46 0/1 ? * 1,2,3,4,5,6,7 | 00:46 |
Export:Exception Log | 0 25 0/1 ? * 1,2,3,4,5,6,7 | 00:25 |
Export:Full SQL | 0 30 0/1 ? * 1,2,3,4,5,6,7 | 00:30 |
Create a bundle:
grdapi datamart_copy_file_bundle action="create" bundle_name=[bundle name] main_datamart_name=[bundle main datamart name]
Include a datamart in a bundle:
grdapi datamart_copy_file_bundle action="include" bundle_name=[bundle name] datamart_name=[datamart name]
Exclude a datamart from a bundle:
grdapi datamart_copy_file_bundle action="exclude" bundle_name=[bundle name] datamart_name=[datamart name]
Delete a bundle:
grdapi datamart_copy_file_bundle action="delete" bundle_name=[bundle name]
Get a bundle information:
grdapi datamart_copy_file_bundle action="info" bundle_name=[bundle name]
Example:
grdapi datamart_copy_file_bundle action="create" bundle_name="SFE_BUNDLE" main_datamart_name="Export:Session Log Ended"
grdapi datamart_copy_file_bundle action="include" bundle_name="SFE_BUNDLE" datamart_name="Export:Exception Log"
grdapi datamart_copy_file_bundle action="include" bundle_name="SFE_BUNDLE" datamart_name="Export:Full SQL"
grdapi datamart_copy_file_bundle action="include" bundle_name="SFE_BUNDLE" datamart_name="Export:Session Log"
> grdapi datamart_copy_file_bundle action="info" bundle_name="SFE_BUNDLE" main_datamart_name="Export:Session Log" datamart_name=""
ID=0
=========================================
Bundle Name: SFE_BUNDLE
=========================================
Main Datamart: Export:Session Log Ended
Datamarts:
Export:Full SQL
Export:Exception Log
Export:Session Log
Get_Datamart Info grdapi
The get_datamart_info grdapi returns detailed datamart information.
get_datamart_info datamart_name=[Datamart Name]
Example:
grdapi get_datamart_info datamart_name="Export:Export Extraction Log"
=========================================
Data Mart Name: Export:Export Extraction Log
=========================================
Description:
Based on Report: User Defined Extraction Log
Based on Query: User Defined Extraction Log
Extract result to: File
Initial Start: 2016-04-18 09:00:00
Creation Date: 2016-12-28 18:01:24
Time Granularity: 1 HOUR
Active: true
---------------------
File Name: EXP_DM_EXTRACTION_LOG
Lines per File: 500000
File Header: "UTC Offset","Name","Period Start","Period End","Run Id","Start Time","End Time","Status","File Status","Records Extracted","Details","Timestamp"
Include File Header: true
-----------------------------------------
Copy File Info
-----------------------------------------
Host Name: host.com
User Name: admin
Directory: /local/incoming/
Transfer Method: SCP
Bundle Name:
Bundle Main Datamart: false
Send COMPLETE File: false
-----------------------------------------
Last Extraction Info
-----------------------------------------
State:1
---------------------
Timestamp: 2017-01-18 14:50:00
Next Period: 2017-01-18 14:00:00
Last Extracted ID: 0
---------------------
Extraction Log
---------------------
Timestamp: 2017-01-18 14:50:02
Extract Status: OK
Start Time: 2017-01-18 14:50:00
End Time: 2017-01-18 14:50:00
Period Start: 2017-01-18 13:00:00
Period End: 2017-01-18 14:00:00
Records Extracted: 26
Details: SCP to: host.com, User: admin, Path: /local/incoming/, File: DMv2_gibm32_EXP_DM_EXTRACTION_LOG_20170118180000.gz
Last for Period: true
File Name: /var/dump/DATAMART/EXP_DM_EXTRACTION_LOG_20170118180000.csv
Bundle Name:
File Transfer Status: Done
Comments
================
The Full SQL DataMart works only if log full details or log masked details is defined and installed.
Outlier DataMarts work only if outlier detection is enabled.
If the DataMart scheduler has been stopped for some time and you do not want data to be extracted retroactively, then before you reschedule the extractions to run again, set the correct “Initial Start” in the Data Mart Configuration screen.
User Defined DataMarts
User-defined DataMarts can also be used to transfer data to a destination host. The DataMart must be of type File, the Data Mart name must start with “Export:”, and the file name must start with “EXP_”.
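For illustration, assuming a user-defined datamart named “Export:My Custom DM” (a hypothetical name used only as an example), its destination can be set with the same datamart_update_copy_file_info grdapi shown earlier:
grdapi datamart_update_copy_file_info destinationHost="Machine_Host" destinationPassword="********" destinationPath="/where/to/store/" destinationUser="user" Name="Export:My Custom DM" transferMethod="SCP"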
Dependencies
================
How to apply Patch:
================
http://www-01.ibm.com/support/knowledgecenter/SSMPHH_8.2.0/com.ibm.guardium.using.doc/topics/how_to_install_patches.html?lang=en
- GuardAPI commands for Data Mart
grdapi datamart_copy_file_bundle
function parameters :
action - String - required - Constant values list
bundle_name - String - required
datamart_name - String
main_datamart_name - String
grdapi datamart_include_file_header
function parameters :
includeFileHeader - String - required - Constant values list
Name - String - required
grdapi datamart_set_active
function parameters :
Name - String - required
grdapi datamart_set_inactive
function parameters :
Name - String - required
grdapi datamart_update_copy_file_info
function parameters :
destinationHost - String
destinationPassword - String
destinationPath - String
destinationUser - String
Name - String - required
transferMethod - String - Constant values list
withCOMPLETEfile - Boolean
grdapi datamart_validate_copy_file_info
function parameters :
destinationHost - String - required
destinationPassword - String - required
destinationPath - String - required
destinationUser - String - required
transferMethod - String - Constant values list
grdapi update_datamart
function parameters :
Comment - String
initial_start - Date
Name - String - required
grdapi get_datamart_info
function parameters :
datamart_name - String - required
isExtended - Boolean
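For example, extended information for the Exception Log datamart can be requested via the documented isExtended parameter (the exact output depends on the system):
grdapi get_datamart_info datamart_name="Export:Exception Log" isExtended=true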
grdapi add_dm_to_profile
function parameters:
category - String
cron_string - String
datamart_name - String - required - Constant values list
profile_name - String - required - Constant values list
unit_type - String - Constant values list
api_target_host - String
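For illustration, the Exception Log datamart might be added to a distribution profile as follows; “MY_PROFILE” is a hypothetical profile name used only as an example, and the cron string is taken from the recommendations table above:
grdapi add_dm_to_profile datamart_name="Export:Exception Log" profile_name="MY_PROFILE" cron_string="0 25 0/1 ? * 1,2,3,4,5,6,7"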
grdapi remove_dm_from_profile
function parameters:
datamart_name - String - required
profile_name - String - required - Constant values list
api_target_host - String
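For illustration, removing the same datamart from that hypothetical profile:
grdapi remove_dm_from_profile datamart_name="Export:Exception Log" profile_name="MY_PROFILE"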