Data aggregation
Collect and merge information from multiple Guardium® units into a single Guardium aggregation appliance to offload reporting and analysis from the collectors, while also providing a consolidated view of the data from multiple collectors.
Aggregation Process
- Data is exported daily from each collector to the aggregator, where the daily export files are imported.
- The aggregator then processes the uploaded files, extracting each file and merging its contents into the aggregator's internal repository.
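The extract-and-merge step above can be pictured as a simple loop. This is an illustrative sketch in Python using plain data structures; the function and field names (merge_exports, "records") are hypothetical and not part of any Guardium API:

```python
# Illustrative sketch only: models the aggregator's extract-and-merge step
# using plain Python data structures (not a Guardium API).

def merge_exports(repository, export_files):
    """Merge each uploaded export file's records into the internal repository.

    Records are merged as-is: aggregation does not summarize or roll up data.
    """
    for export in export_files:          # one file per collector per day
        for record in export["records"]:
            repository.append(record)    # merge, never summarize
    return repository

# Two collectors each upload a daily export file.
repo = []
exports = [
    {"collector": "collector-us", "records": [{"db": "orders", "sql": "SELECT"}]},
    {"collector": "collector-eu", "records": [{"db": "hr", "sql": "UPDATE"}]},
]
merge_exports(repo, exports)
print(len(repo))  # 2: both collectors' records, unmodified
```

Note that the repository ends up with one row per source record; nothing is rolled up.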
For example, if you are running Guardium in an enterprise deployment, you may have multiple Guardium servers monitoring different environments (different geographic locations or business units, for example). It may be useful to collect all data in a central location to facilitate an enterprise view of database usage. You can accomplish this by exporting data from a number of servers to another server that has been configured (during the initial installation procedures) as an aggregation appliance. In such a deployment, you typically run all reports, assessments, audit processes, and so forth, on the aggregation appliance to achieve a wider view (not necessarily an enterprise-wide view). The aggregator does not collect data itself; it presents the data merged from the collectors.
Aggregation does not summarize or roll-up the data. It merges the records.
See the pre-defined aggregation reports for examples.

By default, all static tables on an aggregator are archived daily. Adding the static tables to the normal purge process eliminates orphans, freeing up disk space and improving report performance.
Archive and export of static tables on an aggregator includes full static data only on the first day of the month (archive) or when the export configuration changes (export). Use the CLI commands store archive_table_by_date [enable | disable] and show archive_table_by_date. Other relevant CLI commands are store aggregator clean orphans and show aggregator clean orphans.
Hierarchical Aggregation
Guardium supports hierarchical aggregation, where multiple aggregators merge upwards to a higher-level, central aggregator. This is useful for multi-level views. For example, you may need to deploy one aggregator for North America aggregating multiple units, and another aggregator for Asia aggregating multiple units, and a central, global aggregator merging the contents of the North America and Asia aggregators into a single corporate view. To consolidate data, all aggregators export data to the global aggregator on a scheduled basis. The global aggregator combines that data into a single database (in the global aggregator), so that reports run on the global aggregator use the consolidated data from all of the lower-level aggregators.
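The two-level merge described above can be sketched as nested merges. This is an illustrative Python sketch with made-up record values, not Guardium code; the point is that each level merges records as-is:

```python
# Sketch of hierarchical aggregation as nested plain-Python merges
# (illustrative only; not a Guardium API).

def merge(records_lists):
    """Merge record lists without summarizing (records are kept as-is)."""
    merged = []
    for records in records_lists:
        merged.extend(records)
    return merged

# Regional aggregators merge their collectors' daily exports...
north_america = merge([["na-collector-1-rec"], ["na-collector-2-rec"]])
asia = merge([["asia-collector-1-rec"]])

# ...and export upward to the central, global aggregator.
global_view = merge([north_america, asia])
print(len(global_view))  # 3: all records in one corporate view
```

Reports run against global_view here would see every lower-level record, mirroring how reports on the global aggregator use the consolidated data.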
Aggregating, Archiving, and Purging Operations
The data is transferred through daily batch files by using SCP. A daily data export is scheduled on the source and a corresponding data import is scheduled on the aggregator. There is an option to use a secondary aggregator in case the primary aggregator is unreachable. On either or both units, archive and purge operations are scheduled to back up and purge data on a regular basis (both to free up space and to speed up access operations on the internal database).
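The primary/secondary failover behavior can be sketched as a try-then-fallback around the transfer. The following Python sketch is illustrative only: scp_upload is a hypothetical stand-in for the actual SCP transfer, and the host names are made up:

```python
# Sketch of the daily export step with primary/secondary aggregator failover.
# scp_upload is a hypothetical stand-in for the real SCP transfer.

def scp_upload(host, filename, reachable_hosts):
    """Simulated SCP: succeeds only if the host is reachable."""
    if host not in reachable_hosts:
        raise ConnectionError(f"{host} unreachable")
    return f"{filename} -> {host}"

def export_daily_file(filename, primary, secondary, reachable_hosts):
    """Send the daily batch file to the primary aggregator, falling back
    to the secondary aggregator if the primary is unreachable."""
    try:
        return scp_upload(primary, filename, reachable_hosts)
    except ConnectionError:
        return scp_upload(secondary, filename, reachable_hosts)

# Primary is down, so the export goes to the secondary aggregator.
result = export_daily_file("export-2024-01-15.tgz", "agg-primary",
                           "agg-secondary", reachable_hosts={"agg-secondary"})
print(result)  # export-2024-01-15.tgz -> agg-secondary
```

In the real deployment, the corresponding import job on whichever aggregator received the file then merges it on its own schedule.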
CAS data is also aggregated and archived.
Orphan cleanup on aggregators
When the aggregator includes restored data, orphan cleanup for the restored data is scheduled according to the expiration date set when the data was first restored.
Changing the expiration date later through API commands does not affect the date on which the restored data becomes available for orphan cleanup.
For example: The user restores data and wants to keep it for 7 days. The expiration date of this data is therefore 7 days from today, and the data becomes available for orphan cleanup after 7 days.
If the expiration date is then changed (for example, to keep the data for a shorter or longer period), this does not affect the date on which the data becomes available for orphan cleanup. Pay particular attention to this when you extend the expiration period, so that you do not lose data. The rest of the data on the managed unit is available for orphan cleanup as originally configured.
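The restored-data rule above can be sketched as two dates, one of which is frozen at restore time. This is an illustrative Python model (the class and method names are hypothetical, not a Guardium API):

```python
# Sketch of the restored-data rule: the orphan-cleanup date is fixed when
# data is first restored, and later expiration changes do not move it.
from datetime import date, timedelta

class RestoredData:
    def __init__(self, restore_day, keep_days):
        self.expiration = restore_day + timedelta(days=keep_days)
        # Cleanup eligibility is captured once, at restore time.
        self.orphan_cleanup_date = self.expiration

    def change_expiration(self, restore_day, new_keep_days):
        # API-style change: moves the expiration, NOT the cleanup date.
        self.expiration = restore_day + timedelta(days=new_keep_days)

today = date(2024, 1, 1)
data = RestoredData(today, keep_days=7)        # keep for 7 days
data.change_expiration(today, new_keep_days=30)  # later extended to 30 days

print(data.orphan_cleanup_date)  # 2024-01-08: still 7 days after restore
print(data.expiration)           # 2024-01-31: extended past the cleanup date
```

The mismatch in the final state is exactly the data-loss risk the text warns about: the data can become eligible for orphan cleanup before its extended expiration date.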
Calculating maximum number of Collectors per Aggregator
When a Guardium system is built from an .ISO, a default value of 10 for the maximum number of collectors per aggregator is set.
When a customer upgrades the Guardium system, the system calculates the maximum number of collectors using the following logic:
- Step 1: Get the number of collectors from an internal Guardium table.
- Step 2: If Step 1 finds no collectors (a result of 0), set the value to the default of 10.
- Step 3: Add 20 percent to the value from the previous step.
- For example, if Step 1 finds no collectors, Step 2 sets the value to 10, and Step 3 adds 20%, making it 12.
- As another example, if Step 1 finds five collectors exporting to an aggregator, the value is set to 5 (Step 2 does not apply because the result is not 0), and Step 3 adds 20%, setting the value to 6.
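The steps above reduce to a short calculation. This Python sketch reproduces the documented logic (it is not Guardium code; the function name is hypothetical):

```python
# Sketch of the upgrade-time "maximum collectors per aggregator" logic.
import math

def max_collectors(found):
    """Return the maximum number of collectors per aggregator.

    found -- number of collectors read from the internal Guardium table.
    """
    # Step 2: fall back to the default of 10 when no collectors are found.
    base = found if found > 0 else 10
    # Step 3: add 20 percent headroom.
    return math.ceil(base * 1.2)

print(max_collectors(0))  # 12: default 10 plus 20%
print(max_collectors(5))  # 6: five collectors plus 20%
```

Both worked examples from the text fall out directly: 0 found gives 10 + 20% = 12, and 5 found gives 5 + 20% = 6.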