Overview

The Guardium® universal connector enables Guardium Data Protection to collect data from the native activity logs of virtually any data source, without using S-TAPs. It includes support for various plug-in packages that require minimal configuration. You can also develop plug-ins for other data sources and install them in Guardium.

The captured events can include any message type that the configured data source supports: informational and administrative system logs (for example, login logs and data that is related to native data lake platform plug-ins), DDLs and DMLs, errors of varying subtypes, and so on. The incoming events that the universal connector receives can be configured to arrive either encrypted or as plain text.
Figure 1. Guardium Universal Connector architecture

The Guardium universal connector supports many platforms and connectivity options. It supports pull and push modes, multiple protocols, and both on-premises and cloud platforms. For data sources with predefined plug-ins, you configure Guardium to accept audit logs from the data source.

For data sources that do not have predefined plug-ins, you can customize the filtering and parsing components for your audit trails and log formats. The open architecture enables reuse of prebuilt filters and parsers, and the creation of a shared library for the Guardium community.

The Guardium universal connector identifies and parses the received events, and converts them to a standard Guardium format. The output of the Guardium universal connector is forwarded to the Guardium sniffer on the collector for policy and audit enforcement. As usual, the Guardium policy determines whether the activities are legitimate, when to alert, and the auditing level per activity.

The Guardium universal connector is scalable. It provides load-balancing and fail-over mechanisms across a deployment of universal connector instances, which run either as a set of Guardium collectors in Guardium Data Protection or as a set of universal connector pods in Guardium Insights. The load-balancing mechanism distributes the events that the data source sends among the universal connector instances that are installed on the Guardium endpoints (for Guardium Data Protection, the collectors). For more information, see Enabling Load-Balancing and Fail-Over.

Connections to databases that are configured with the Guardium universal connector are handled the same way as all other data sources in Guardium. For example, you can apply policies, view reports, and monitor connections.

How do universal connectors work?

The universal connector is a Logstash pipeline that is composed of three plug-ins:

  1. Input plug-in. This plug-in loads events. Depending on the type of plug-in, settings are available to either pull events from APIs or receive a push of events.

  2. Filter plug-in. This plug-in filters the events that the input plug-in captures. The filter plug-in parses, filters, and modifies event logs into a Guardium-digestible format.

  3. Output plug-in. This plug-in receives the formatted event logs from the filter plug-in and transmits them to IBM Guardium (either Guardium Data Protection or Guardium Insights).

Note: The output plug-in is an internal component of the universal connector pipeline. Do not access or modify it.
Figure 2. Logstash pipeline
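The three-stage pipeline above can be sketched as a Logstash configuration file. This is a minimal, hypothetical example: the port and filter pattern are illustrative placeholders, not the configuration of any specific Guardium plug-in package.

```conf
# Hypothetical universal connector pipeline: input -> filter -> output.
input {
  # Push mode: receive events shipped by a data shipper such as Filebeat.
  beats {
    port => 5045
  }
}

filter {
  # Parse the raw event into fields that downstream stages can use.
  grok {
    match => { "message" => "%{TIMESTAMP_ISO8601:ts} %{GREEDYDATA:payload}" }
  }
  # A data-source-specific filter plug-in would normally run here to
  # convert the parsed event into the standard Guardium record format.
}

output {
  # The output plug-in that forwards records to the Guardium sniffer is
  # an internal component; users do not configure or modify it.
}
```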

Enabling audit log collection

Guardium supports several types of plug-ins to enable audit log forwarding from various data sources, which include Guardium-supported databases that are hosted on cloud or on-premises data lake platforms.

  1. Use the preinstalled plug-in packages, which require minimal configuration on the client's end: plug suitable values into the input and filter sections of their respective template configuration files. If a more complex parsing method is necessary, you can also add a Ruby code subsection to the filter section, to run as a preprocessing stage before the respective filter plug-in executes. For details, refer to the plug-in readme file in Available Plug-ins.

Remember:
  1. The predefined and preinstalled plug-ins do not require any manual uploads or other prerequisites on the user's end, as opposed to custom-made plug-ins or other available Logstash plug-ins. You can use a ready-made template to plug values into the input and filter sections of their respective configuration files, expand these sections by using preinstalled Logstash plug-ins, or write your own Ruby code parser that uses the Ruby filter plug-in as a preprocessing stage before the filter plug-ins run.
  2. You can use one of the input plug-ins in the repository and modify the input section of its configuration file. If the existing input plug-ins are insufficient for your needs, you can add a new one.
  3. You can configure either pull or push delivery through the messaging middleware service that is installed on the data lake platform and used by the input plug-in. In pull mode, the universal connector instance initiates requests to the remote service to retrieve messages. In push mode, the remote service initiates requests to the universal connector instance to deliver messages.
  4. The specific audit log types that are transmitted to the universal connector from the data source are configurable through the SQL instance settings on the data lake platform. This can vary depending on the installed data lake platform native plug-ins and the messaging middleware service that is used.
  5. For some data lake platforms, you can define inclusion and exclusion filters for the events that are routed to the universal connector and ingested by the input plug-in. This can result in more efficient filtering, implemented either as part of the filter scope in the connector's configuration file or in the developed filter plug-in.
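As an illustration of the Ruby preprocessing stage mentioned above, a Ruby code subsection can be added to the filter section before the main filter plug-in. The field name and exclusion logic below are hypothetical; they only sketch the pattern of dropping unwanted events early.

```conf
filter {
  # Preprocessing stage: hypothetical Ruby snippet that runs before the
  # data-source filter plug-in. Here it drops internal health-check events.
  ruby {
    code => '
      msg = event.get("message")
      event.cancel if msg && msg.include?("healthcheck")
    '
  }
  # The respective data-source filter plug-in would follow here.
}
```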

Enabling load balancing and fail-over

When you use the built-in load-balancing features of Guardium Data Protection, you might inadvertently distribute the entire set of received events to each Guardium instance, which can result in duplicated and redundant event processing. To prevent this default behavior and help ensure efficient operation, configure these mechanisms as part of the input scope in the configuration file of the installed connector. You can achieve this configuration through both pull (Pub/Sub, JDBC, SQS) and push (Filebeat) methods. When you use the push method in Guardium Data Protection, you must configure the entire set of collectors as part of the input scope. For more information about each plug-in, see Available Plug-ins.
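For the push (Filebeat) method, listing the set of collectors on the data shipper side can look like the following sketch. The host names are placeholders; `loadbalance: true` is a standard Filebeat option that spreads events across the listed hosts instead of sending every event to each one.

```yaml
# Hypothetical Filebeat output section: ship events to several Guardium
# collectors and balance the load among them.
output.logstash:
  hosts: ["collector1.example.com:5045", "collector2.example.com:5045"]
  loadbalance: true
```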

How does Universal Connector 2.0 work?

Universal Connector 2.0 is a Kafka Connect pipeline that consists of the following plug-ins:
  • Source Connector: This plug-in loads events from the data source, normalizes them, and produces them to a Kafka topic.
  • Sink Connector: This plug-in processes the events that the Source Connector captures. It parses, filters, transforms, and enriches the events into a Guardium-compatible format before transmitting them to Guardium Collectors for policy enforcement and analytics.
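The two Kafka Connect stages can be sketched as connector configurations submitted to a Connect worker, each in its own properties file. The connector class names, topic, and settings below are illustrative placeholders, not the actual Guardium plug-in classes.

```conf
# Hypothetical source connector (own file): reads data source events,
# normalizes them, and produces them to a Kafka topic.
name=datasource-source-connector
connector.class=com.example.DatasourceSourceConnector
tasks.max=2
topic=audit-events

# Hypothetical sink connector (own file): consumes the topic, transforms
# events into a Guardium-compatible format, and sends them to collectors.
name=guardium-sink-connector
connector.class=com.example.GuardiumSinkConnector
tasks.max=2
topics=audit-events
```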

Universal Connector 2.0 failover and load balancing

For Universal Connector 2.0, create a Kafka cluster for Guardium Data Protection version 12.1 with appliance bundle 12.0p120 or later.
  • To help ensure load balancing and failover, create a Kafka cluster with the number of Kafka nodes that your specific use case requires. When one node encounters issues, the remaining nodes help balance the load.
  • For optimal failover, install each data source profile on at least two collectors. The load-balancing mechanism relies on session details: messages from the same session are processed on the same collector.
  • If a data source profile is deployed on multiple collectors and all of them are unavailable for longer than the defined Kafka retention period, data loss is possible.

Configure the Guardium universal connector end-to-end by using legacy workflow

This is a quick overview of the steps to configure the Guardium universal connector end-to-end. You must have the S-TAP Management Application role permission.

  1. Allocate Guardium collectors to receive the audit logs.
  2. For the data source types that Guardium supports, do the following.
    1. Configure the native audit logs on the data source in a format that Guardium can parse, and then configure the data shipper to forward the audit logs to the Guardium universal connector.
    2. Configure the Guardium universal connector to read the native audit logs. For more information, see the Configuring a universal connector topic.
      Note: If you are using secrets or sensitive information in the configuration, see the Creating and managing secrets topic before you configure a new connector.
  3. Configure the preinstalled plug-ins. For more information, see Available Plug-ins.
  4. Enable the universal connector on the designated Guardium collectors or stand-alone machine. For more information, see the Enabling universal connector on collectors topic.

For more information, see Configuring a universal connector by using the legacy workflow.

Configure the Guardium universal connector end-to-end by using central manager workflow

12.1 and later
  1. Allocate Guardium collectors to receive the audit logs.
  2. For the data sources that Guardium supports, configure the native audit logs on the data source in a format that Guardium can parse, and then configure the data shipper to forward the audit logs to the Guardium universal connector.
For more information, see Configuring universal connectors by using a central manager.

Limitations

Note: Limitations that are associated with specific data sources are described in the universal connector plug-in readme files for each data source. For more information, see Supported data sources.
  • When you configure universal connectors, use a new port for each new connection. For Filebeat, use only port numbers higher than 5000.
  • Use only the packages that are supplied by IBM. Do not use extra spaces in the title.
  • IPv6 support
    • The S3 SQS and S3 CloudWatch plug-ins are not supported on IPv6 Guardium systems.
    • The DynamoDB plug-in does not support IPv6.
  • Native MySQL plug-in
    • If the database commands are run by using the MySQL native client, the database name is not sent to Guardium.
    • When connected with this plug-in, queries for nonexistent tables are not logged to GDM_CONSTRUCT.
  • MongoDB plug-ins do not send the client source program to Guardium.
  • Do not configure more than 10 universal connectors on a single Guardium collector.