Privacy notice: IBM Security Guardium Analyzer General Data Protection Regulation (GDPR) compliance

GDPR-relevant data processing categories

To fulfill General Data Protection Regulation (GDPR) requirements, IBM Security Guardium Analyzer must publish its GDPR-relevant data processing categories. This section outlines the main items that could contain GDPR-relevant data, including personal and sensitive personal data, along with their processing categories.

IBMid user information

When a user registers for or logs in to the service, Guardium Analyzer retrieves the user's IBMid user ID (an email address), first name, and last name. The categories of processing that take place on this GDPR-relevant data are read and long-term storage.

Data source definition

When connecting a data source to Guardium Analyzer, you use IBM Security Guardium Data Connector to create a data source definition for the database. Guardium Data Connector is a small tool that is installed locally and that supplies scan results to Guardium Analyzer on the cloud. When defining data sources in Guardium Data Connector, the data source user name, password, and data source information are required so that Guardium Analyzer can perform future scans. The categories of processing that take place on this GDPR-relevant data are read, modification, and long-term storage. Modification of this data will only take place when the user modifies the data source definition in the Guardium Data Connector.

Guardium Data Connector scans

When the data connector scans a data source, queries are run against the data source for the purpose of locating GDPR-relevant data. The only category of processing that takes place on this GDPR-relevant data is read. After the data connector reads the personal and sensitive personal data to determine if it is subject to GDPR, the data is immediately discarded. No GDPR-relevant data is processed in or transmitted to the Guardium Analyzer cloud service. The processing only takes place within the confines of the data connector and data source – and the data is not stored.

Support/Operations processes

GDPR-relevant data handling

All Guardium Analyzer development team members have received GDPR education and training, and they are well versed in all GDPR data governance policies. All development processes and architectures comply with GDPR guidelines.

When supporting the offering, there may come a time when support engineers and developers will need to view GDPR-relevant data as a part of the troubleshooting process. However, these practices will only happen on an as-needed basis. While reviewing transactions and data, engineers will only view the data specific to the subscriber who reported the issue.

Change requests in GDPR-relevant data processing

This document outlines the GDPR-relevant data and types (or categories) of data that Guardium Analyzer captures, as well as how that data is processed, transmitted, and deleted. GDPR policies will require us to notify users of any changes in the data processing that is performed in the offering. The Guardium Analyzer team will notify all subscribers via email when Guardium Analyzer data processing changes. In addition, the Guardium Analyzer GDPR documentation will be changed to reflect any new details. Subscribers may choose to cancel their subscription if they do not agree with the data processing changes, or if they have requirements that will no longer be met by the changes.

Tenant onboarding

Guardium Analyzer is a multi-tenant software as a service (SaaS) offering. The tenant experience begins with an onboarding process in which the tenant is registered to use the offering. A dedicated environment is then allocated for the tenant.

Tenant/User registration

The onboarding process starts on IBM Marketplace, where Guardium Analyzer and its subscription plans are published. Upon the selection of the desired subscription plan (for example, a free plan), the prospective tenant is asked to provide their IBMid, which is an email address. This IBMid is used to uniquely identify the environment that will be dedicated to the tenant. The tenant's IBMid, first and last name (derived from their IBMid), and subscription plan details (for example, free plan or paying plan) constitute the metadata that Guardium Analyzer keeps about its tenants. This registered tenant can invite other users from their organization to join their tenant environment. This is easily accomplished by providing the IBMid of the other users.

Tenant environment provisioning

The environment that is provisioned for each tenant consists of two key components:

  • The set of tenant metadata. This includes the IBMid list, the first and last name of all users that have been registered under the tenant subscription (and if they are a subscription administrator or a regular user), and the tenant subscription plan details. Each time a tenant downloads a data connector and registers it, the unique identifier, name, description, and internet protocol (IP) address of the data connector are also added to the tenant metadata. During a steady state of operations, this unique identifier allows Guardium Analyzer to know which tenant it is dealing with when a data connector uploads its scan results.
  • A unique set of IBM Cloudant databases dedicated to the tenant. This is where all of the scan results that are sent by the data connector are stored. For each data source scanned by a data connector, the scan results consist of two pieces of information:
    • The names of the tables and columns in which GDPR-relevant data has been found within the data source (note that this is not the actual personal or sensitive personal data - but rather the names of the locations that house it).
    • The list of data source vulnerabilities that were found when the data source was scanned.

    The Guardium Analyzer cloud service uses these two pieces of information to calculate a risk score for each data source, thus giving the tenant insight into the risk across all their data sources (ordered from the data source that may be most likely to be at risk to the data source that may be least likely to be at risk). In addition to the scan results for each data source, Guardium Analyzer also receives and stores the first name, last name, and email address of each data source owner. This information is captured when the tenant defines a new data source in the data connector and is automatically sent to Guardium Analyzer.
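As a rough illustration of the scan results described above, the following Python sketch models the two pieces of information uploaded for each data source and orders data sources by a score. The class name, field names, and scoring function are hypothetical. This document does not describe the actual Guardium Analyzer schema or risk score formula, so the score below is a placeholder that only demonstrates the ordering from most to least at risk.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class ScanResult:
    """Hypothetical shape of the scan results uploaded for one data source."""
    data_source_name: str
    # (table, column) names where GDPR-relevant data was found, never the data itself
    sensitive_locations: List[Tuple[str, str]] = field(default_factory=list)
    # Identifiers of the vulnerabilities found during the scan
    vulnerabilities: List[str] = field(default_factory=list)


def placeholder_risk_score(result: ScanResult) -> int:
    """Placeholder only: the real scoring formula is not documented here."""
    return len(result.sensitive_locations) + len(result.vulnerabilities)


def order_by_risk(results: List[ScanResult]) -> List[ScanResult]:
    """Order data sources from highest to lowest placeholder score."""
    return sorted(results, key=placeholder_risk_score, reverse=True)


results = [
    ScanResult("hr_db", [("EMPLOYEES", "EMAIL")], []),
    ScanResult("crm_db", [("CONTACTS", "NAME"), ("CONTACTS", "PHONE")], ["VULN-EXAMPLE-1"]),
]
ranked = [r.data_source_name for r in order_by_risk(results)]
```

Note that the structure carries only location names and vulnerability identifiers, which mirrors the statement above that no personal or sensitive personal data is part of the scan results.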

Accessing the Guardium Analyzer cloud service

To access Guardium Analyzer, a user must go through two levels of control: authentication and authorization.

Tenant/User identification and authentication (IBMid)

During this step, the user must prove that they are who they claim to be. To accomplish this, Guardium Analyzer redirects the user to IBMid to complete the authentication process. If the credentials are not valid, access is rejected.

Tenant/User authorization (logical data segregation)

If the credentials are valid, the user is redirected back to Guardium Analyzer, where the second level of control, authorization, takes place. During this step, Guardium Analyzer consults its metadata (described above) to ensure that the user is already registered to use the service. If this is not the case, access is rejected. Otherwise, access is granted. From this point on, all insight provided to the tenant user is derived from the specific environment dedicated to that tenant. In other words, all queries are directed to the environment dedicated to that tenant. This logical tenant separation supplements the physical tenant separation at the IBM Cloudant database level and ensures that no tenant is inadvertently given another tenant's insight information.
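The authorization step and the tenant-scoped routing described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the service's implementation: the metadata layout, the tenant identifiers, and the database naming scheme are all hypothetical.

```python
from typing import Dict, Optional, Set

# Hypothetical tenant metadata: tenant ID -> registered IBMids (email addresses)
TENANT_USERS: Dict[str, Set[str]] = {
    "tenant-a": {"alice@example.com"},
    "tenant-b": {"bob@example.com"},
}


def authorize(ibmid: str) -> Optional[str]:
    """Return the tenant that the authenticated user is registered under, or None."""
    for tenant_id, users in TENANT_USERS.items():
        if ibmid in users:
            return tenant_id
    return None


def database_for(ibmid: str) -> str:
    """Resolve the tenant-dedicated database name and reject unregistered users."""
    tenant_id = authorize(ibmid)
    if tenant_id is None:
        raise PermissionError("user is not registered to use the service")
    # Hypothetical naming scheme: every query is scoped to this database only
    return f"scan-results-{tenant_id}"
```

Because the database name is derived from the tenant that the user is registered under, and never from a value the user supplies, a user of one tenant cannot direct a query at another tenant's environment in this sketch.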

Guardium Data Connector setup

Data Connector installation

Security guidelines

When planning to install a data connector, it is important to consider the security implications of the installation location. The data connector is an extension of Guardium Analyzer that users are responsible for installing in their own environment. Because of this, Guardium Analyzer has no control over the installation environment.

When installing the data connector, users should ensure that the network in which the data connector server resides has sufficient protection. The data connector is meant to be a communication intermediary that can reach data sources for scanning and can also communicate with the internet, because it must be able to reach Guardium Analyzer on the cloud. It is important to exercise caution to ensure that firewall rules, VLANs, and Active Directory permissions are properly set up so that only those users who need access to the data connector (and the host on which it is installed) have that access.

Data Connector access

To access the data connector, the user must have network (or VPN) access to the server that the data connector is installed on. The user can remotely access the server and use a browser that is on the server - or the user can use a local browser to remotely access the server. The method of access depends on the security setup in the environment.

When connecting to the data connector user interface (this is the primary method of accessing data connector functionality), the service will establish a Transport Layer Security (TLS) connection over HTTPS with a self-signed certificate. This means that any GDPR-relevant data that flows between the browser and the data connector will be encrypted. In addition, the user must be authenticated and authorized to access the data connector user interface. The user authenticates with their IBMid and then is authorized via Guardium Analyzer (they must be a valid user of the subscription that the data connector is registered to).

The Guardium Analyzer cloud portion will never reach out directly to the data connector. For any communication to occur, only the data connector can initiate a connection with Guardium Analyzer.

GDPR-relevant data flow

The diagrams and supporting text in this section describe the flow of GDPR-relevant data in the data connector. This involves the logical and physical flows of data connector registration and logging in to the data connector user interface.

Logical Flow in Data Connector Registration

This diagram shows a conceptual flow of GDPR-relevant data within the offering when the data connector is being registered. When a user first visits the data connector user interface, they are directed to IBMid to log in. The user enters their email on that page, which then allows the transmission of the email address and other authentication information to the Guardium Analyzer offering. Guardium Analyzer then verifies that this user has a subscription, by checking against its IBM Cloudant instances (all of which exhibit disk encryption and segregation). Guardium Analyzer and the data connector then exchange the email address, first name, and last name of the user to make that Guardium Analyzer user the owner of the data connector and to generate unique non-GDPR-relevant data credentials for that data connector to use for all future communications with Guardium Analyzer.

Physical Flow in Data Connector Registration

This diagram is a physical representation of the data connector registration data flow. As mentioned previously, the user must register when first accessing the data connector user interface. This process sends the user to IBMid to log in. When the user enters their credentials in the browser, the credentials flow to an IBMid data center. After authentication, IBMid returns authentication information and the IBMid email address to the browser, which in turn sends the authentication information and the IBMid email address to a Guardium Analyzer data center. Guardium Analyzer reads from a supporting IBM Cloudant cluster to verify the user information and subscription. It retrieves the email address, first name, and last name of the user from Cloudant, and it responds with that information to the data connector server.

Logical Flow in Data Connector User Interface Login

This diagram shows the conceptual flow of GDPR-relevant data when a user logs in to the data connector user interface. Similar to registration, the user is redirected in their browser to IBMid to log in. The user's email address and other authentication information are sent to Guardium Analyzer. Guardium Analyzer then checks the user and subscription information in Cloudant. Guardium Analyzer then responds to the data connector with non-GDPR-relevant data (session token and other authentication information).

Physical Flow in Data Connector User Interface Login

This diagram shows a physical representation of the flow of GDPR-relevant data when a user logs in to the data connector user interface. As mentioned, the user is redirected to IBMid, where they enter their credentials into the browser. This data is then transmitted to an IBMid data center, which then responds with the user's email address and other authentication information. The browser then forwards the email address and other authentication information to a Guardium Analyzer data center. Guardium Analyzer, in turn, will read from a Cloudant cluster to retrieve user and subscription information for user validation. Guardium Analyzer then responds to the data connector server with authorization and session information (both non-GDPR-relevant data).

Adding data sources to the data connector

GDPR-relevant data captured from the user

When adding a data source definition to a data connector, the user must enter information for connecting to the data source that will be scanned. This information includes connection information (for example, IP address and ports), instance name, and data source user credentials for logging in to the data source. This GDPR-relevant data (data source user) is required for scans of the data source.

Storage of the data source definition

When the user adds the data source information in the browser, the GDPR-relevant data is transmitted via encrypted HTTPS. When the data connector processes the GDPR-relevant data, it stores the information in an encrypted data source that is local to the data connector.

GDPR-relevant data flow

Logical Flow in Data Connector Data Source Additions

This diagram illustrates the GDPR-relevant data flow when a user adds a data source to a data connector. A data source definition contains information, such as data source name, data source instance name, connection information, and data source user credentials to connect. After a data source is added to the data connector, the data connector transmits the data source name and the information (email address, first name, and last name) of the Guardium Analyzer user who added the data source to Guardium Analyzer. This results in the user becoming the owner of the data source that scan results will be attributed to. Guardium Analyzer then stores the data source name and owner (email address, first name, and last name) in Cloudant.

Physical Flow in Data Connector Data Source Additions

This diagram is a visualization of the physical flow of GDPR-relevant data when a data source is added to a data connector. The user adds a data source via the data connector user interface by typing the information into the browser. This information includes the data source name, data source instance name, connection information, and user credentials for connecting to the data source. The GDPR-relevant data is then transmitted to the data connector server. If the browser that is being used to add a data source is on the data connector server, then the transmission is local. The data connector then stores the data source definition in its local encrypted database. The data connector transmits the data source name and information about the user who added the data source (email address, first name, and last name) to the Guardium Analyzer data center. This user becomes the owner of the data source that has been added to the data connector. Guardium Analyzer then stores the data source name and data source owner (email address, first name, and last name) in a Cloudant cluster.

Logging

Guardium Data Connector

Guardium Data Connector includes a log file that tracks errors and warnings related to the agent, scans, and connections to Guardium Analyzer. The file is SecureConnector\Logs\secure-connector.log in your Guardium Data Connector installation directory. The default location of this file is C:\Program Files\IBM\Guardium Data Connector\SecureConnector\Logs\secure-connector.log.
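If the data connector is installed somewhere other than the default directory, the log file location follows from the relative path given above. The short Python sketch below shows that resolution. The function name and the example custom directory are hypothetical, and the paths are the ones stated in this section.

```python
from pathlib import PureWindowsPath

# Default installation directory, as stated above
DEFAULT_INSTALL_DIR = PureWindowsPath(r"C:\Program Files\IBM\Guardium Data Connector")
# Log file location relative to the installation directory, as stated above
LOG_RELATIVE_PATH = PureWindowsPath(r"SecureConnector\Logs\secure-connector.log")


def log_path(install_dir: PureWindowsPath = DEFAULT_INSTALL_DIR) -> PureWindowsPath:
    """Return the full log file path for a given installation directory."""
    return install_dir / LOG_RELATIVE_PATH


default_log = str(log_path())
# Hypothetical custom installation directory, used only for illustration
custom_log = str(log_path(PureWindowsPath(r"D:\Tools\GDC")))
```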

Guardium Analyzer

Due to the nature of cloud or software as a service (SaaS) offerings, logs are not readily available to users. If you require log files for your cloud account, please open a service request (PMR) with your IBM support representative.

Steady state

When a tenant has been registered and has installed one or more data connectors within the confines of their own environment, the tenant has entered into a steady state of operations. This section describes the main aspects of this steady state.

Data source scanning

When the tenant has configured a data connector, the connector periodically scans the desired data sources according to a schedule that is set by the tenant. In this regard, the data connector can be regarded as a Java application that runs SQL queries against the desired data sources. As such, samplings of the data source content are retrieved into the data connector through those SQL queries. This content retrieval is required for the data connector to determine whether or not the given data source contains GDPR-relevant data. This processing happens while the data source content is held in main memory. The content is never persisted to disk.

When your data sources are scanned, Guardium Data Connector persists the results in its local database. The results are metadata about the tables and columns that contain the personal and sensitive personal data, not the data itself. This metadata also includes the list of vulnerabilities found in the data source (if vulnerabilities exist). The results are kept within the data connector only until they are uploaded to Guardium Analyzer on the cloud. After this, they are discarded from the data connector's local database.
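The scanning approach described above, in which content is sampled into memory and only table and column names are retained, can be sketched as follows. This is an illustration under stated assumptions and not the data connector's code: the connector is described as a Java application, whereas this sketch uses Python with an in-memory SQLite database standing in for the data source, and the email-address pattern is a hypothetical, deliberately simple detector.

```python
import re
import sqlite3
from typing import List, Tuple

# Hypothetical, deliberately simple detector for email-like values
EMAIL_PATTERN = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def scan_table(conn: sqlite3.Connection, table: str, sample_size: int = 100) -> List[Tuple[str, str]]:
    """Return (table, column) names whose sampled values look like email addresses."""
    # The table name is interpolated for brevity; it comes from the caller, not from user input
    cursor = conn.execute(f'SELECT * FROM "{table}" LIMIT ?', (sample_size,))
    columns = [description[0] for description in cursor.description]
    rows = cursor.fetchall()  # the sampled content is held only in memory
    findings = []
    for index, column in enumerate(columns):
        if any(isinstance(row[index], str) and EMAIL_PATTERN.match(row[index]) for row in rows):
            findings.append((table, column))
    del rows  # drop the sampled content; only the names are returned
    return findings


# Stand-in data source with one column that holds personal data
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER, contact TEXT, city TEXT)")
conn.execute("INSERT INTO customers VALUES (1, 'jane@example.com', 'Frankfurt')")
findings = scan_table(conn, "customers")
```

The value returned by scan_table contains the names of the locations only, which corresponds to the metadata that the data connector persists and later uploads.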

Communication between the data connector and Guardium Analyzer

Besides performing the data source scanning, the other key responsibility of the data connector is uploading scan results to Guardium Analyzer. Even though the uploaded material is only metadata representing the results of the data source scanning, the following measures are always enforced.

  • Initiation of the communication: Any communication with Guardium Analyzer is always initiated by the data connector. The data connector itself does not accept incoming requests from the outside world.
  • Encryption of the communication: The communication between the data connector and Guardium Analyzer is always protected with Transport Layer Security (TLS), which is the standard for protecting communications.

In Guardium Analyzer, the communication is authenticated and the tenant associated with the data connector is identified so that the uploaded results are stored in Cloudant databases specifically dedicated to that tenant.
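As an illustration of the client-initiated, TLS-protected communication described above, the following Python sketch builds a client-side TLS context that verifies the server certificate and host name. It is a sketch only: the data connector is described as a Java application, the function name is hypothetical, and the minimum protocol version is an assumption that this document does not state.

```python
import ssl


def outbound_tls_context() -> ssl.SSLContext:
    """Build a client-side TLS context for connections that the client initiates."""
    # The default client context verifies the server certificate and host name
    context = ssl.create_default_context()
    # Assumption for illustration: refuse protocol versions older than TLS 1.2
    context.minimum_version = ssl.TLSVersion.TLSv1_2
    return context


context = outbound_tls_context()
```

A context like this is used only for outbound connections. Nothing in the sketch opens a listening socket, which matches the statement that the data connector does not accept incoming requests from the outside world.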

Adding and removing data source definitions in the data connector

The tenant is responsible for defining the data sources to be scanned by the data connector. Because the data connector is analogous to a Java application making JDBC requests to a data source, the key attributes required when defining a data source are similar to those required by this type of application. That is, the definition needs to include the data source name, the user ID and password to connect to that data source, as well as the data source location (IP address and port number). This information is stored locally within the data connector and is used to perform data source scanning.

At their own discretion, the tenant may choose to remove one or more data source definitions from the data connector, at which time all key data source attributes described above will automatically be deleted from the data connector's local database.
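The attributes of a data source definition, and the effect of removing one, can be sketched as follows. The class and field names are hypothetical, and the in-memory dictionary is only a stand-in for the data connector's local encrypted database.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class DataSourceDefinition:
    """Key attributes listed above; the field names are hypothetical."""
    name: str
    host: str      # IP address of the data source
    port: int
    user_id: str
    password: str  # kept in an encrypted local store by the data connector; plain here for brevity


class LocalDefinitionStore:
    """In-memory stand-in for the data connector's local encrypted database."""

    def __init__(self) -> None:
        self._definitions: Dict[str, DataSourceDefinition] = {}

    def add(self, definition: DataSourceDefinition) -> None:
        self._definitions[definition.name] = definition

    def remove(self, name: str) -> None:
        # Removing a definition deletes all of its attributes, credentials included
        self._definitions.pop(name, None)

    def names(self) -> List[str]:
        return sorted(self._definitions)


store = LocalDefinitionStore()
store.add(DataSourceDefinition("hr_db", "10.0.0.5", 50000, "scanner", "example-password"))
names_before = store.names()
store.remove("hr_db")
names_after = store.names()
```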

Tenant offboarding

If a tenant unsubscribes from Guardium Analyzer, they will automatically be offboarded. This means that all traces of this tenant in Guardium Analyzer will be destroyed. More specifically, their tenant metadata and dedicated Cloudant databases discussed earlier will be destroyed.

The data connector portions of the service will be the responsibility of the tenant, as they are owned and managed by them in the confines of their own environment. It is recommended that the tenant uninstall any data connectors they have upon leaving the Guardium Analyzer service.

Geographical implications

Guardium Analyzer is physically located in an IBM data center in Frankfurt, Germany. This is the primary data center. There is also a secondary IBM data center in Amsterdam, The Netherlands (for business continuity purposes). IBM Marketplace includes a rich suite of compliance certifications, including ISO 27001, SOC1, and SOC2. The full suite can be found at https://www.ibm.com/cloud/compliance.

The tenant owns and manages the data connector component in the confines of their own environment. The tenant needs to understand that the data connector accesses their data sources much like a JDBC application does. In this regard, samplings of the data source content are retrieved through SQL queries by the data connector. Even though the data connector never persists this data source content, the data still travels from the data source to the data connector. Tenants who choose to deploy a data connector outside of Europe to access a data source located in Europe should be aware of this data transfer. It is recommended that data connectors be geographically co-located with the data sources that they will scan.