Secure Engineering Practices

IBM Manta Data Lineage follows IBM Security and Privacy by Design (SPbD). SPbD at IBM is a set of focused security and privacy practices, including vulnerability management, threat modeling, penetration testing, privacy assessments, security testing, and patch management. For more information about the IBM Secure Engineering Framework (SEF) and SPbD, see the following resources:

Authentication and authorization

User Roles

The application uses several user roles to authorize specific operations within the metadata repository. The roles are as follows:

ROLE_USER — basic role needed for all secured pages, which by default is all pages
ROLE_EXPORTER — exports data from the repository
ROLE_LIGHTHOUSE_READ — scans the metadata repository and searches for particular patterns or evaluates rules
ROLE_MERGER — executes data-modifying operations such as merging objects, truncating databases, and propagating edges
ROLE_VIEWER_DATAFLOW — executes operations for the visualization of data flow
ROLE_VIEWER_CATALOG — explores and searches the metadata catalog
ROLE_USAGE — exports and cleans server-usage metadata
ROLE_REPOSITORY_READ — executes data-reading operations via repository API
ROLE_REPOSITORY_WRITE — executes data-modifying operations via repository API
ROLE_REPOSITORY_EVALUATE — executes special data-modifying operations based on DSL-script evaluation via repository API

Additional roles can be created simply by using them; there is no need to define a role explicitly in Manta before assigning it. Creating new roles is only useful for the feature described in the section Access Rights for the Metadata Repository.

Access Rights for the Metadata Repository

It is possible to define access rights to the metadata repository. This means that some parts of the metadata repository may only be visible to particular users.

Enabling and Disabling Access Rights

To enable this feature, set the repository.permissions-enabled property to true under Configurations > Server > Common > Repository Configuration in Manta Admin UI. To disable this feature, set that property to false; the entire repository is then accessible to all users.

Defining Access Rights

If the access rights feature is enabled, it is possible to configure metadata repository permissions for Manta users. This is done on three levels:

Level	Location of definition	Description
Repository view configuration	Manta	A repository view is any part of the repository. It is defined as a set of included and excluded repository subtrees.
Assigning repository views to user roles	Manta	Each role has a set of repository views assigned to it that are accessible to users with this role. It is possible to assign no view (i.e. no part of the repository is accessible), as well as the entire repository. One view can be assigned to more than one role.
Assigning users to user roles	External systems (e.g., LDAP) or Manta	Each user can be assigned one or more roles and vice versa. There is no need to define the roles explicitly in Manta.

Repository View Configuration

Repository views can be configured under Configurations > Server > Common > Repository Views in Manta Admin UI. Each row is a record that either includes repository objects in a view or excludes them from it. The record fields are as follows:

Repository view: Name of the repository view the record applies to.

Type: Type of record. The value must be either INCLUDE or EXCLUDE. INCLUDE records add the affected objects to the view; EXCLUDE records remove them from it. Exclusions take precedence over inclusions. If only EXCLUDE records are defined for the view, the rest of the repository is considered to be included in the view.

Affected objects: Case-insensitive regular expressions of repository path entries separated by slashes (/). A repository object is affected if and only if:
1. each entry of its path matches the corresponding regular expression, or
2. any of its ancestors fulfills point 1.
The object resource is the first object path entry.
Special cases:
• To enclose a path entry in double quotes ("), use the \" sequence.
• To use double quotes as part of the path entry, use the \"\" sequence.
• To use a backslash (\) as part of the path entry, use the \\ sequence. Note that a backslash in an object's path needs to be escaped twice — once for the CSV file and another time for the regex.
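
The matching rules for view records can be sketched in a few lines of Python. This is an illustrative model only, not Manta's implementation. The function names are hypothetical. Each regular expression is assumed to match a whole path entry, the quoting and escaping special cases are ignored, and a view without INCLUDE records is assumed to contain everything that is not excluded.

```python
import re
from typing import Iterable, List, Tuple


def record_affects(pattern: str, path: List[str]) -> bool:
    """Return True if the object at `path`, or one of its ancestors, matches `pattern`.

    `pattern` is a slash-separated list of regular expressions, one per path entry.
    Assumption: each regex must match its entire path entry (case-insensitively).
    """
    regexes = pattern.split("/")
    if len(path) < len(regexes):
        return False  # the object sits above the level the pattern describes
    # Comparing only the first len(regexes) entries covers the "ancestor" rule:
    # a descendant is affected when its ancestor at that depth matches.
    return all(
        re.fullmatch(regex, entry, re.IGNORECASE)
        for regex, entry in zip(regexes, path)
    )


def in_view(records: Iterable[Tuple[str, str]], path: List[str]) -> bool:
    """Decide whether `path` belongs to a view given its (type, pattern) records."""
    records = list(records)
    excludes = [p for kind, p in records if kind == "EXCLUDE"]
    includes = [p for kind, p in records if kind == "INCLUDE"]
    if any(record_affects(p, path) for p in excludes):
        return False  # exclusions win over inclusions
    if not includes:
        return True  # only EXCLUDE records: the rest of the repository is included
    return any(record_affects(p, path) for p in includes)
```

With the hypothetical records INCLUDE Oracle/DWH and EXCLUDE Oracle/DWH/HR.*, an object under oracle/dwh/sales falls inside the view, while anything under Oracle/DWH/HR_PRIVATE does not.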

Assigning Repository Views to User Roles

The assignment of repository views to user roles can be configured under Configurations > Server > Common > Repository Views Permissions in Manta Admin UI. Each row is a record of a view-to-role assignment. The record fields are as follows:

Role	The user role the record applies to; must be unique within the file
Repository views	Comma-separated list of views accessible to the role; if set to *, users with this role have access to the entire repository
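
As a rough illustration of how these records combine with a user's roles, the following Python sketch builds a role-to-views mapping and resolves the views for a user. It is a model under stated assumptions, not Manta's implementation: the function names, role names, and view names are hypothetical, a user with several roles is assumed to see the union of their views, a role without a record is assumed to grant no views, and the special value that grants the entire repository is not modeled.

```python
from typing import Dict, Iterable, Set, Tuple


def load_assignments(rows: Iterable[Tuple[str, str]]) -> Dict[str, Set[str]]:
    """Build a role -> views mapping from (role, comma-separated views) records."""
    assignments: Dict[str, Set[str]] = {}
    for role, views in rows:
        if role in assignments:
            # The role field must be unique within the file.
            raise ValueError(f"role {role!r} appears more than once")
        assignments[role] = {v.strip() for v in views.split(",") if v.strip()}
    return assignments


def views_for_user(user_roles: Iterable[str],
                   assignments: Dict[str, Set[str]]) -> Set[str]:
    """Return the views reachable through any of the user's roles (assumed union)."""
    views: Set[str] = set()
    for role in user_roles:
        views |= assignments.get(role, set())
    return views
```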

Applying the Changes

To apply changes to the preceding CSV configuration files, restart the Manta Server or send an HTTP GET request to http://<server_name>:<port>/manta-dataflow-server/api/refresh, where <server_name> and <port> are provided by your application administrator. If the repository.permissions-enabled property has been changed, a Manta Server restart is necessary.
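
The refresh request can also be sent from a script. The following Python sketch uses only the standard library; the function names and the example host are hypothetical. Whether the endpoint requires authentication is not stated here, so the sketch sends a plain GET request and would need an Authorization header if your server requires one.

```python
import urllib.request


def refresh_url(server_name: str, port: int) -> str:
    """Build the refresh endpoint URL from the server name and port."""
    return f"http://{server_name}:{port}/manta-dataflow-server/api/refresh"


def refresh_configuration(server_name: str, port: int, timeout: float = 30.0) -> int:
    """Send the HTTP GET request and return the HTTP status code of the response."""
    with urllib.request.urlopen(refresh_url(server_name, port), timeout=timeout) as response:
        return response.status
```

For example, refresh_configuration("manta.example.com", 8080) targets a hypothetical server at manta.example.com on port 8080.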

Tokens and API keys

Step 1: Create an API Client Configuration

To set up a client of the Keycloak server to obtain the tokens used when communicating with the Manta API:

Log in to your Keycloak instance.
Go to the Clients section.
Click Create Client.
Under Client Type select OpenID Connect.
Enter a Client ID (for example: “api-client”). This ID is sent when you obtain the token from the Keycloak server. Your client ID must be unique in the realm.
Enter a Name for the client.
Click Next.
Turn the Client Authentication toggle to On.
Select Service Accounts Roles next to Authentication Flow.
Click Next.
Set Valid Redirect URIs to *, Valid Post Logout Redirect URIs to +, and Web Origins to *.
Click Save.

Regenerate Secrets for the New Client

Open the client you just created, and go to the Credentials tab.
Ensure that the value next to Client Authenticator says Client Id and Secret.
Click Regenerate Secret, and then click Yes in the confirmation dialog.

Configure Account Roles

To enable access to Manta Data Lineage platform resources, you need to assign roles to the client.

Go to the client’s Service Account Roles tab.
Select the roles that are appropriate for your use case.
- Check Manta Flow Server Authentication and Authorization and User Roles Used in Manta Admin UI to see which roles are available.

Step 2: Request an Access Token for Authentication

The token needed to access the Manta API is obtained from the Keycloak server. The following examples use cURL as an HTTP client, but you can use any client. This guide assumes that the Keycloak server is accessible at the URL http://localhost:9090/auth. Adjust the URL according to your setup.

To get the token, you have to send an HTTP POST request to the following endpoint: http://localhost:9090/auth/realms/manta/protocol/openid-connect/token

The body of the request is formatted as application/x-www-form-urlencoded:

grant_type=client_credentials
client_id=<the client ID chosen when creating the client>
client_secret=<the secret regenerated on the client's Credentials tab>

In cURL syntax, the request can look as follows:

curl --location --request POST 'http://localhost:9090/auth/realms/manta/protocol/openid-connect/token' \
--header 'Content-Type: application/x-www-form-urlencoded' \
--data-urlencode 'grant_type=client_credentials' \
--data-urlencode 'client_secret=0kuGOk0Wma4kdFO1Kqf5E7Ht9ZqxU0cU' \
--data-urlencode 'client_id=api-client'

The request returns a JSON document that contains the access token, in JWT format, required to access the API. The response could look as follows:

{
    "access_token": "eyJhbGciOiJSUz.....CuAtGy7Hdw6D-6A",
    "expires_in": 18000,
    "refresh_expires_in": 0,
    "token_type": "Bearer",
    "not-before-policy": 1635950040,
    "scope": "email profile"
}

The access_token field contains the actual token needed to access the API. The refresh token is not included by design. For more details, see RFC 6749: The OAuth 2.0 Authorization Framework.
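
The same token request can be made from Python with only the standard library. This is a minimal sketch, not an official client: the function names are hypothetical, and the token URL is the example URL used in this guide, which must be adjusted to your setup.

```python
import json
import urllib.parse
import urllib.request

# Example endpoint from this guide; adjust host, port, and realm to your setup.
TOKEN_URL = "http://localhost:9090/auth/realms/manta/protocol/openid-connect/token"


def build_token_request(client_id: str, client_secret: str,
                        token_url: str = TOKEN_URL) -> urllib.request.Request:
    """Build the form-encoded POST request for the client credentials grant."""
    body = urllib.parse.urlencode({
        "grant_type": "client_credentials",
        "client_id": client_id,
        "client_secret": client_secret,
    }).encode("ascii")
    return urllib.request.Request(
        token_url,
        data=body,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )


def parse_token_response(raw: bytes) -> str:
    """Extract the access_token field from the JSON response body."""
    return json.loads(raw)["access_token"]


def fetch_access_token(client_id: str, client_secret: str) -> str:
    """Send the request to Keycloak and return the access token."""
    with urllib.request.urlopen(build_token_request(client_id, client_secret)) as response:
        return parse_token_response(response.read())
```

Keeping the request construction separate from the network call makes it possible to inspect or log the request (without the secret) before sending it.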

Accessing the Manta API

To access the Manta API, set the authorization header in the request to the value Bearer <token> where <token> is the value obtained from Keycloak. For example, to get a list of workflow templates from the Admin GUI server, you can use the following cURL command:

curl -H "Authorization: Bearer eyJhbGciOiJSUzI1NiIsI ..." \
http://localhost:8181/manta-admin-gui/public/process-manager/v1/workflow/templates

The access_token obtained from the Keycloak server is used to access the Admin GUI server. In this example, the token is truncated for clarity. In a real use case, you have to provide a complete token.
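
A Python equivalent of this call might look as follows. It is a sketch under assumptions: the function names are hypothetical, the URL is the example URL from this guide, and the response body is returned as raw bytes because its format is not described here.

```python
import urllib.request

# Example endpoint from this guide; adjust host and port to your setup.
TEMPLATES_URL = (
    "http://localhost:8181/manta-admin-gui/public/process-manager/v1/workflow/templates"
)


def build_api_request(url: str, access_token: str) -> urllib.request.Request:
    """Build a GET request that carries the token in the Authorization header."""
    return urllib.request.Request(
        url,
        headers={"Authorization": f"Bearer {access_token}"},
    )


def get_workflow_templates(access_token: str) -> bytes:
    """Call the workflow templates endpoint and return the raw response body."""
    with urllib.request.urlopen(build_api_request(TEMPLATES_URL, access_token)) as response:
        return response.read()
```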

Token Lifespan

In the previous example, you can see that the tokens have different lifespans (the expires_in and refresh_expires_in fields). Best practice is to keep the access_token lifespan short (minutes) and the refresh token lifespan long (hours or even days).
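
Because the client credentials response shown earlier contains no refresh token, a client with a short-lived access token has to request a new one when the old one expires. The following Python sketch shows one way to do that by caching the token for the duration reported in expires_in. The class name, the safety margin, and the fetch callback are all hypothetical; the callback is assumed to return the access token together with its expires_in value in seconds.

```python
import time
from typing import Callable, Optional, Tuple


class TokenCache:
    """Cache an access token and request a new one shortly before it expires."""

    def __init__(self,
                 fetch: Callable[[], Tuple[str, int]],
                 margin_seconds: int = 30,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._fetch = fetch            # returns (access_token, expires_in)
        self._margin = margin_seconds  # renew this many seconds before expiry
        self._clock = clock
        self._token: Optional[str] = None
        self._expires_at = 0.0

    def get(self) -> str:
        """Return a cached token, fetching a new one if none is cached or it is due."""
        now = self._clock()
        if self._token is None or now >= self._expires_at:
            token, expires_in = self._fetch()
            self._token = token
            self._expires_at = now + max(expires_in - self._margin, 0)
        return self._token
```

The clock parameter exists only so that the expiry logic can be exercised without waiting; in normal use the default monotonic clock is sufficient.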

The token lifespan is configured for the whole realm in the Sessions section of Realm Settings.

These are global settings that apply to all clients configured in the realm. Each client can then be further configured under Advanced Settings in the Client Details section. These settings override the global settings for the whole realm.

The refresh token lifespan will be equal to the smallest value among SSO Session Idle, Client Session Idle, SSO Session Max, and Client Session Max.
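
As a worked example of this rule, the following Python sketch computes the effective refresh token lifespan from the four settings. The function name is hypothetical, the values are assumed to be in seconds, and a value of 0 is assumed to mean "not set" and is therefore ignored.

```python
def refresh_token_lifespan(sso_session_idle: int,
                           client_session_idle: int,
                           sso_session_max: int,
                           client_session_max: int) -> int:
    """Return the smallest configured session timeout, in seconds."""
    configured = [
        value
        for value in (sso_session_idle, client_session_idle,
                      sso_session_max, client_session_max)
        if value > 0  # assumption: 0 means the setting is not configured
    ]
    if not configured:
        raise ValueError("at least one session timeout must be configured")
    return min(configured)
```

With an SSO Session Idle of 1800 seconds, an SSO Session Max of 36000 seconds, and a Client Session Idle of 600 seconds, the refresh token would live for 600 seconds.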

Encryption

Manta Data Lineage supports protection of data at rest and in motion.

Data

Data resides on customers' hard drives, and we recommend volume encryption to keep all data safe.

Communications

You can use TLS or SSL to encrypt communications to and from Manta Data Lineage.

FIPS

Manta Data Lineage supports FIPS (Federal Information Processing Standard) compliant encryption.

Using an allowlist to prevent SSRF attacks

In a Server-Side Request Forgery (SSRF) attack, an attacker can create requests from a vulnerable server. Typically, this happens when an application accepts URLs, IP addresses, or domain names from a user who has access to the server. The attacker can use this vulnerability to inject URLs with port details or with internal IP addresses, and then observe the internal network or cause the application to process malicious code.

The most robust way to avoid an SSRF attack is to set up an allowlist for the DNS name or IP address that your application needs to access. Alternatively, if you use a blocklist, it's important to validate the user input properly. For example, do not allow requests to private (nonroutable) IP addresses. This can be configured on Keycloak.
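
To illustrate the two approaches in general terms, the following Python sketch checks a user-supplied URL against an allowlist of host names and, separately, detects IP literals in private ranges. It is a generic application-level example and does not show how the equivalent setting is configured in Keycloak. The function names and host names are hypothetical, and the sketch does not cover host names that resolve to private addresses through DNS.

```python
import ipaddress
from typing import Set
from urllib.parse import urlsplit


def is_allowed_url(url: str, allowed_hosts: Set[str]) -> bool:
    """Allowlist check: accept only http(s) URLs whose host is explicitly listed."""
    parts = urlsplit(url)
    if parts.scheme not in ("http", "https"):
        return False
    host = parts.hostname  # strips user info and port, lowercases the name
    return host is not None and host in allowed_hosts


def is_private_address(host: str) -> bool:
    """Blocklist helper: True if `host` is an IP literal in a nonroutable range."""
    try:
        address = ipaddress.ip_address(host)
    except ValueError:
        return False  # not an IP literal; DNS names need a separate check
    return address.is_private or address.is_loopback or address.is_link_local
```

The allowlist check is the stricter of the two: a URL such as http://keycloak.example.com@evil.test/ is rejected because its actual host is evil.test, which is not on the list.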

Additional security measures

To protect your Manta Data Lineage instance, consider the following best practice.

Setting up an elastic load balancer

To filter out unwanted network traffic and protect against Distributed Denial of Service (DDoS) attacks, use an elastic load balancer that accepts only full HTTP connections. An elastic load balancer that is configured with an HTTP profile inspects the packets and forwards only complete HTTP requests to the Manta web server.