Manta Flow Installation Guide: Linux
Important information
Planning
Before beginning the Manta Data Lineage installation process, you should have a good understanding and plan of action for a few key areas:
-
Infrastructure:
-
Manta Data Lineage has been tested and can be installed on the following types of infrastructure:
-
Standalone host server (Physical or Virtual Machine On-Prem)
-
Cloud-Hosted VM host (e.g. AWS, Azure, Redshift)
-
-
Both configurations can be used with or without a load balancer/reverse-proxy (Nginx, AWS LB, Azure LB, etc.).
-
If a LB/Reverse-Proxy will be used in front of Manta Data Lineage, see the “Load Balancer / Reverse Proxy” section below for additional information.
-
-
Authentication:
-
During initial installation, Manta Data Lineage will utilize its native user-account authentication, and prompt you for an Administrative username/password. This initial account is critical in that it can be thought of as the “super-user” administrative account for the Manta platform. Initially, this account is the only means of authenticating into the Keycloak service, which is where the administrator manages all authentication/authorization controls surrounding the Manta platform.
-
Aside from the included user-account native authentication method, Manta Data Lineage also supports LDAP and SSO/SAML authentication mechanisms for end-users, which are both configurable in the Keycloak Admin Console after installation is complete.
-
-
Overriding the Server Hostname:
-
During installation, you will be prompted to change the default hostname (localhost). Setting this value correctly ensures that the Manta Launcher is reachable and that it can communicate with the other Manta Data Lineage services during the startup procedure. It is best to always plan to override the default value during installation.
-
Overriding the Keycloak Public URL:
-
In most cases, the Keycloak Public URL should be overridden to match the FQDN of the Manta server. Otherwise, it defaults to http://localhost:9090/auth, which prevents end-users from authenticating to the Keycloak service because of a mismatch with the OIDC Issuer URI. Most of the time, the Keycloak Public URL is configured to reflect the previously overridden “hostname” property.
-
One case where this URL is overridden is when a load balancer/reverse-proxy sits between the end-users and the Manta server, so that end-users send HTTP requests for each Manta Data Lineage service to a single, secure, public URL, and the LB forwards each request to the respective upstream service’s port. In this scenario, override the Keycloak Public URL to match the public-facing URL/FQDN of the load balancer. Here are a couple of examples:
-
https://manta-test.mydomain.com/auth
-
http://prd-manta-srv:9090/auth
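The two example URLs above follow one pattern: scheme, host, optional port, and the /auth path. A minimal sketch of that pattern is shown here. The function name keycloak_public_url is hypothetical and not part of the Manta tooling, and dropping the port when it is the scheme's default is an assumption made for readability.

```shell
# Hypothetical helper: build a Keycloak Public URL of the form
# <scheme>://<host>[:<port>]/auth, as in the examples above.
keycloak_public_url() {
  scheme="$1"
  host="$2"
  port="${3:-}"
  case "$scheme:$port" in
    # No port given, or the port is the default for the scheme: omit it.
    http:80|https:443|*:) echo "$scheme://$host/auth" ;;
    *) echo "$scheme://$host:$port/auth" ;;
  esac
}
```

For example, `keycloak_public_url https manta-test.mydomain.com` yields the load balancer form, and `keycloak_public_url http prd-manta-srv 9090` yields the direct host form.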
-
-
-
Using A Load Balancer/Reverse-Proxy for TLS Termination with Manta Data Lineage:
-
It is common for a LB/Reverse-Proxy to be used when installing on cloud-hosted infrastructure like AWS or Azure, primarily for the purpose of TLS termination at the LB. If you intend to use a LB/Reverse-Proxy, read the following key points carefully:
-
Assuming that a valid SSL/TLS certificate has been generated with the desired common name and alternative DNS names/IPs and issued to the LB or host, and that all relevant DNS updates have been made, provide end-users with only the public-facing URL of the Manta LB for accessing the various Manta services' interfaces.
-
With this type of configuration, it is common to use the default, insecure ports for each of the Manta Data Lineage services on the host machine itself. In this case, configure the LB to listen on a single, public-facing secure port (e.g. 443), with a separate forwarding rule for each upstream Manta Data Lineage service (on its respective default port).
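Before handing the public URL to end-users, it can help to confirm that the certificate on the LB really covers the public hostname. The following is a rough sketch using the standard openssl command line. The function names are hypothetical, and it assumes an openssl binary is available on the machine running the check.

```shell
# Hypothetical helper: print the certificate served at <host>:<port> in PEM form.
fetch_server_cert() {
  openssl s_client -connect "$1:${2:-443}" -servername "$1" </dev/null 2>/dev/null \
    | openssl x509 -outform PEM
}

# Hypothetical helper: succeed when the PEM certificate in file $1
# is valid for hostname $2 (common name or alternative DNS name).
cert_covers_host() {
  openssl x509 -in "$1" -noout -checkhost "$2" | grep -q "does match certificate"
}
```

A possible use is `fetch_server_cert manta-test.mydomain.com > lb.pem` followed by `cert_covers_host lb.pem manta-test.mydomain.com`, where the hostname is the example from this guide.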
-
-
Caddy Reverse-Proxy Service:
-
Caddy is an optional reverse-proxy server, packaged with the Manta platform, that simplifies the setup of HTTPS and firewall rules. HTTPS is almost always the desired protocol for secure communication. Prior to R41, setting up HTTPS in Manta Data Lineage was a complicated process because the configuration was spread across different config files for each Manta Data Lineage component.
-
Caddy is placed between the user and the Manta Data Lineage components, acting as a network gateway that aggregates and routes all traffic between the user’s machine and the individual components. The secure connection from the user is terminated at Caddy, and the rest of the components within the Manta installation can be set up without HTTPS. Because Caddy is the only public-facing service, this setup remains secure. This architecture has an added benefit: because Caddy is the only service that needs to be exposed publicly, only its port must be opened in the firewall.
-
If use of the Caddy reverse-proxy service is desired, this can be enabled during the initial install process, or it can also be enabled at a later point in time post-installation if needed. For more detailed information surrounding Caddy server, see: Caddy proxy.
-
-
Installation instructions
Follow these steps to install Manta Data Lineage:
-
Download the latest Manta installation archive for Linux OS and license from IBM Support.
-
On the host server, extract the installation files:
mantaflow-linux-42.y.z.zip
-
Execute the installer application file with Super User privileges:
mantaflow-42.y.z-linux-x64-installer.run
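The extract and execute steps above can be sketched as a small script. This is only an illustration: the function names, the extraction directory, and the DRY_RUN switch are hypothetical, the version string 42.y.z is the placeholder used in this guide, and it assumes the .run file sits at the top level of the archive.

```shell
# Placeholder version copied from this guide; substitute the real release.
MANTA_VERSION="${MANTA_VERSION:-42.y.z}"
ARCHIVE="mantaflow-linux-${MANTA_VERSION}.zip"
INSTALLER="mantaflow-${MANTA_VERSION}-linux-x64-installer.run"

# Print each command; execute it only when DRY_RUN is not set to 1.
run() {
  echo "+ $*"
  [ "${DRY_RUN:-0}" = "1" ] || "$@"
}

extract_and_install() {
  workdir="${1:-manta-installer}"   # hypothetical extraction directory
  run unzip -o "$ARCHIVE" -d "$workdir" || return 1
  # Assumption: the installer is at the top level of the extracted archive.
  run chmod +x "$workdir/$INSTALLER" || return 1
  # The installer must run with superuser privileges.
  run sudo "$workdir/$INSTALLER"
}
```

Calling `extract_and_install` with DRY_RUN=1 set only prints the commands, which is a cheap way to confirm the file names before running the installer for real.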
-
Enter the path of the desired Manta installation directory.
-
Choose the correct version of Java to use with Manta Data Lineage and then press OK.
-
There are certain vendors and versions of Java that are unsupported. Review them here: IBM Manta Data Lineage Technical Requirements.
-
-
Choose whether to override the default server hostname. Typically, you will want to change the default hostname from localhost to the machine's hostname or Fully Qualified Domain Name (FQDN).
-
It is recommended to use the hostname of the machine during the installation. After the installation is complete, you can change this by editing the /<manta_dir>/conf/manta.properties file and updating every mention of the hostname to the FQDN, if desired.
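One way to update every mention of the hostname in the properties file is a backup followed by a search and replace. The sketch below is an assumption about how this could be done, not a documented Manta procedure, and the function name replace_hostname is hypothetical. Review the listed lines before relying on the result, since a plain text replacement also changes the old value wherever it appears inside longer strings.

```shell
# Hypothetical helper: replace every occurrence of the old hostname with the
# new one in a properties file. A backup is kept next to it as <file>.bak.
replace_hostname() {
  file="$1"
  old="$2"
  new="$3"
  # Escape dots so that "a.b" is matched literally by sed.
  old_re=$(printf '%s' "$old" | sed 's/\./\\./g')
  # List the lines that are about to change.
  grep -nF -- "$old" "$file"
  sed -i.bak "s|$old_re|$new|g" "$file"
}
```

A typical call would be `replace_hostname /<manta_dir>/conf/manta.properties <old_hostname> <fqdn>`, with the placeholders filled in for your installation.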
-
-
Choose whether to install Manta Data Lineage as a system service.
-
Installing as a service provides many benefits, and is always encouraged.
-
Continuous Availability: Services can start automatically with the operating system, ensuring that the application is always available, even after reboots. This is crucial for critical applications that need to run continuously.
-
Ease of Deployment: Services can be deployed without user interaction, making the deployment process more straightforward. This is particularly useful in enterprise environments where a large number of systems need to be configured.
-
-
In some cases, the Linux distribution that Manta Data Lineage is being installed on may use systemd as the system service manager rather than its predecessor, sysvinit (init.d). This is more common in newer RHEL-based Linux distributions. In this case, the Manta installer may fail to create the MantaLauncher system service, because it currently installs the system service only when the sysvinit service manager is in use. If the setup of the system service fails for you, see the following article for details on how to manually create the MantaLauncher system service on distributions using systemd: Creating A System Service for systemd-based Linux Distros
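To find out ahead of time which service manager the host uses, a quick check like the following may help. It relies on the convention that a system booted with systemd has the directory /run/systemd/system. The function name and the "sysvinit-style" label are hypothetical, and the presence of /etc/init.d is only a hint, not proof, of a sysvinit setup.

```shell
# Hypothetical helper: report which service manager the host appears to use.
detect_init_system() {
  if [ -d /run/systemd/system ]; then
    # This directory exists only when systemd is the running init system.
    echo "systemd"
  elif [ -d /etc/init.d ]; then
    # init.d scripts are present; likely sysvinit or a compatible manager.
    echo "sysvinit-style"
  else
    echo "unknown"
  fi
}
```

If the function prints systemd, plan for the manual service creation described in the article referenced above.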
-
-
Choose whether to change the ownership of the mantaflow directory from the root user to the service account. This matters for security, the Principle of Least Privilege (PoLP), isolation, and access auditing. If yes, enter an existing OS user and then a group for the ownership.
-
Input the absolute path to the license.key file provided by Manta Data Lineage.
-
Configure the Keycloak port, username, and password.
These credentials will be used for the creation of the initial Manta administrator account.
-
If the machine leverages a load balancer or reverse proxy, it is recommended to review the following article to ensure you are configuring the Public Keycloak URL correctly: Networking Setup Examples
-
If the server hostname was overridden in step 6, choose yes to override the Public Keycloak URL and replace “localhost” with the previously specified hostname. The Public Keycloak URL should look like this: http://<HOSTNAME>:<PORT>/auth.
-
-
Configure the Artemis port.
-
Configure the Configuration Service port.
-
Configure the Open Manta Designer (OMD) port.
-
Configure the Launcher port.
-
Configure the Admin UI port.
-
Configure the Flow Server and Neo4j ports.
-
Configure the Agent port.
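The installer asks for a port for each service in turn, so it can be worth checking the planned values before starting. The sketch below only validates a list of port numbers for range and duplicates. The function name check_ports is hypothetical, and it does not test whether a port is already in use on the host.

```shell
# Hypothetical helper: succeed when every argument is a valid TCP port
# (1-65535) and no port is listed twice.
check_ports() {
  seen=""
  for p in "$@"; do
    case "$p" in
      ''|*[!0-9]*) echo "not a number: $p" >&2; return 1 ;;
    esac
    if [ "$p" -lt 1 ] || [ "$p" -gt 65535 ]; then
      echo "out of range: $p" >&2
      return 1
    fi
    case " $seen " in
      *" $p "*) echo "duplicate port: $p" >&2; return 1 ;;
    esac
    seen="$seen $p"
  done
  return 0
}
```

For instance, `check_ports 9090 <artemis_port> <launcher_port>` with your chosen values filled in. Here 9090 is the Keycloak default mentioned in this guide, and the other names are placeholders.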
-
If you want to use the Caddy reverse-proxy service that comes packaged with Manta Data Lineage, choose to enable it and configure its two ports.
-
If using Caddy, all communication with the Manta Data Lineage services will be routed through the defined Caddy HTTPS proxy port.
-
Caddy uses a self-signed certificate. If you wish to use an internal CA signed certificate, see how to configure this post installation here.
-
If no problems are detected, press ENTER to continue.
-
If you are unsure whether to use Caddy proxy or not, review the Use Cases:
-
Use case | Recommendation |
Manta Data Lineage installed with HTTP only. Proxy or load balancer is not used. | Enable Caddy and use the HTTPS communication provided by Caddy. |
Manta Data Lineage installed with HTTP only. Proxy or load balancer is used. | Leave Caddy disabled. Caddy cannot be used together with another proxy or load balancer. |
-
You are now ready to begin the installation. Type “y” to begin.
-
A progress bar shows the installation progress. Depending on available resources on the host machine, the installation process may take several minutes, so allow adequate time for the installer to complete.
-
Once the installer finishes, you may choose whether to launch the Manta applications. The installation is now complete.
-
To verify that the Manta installation directory contains the newly added subdirectories for each service, change to the install location specified in step 4 and list its contents. You should see many subdirectories and files.
-
-
Once you have started the Manta Launcher service (either via the last prompt in the installation script or via a manual startup), open a web browser and enter the URL for the Manta Launcher webpage: http://<HOSTNAME>:<LAUNCHER_PORT>/manta-launcher. This page lists all services with their respective URLs and statuses. You are now ready to begin using Manta Data Lineage.
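The Launcher page can also be checked from the command line before opening a browser. The sketch below builds the Launcher URL from the pattern above and requests it with curl. The function names are hypothetical, and treating any 2xx or 3xx status as "reachable" is an assumption, since the exact response of the Launcher page is not described in this guide.

```shell
# Hypothetical helper: build the Launcher URL from hostname and port.
launcher_url() {
  echo "http://$1:$2/manta-launcher"
}

# Hypothetical helper: request the Launcher page and report the HTTP status.
# Assumption: a 2xx or 3xx status means the Launcher is reachable.
check_launcher() {
  url=$(launcher_url "$1" "$2")
  code=$(curl -s -o /dev/null -w '%{http_code}' --max-time 10 "$url")
  echo "$url -> HTTP $code"
  case "$code" in
    2*|3*) return 0 ;;
    *) return 1 ;;
  esac
}
```

Run `check_launcher <HOSTNAME> <LAUNCHER_PORT>` with the values chosen during installation. A status of 000 means curl could not connect at all.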