Caddy proxy

Caddy is a reverse proxy server that simplifies the setup of HTTPS and firewall rules. After enabling Caddy, navigate to https://{hostname}:8888/, where you are automatically redirected to Manta Launcher.

Introduction

As of R41, Caddy (see caddyserver.com for more details) is a new optional component of IBM Manta Data Lineage that simplifies the setup of HTTPS. You can select the Caddy integration during the installation.

Purpose

HTTPS is a more secure mode of communication over a network: it encrypts the data exchanged between the parties involved. Prior to R41, setting up HTTPS in Manta Data Lineage was complicated, because each Manta Data Lineage component kept its HTTPS configuration in a separate config file.

Caddy sits between the user and the Manta Data Lineage components and acts as a network gateway that aggregates and routes all traffic between the user's machine and the individual components. The secure connection from the user is terminated at Caddy, so the rest of the components within the Manta Data Lineage installation can be set up without HTTPS. Because Caddy is the only public-facing service, this setup is still secure. The architecture has an added benefit: since Caddy is the only service that needs to be exposed publicly, only its port must be opened in the firewall.

Networking diagrams

Figure: Network setup without Caddy

Figure: Network setup with Caddy

Use case scenarios

Review your overall network configuration to decide whether Caddy should be enabled.

Scenario 1: Manta Data Lineage is installed with HTTP only, and no proxy or load balancer is used.

Enable Caddy and use the HTTPS communication provided by Caddy.

Scenario 2: Manta Data Lineage is installed with HTTPS, and no proxy or load balancer is used.

Disable all HTTPS settings in the individual components. Then enable Caddy and use the HTTPS communication provided by Caddy.

Scenario 3: Manta Data Lineage is installed with HTTP only, and a proxy or load balancer is used.

Leave Caddy disabled. Caddy cannot be used together with another proxy or load balancer.

Configuration & setup

Ports

During installation, you must configure two new ports for Caddy: a communication port that handles all the traffic, and a second port for the administration API. Only the communication port must be publicly available. The administration API of the Caddy server must be protected by suitable firewall rules. Manta Data Lineage recommends port 8888 (communication port) and port 8889 (administration API port) as the defaults. If you choose port numbers of 1024 or higher, Caddy can and should run without root privileges (ports below 1024 are restricted on Linux).

HTTPS

Caddy has HTTPS management enabled out of the box, so all traffic is protected even without any user intervention. Caddy achieves this by generating its own Certificate Authority (its certificate can be found in /data/local/pki/ca.crt), which then signs the HTTPS certificate that is actually deployed.

To deploy custom certificates, reference them in the Caddy configuration file. In ${mantaflow}/caddy/conf/Caddyfile find a line starting with

# tls <cert file> <key file>

Remove the hash mark at the start of the line and replace the placeholders with absolute paths to the desired HTTPS certificate and key files. Place the certificate files outside the mantaflow directory so that they are preserved during a potential upgrade. In R41, any location outside the Manta directory works; starting in R42, use the dedicated configuration directory ${installdir}/conf. On Linux installations, the files must have the same owner and user group as the rest of the Manta Data Lineage installation so that Caddy can access them.
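The edit described above can be sketched in shell. This is only an illustration: the helper name render_tls_line and the certificate paths in the usage note are hypothetical, and the sketch assumes the placeholder line begins with "# tls " (optionally indented). It prints the modified file to standard output so that the result can be reviewed before it replaces the original.

```shell
#!/bin/sh
# Sketch only: render_tls_line is a hypothetical helper, not part of the product.
# Usage: render_tls_line CADDYFILE CERT_FILE KEY_FILE
# Prints CADDYFILE to stdout with the commented "# tls ..." line replaced by an
# active tls directive. Paths must be absolute and must not contain "|" or "&".
render_tls_line() {
    caddyfile=$1
    cert=$2
    key=$3
    # Keep the original indentation (\1) and drop the leading hash mark.
    sed "s|^\([[:space:]]*\)#[[:space:]]*tls[[:space:]].*\$|\1tls ${cert} ${key}|" "$caddyfile"
}
```

For example, `render_tls_line "${mantaflow}/caddy/conf/Caddyfile" /etc/manta-certs/server.crt.pem /etc/manta-certs/server.key.pem > Caddyfile.new` writes a candidate file that can be compared with the original before it is swapped in.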

If you have deployed HTTPS keys in the past, you can convert the existing keystore to the format required by Caddy. Caddy is not a Java tool, so it cannot use a Java keystore directly; it requires the key and the certificate in PEM format. A key file in PEM format is a text file that starts with the header -----BEGIN PRIVATE KEY-----. A PEM certificate is also a text file, but its header is -----BEGIN CERTIFICATE-----.

To convert an existing JKS (Java KeyStore) to the required PEM format, you need the openssl command line tool. On Linux installations it is readily available; on Windows you have to download and install it separately.

  1. Convert the JKS keystore to portable PKCS12 format:

    keytool -importkeystore -srckeystore keystore.jks -destkeystore keystore.p12 -srcstoretype jks -deststoretype pkcs12
    
  2. Export the certificate from the keystore:

    openssl pkcs12 -in keystore.p12 -out newfile.crt.pem -clcerts -nokeys
    
  3. Export the private key from the keystore:

    openssl pkcs12 -in keystore.p12 -out newfile.key.pem -nocerts -nodes
    

The resulting files are then referenced in Caddyfile.

The certificate and private key must not be encrypted. Caddy does not support encrypted keys.
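Before referencing the exported files, it can help to confirm that each one carries the expected PEM header. The following sketch uses a hypothetical helper, pem_kind, that searches the file for the header lines mentioned above. It also recognizes the -----BEGIN ENCRYPTED PRIVATE KEY----- header, which marks a password-protected PKCS#8 key that Caddy would not accept. The header is searched anywhere in the file because openssl pkcs12 typically writes "Bag Attributes" lines before it.

```shell
#!/bin/sh
# Sketch only: pem_kind is a hypothetical helper name.
# Usage: pem_kind FILE
# Prints certificate, encrypted-key, key, or unknown, based on PEM header lines.
pem_kind() {
    if grep -q -- '^-----BEGIN CERTIFICATE-----' "$1"; then
        echo certificate
    elif grep -q -- '^-----BEGIN ENCRYPTED PRIVATE KEY-----' "$1"; then
        echo encrypted-key
    elif grep -q -- '^-----BEGIN PRIVATE KEY-----' "$1"; then
        echo key
    else
        echo unknown
    fi
}
```

Running `pem_kind newfile.key.pem` on the output of step 3 should report key; a result of encrypted-key suggests the export was made without -nodes.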

To verify that the certificate has been installed correctly, launch Caddy using its bin/startup.[sh|bat] script and open the URL https://localhost:8888/health in a browser. Adjust the port number according to your configuration. The browser lets you inspect the certificate information.
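The same check can be done from a terminal. The sketch below assumes curl and openssl are installed; the helper names are hypothetical, and the host and port default to the values used on this page. The --insecure flag is included only because the certificate signed by Caddy's own Certificate Authority is usually not trusted by the client; drop it once a trusted certificate is deployed.

```shell
#!/bin/sh
# Sketch only: all helper names below are hypothetical.
# Build the health URL; host and port default to localhost and 8888.
caddy_health_url() {
    printf 'https://%s:%s/health\n' "${1:-localhost}" "${2:-8888}"
}

# Succeed when the health endpoint answers without an HTTP error status.
# --insecure skips CA verification (see the note above).
check_caddy_health() {
    curl --silent --show-error --fail --insecure --output /dev/null \
        "$(caddy_health_url "$1" "$2")"
}

# Print subject, issuer, and validity dates of the certificate Caddy serves.
show_caddy_certificate() {
    host=${1:-localhost}
    port=${2:-8888}
    openssl s_client -connect "${host}:${port}" -servername "$host" </dev/null 2>/dev/null |
        openssl x509 -noout -subject -issuer -dates
}
```

For example, `check_caddy_health localhost 8888 && show_caddy_certificate localhost 8888` prints the certificate details only when the endpoint is reachable.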

Caddy configuration

This is a simplified version of the Caddy server integration into the Manta Data Lineage product. For detailed configuration of Caddy itself consult the official documentation on caddyserver.com.

Simplified configuration

Caddy uses the global configuration file ${installdir}/conf/manta.properties. In this file you can also configure the Caddy ports. The values from this configuration file are injected into the actual Caddy configuration file, located in conf/Caddyfile, whose syntax is documented on https://caddyserver.com/docs/caddyfile. The Caddyfile uses placeholders that are replaced from variables set up in the environment. Those variables are set in the setenv_manta.{sh|bat} script and injected into the environment by the startup script.

Integration into Manta Data Lineage

Caddy is part of every installation starting with version R41. In R41 this is a preview feature. By default, Caddy is configured but not enabled.

In the following examples, replace the placeholder ${installdir} with the absolute path to the Manta installation directory, usually /opt/mantaflow or C:\mantaflow depending on the OS.

Enabling

  1. Shut down Manta Data Lineage using Manta Launcher

  2. Navigate to ${installdir}/utility/

  3. Run the command java -jar manta-installer-dep-caddy.jar -m ENABLE -r ${installdir}

  4. Start Manta Data Lineage using Manta Launcher

Caddy is integrated into the Manta Launcher, and you can see its status in the Launcher dashboard.

Disabling

  1. Shut down Manta Data Lineage using Manta Launcher

  2. Navigate to ${installdir}/utility/

  3. Run the command java -jar manta-installer-dep-caddy.jar -m DISABLE -r ${installdir}

  4. Start Manta Data Lineage using Manta Launcher

Caddy is integrated into the Manta Launcher, and you can see its status in the Launcher dashboard.
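Steps 2 and 3 of the enabling and disabling procedures differ only in the mode argument, so they can be wrapped in one small shell function. This is a sketch under stated assumptions: caddy_toggle and the JAVA_CMD override are hypothetical names, and the function does not shut down or start Manta Data Lineage, which still has to be done in Manta Launcher before and after the call.

```shell
#!/bin/sh
# Sketch only: caddy_toggle and JAVA_CMD are hypothetical names.
# Usage: caddy_toggle ENABLE|DISABLE INSTALLDIR
# Runs the installer utility from INSTALLDIR/utility with the requested mode.
caddy_toggle() {
    mode=$1
    installdir=$2
    case "$mode" in
        ENABLE|DISABLE) ;;
        *) echo "mode must be ENABLE or DISABLE" >&2; return 2 ;;
    esac
    # The subshell keeps the caller's working directory unchanged.
    (
        cd "$installdir/utility" || exit 1
        "${JAVA_CMD:-java}" -jar manta-installer-dep-caddy.jar -m "$mode" -r "$installdir"
    )
}
```

For example, `caddy_toggle ENABLE /opt/mantaflow` runs the same command as step 3 of the enabling procedure.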

Caddy accessible domains

As of R42.4, you can specify which domains Caddy listens on and responds to. By default, Caddy listens on the system hostname provided during installation (the manta.system.hostname property in <installDir>/conf/manta.properties) and on the hostname parsed from manta.keycloak.public.url (also provided during installation and stored in <installDir>/conf/manta.properties). Additional domains can be specified using the manta.caddy.hostnames property in <installDir>/conf/manta.properties.

After any change to the Caddy configuration, it is enough to restart only Caddy. You do not have to restart all Manta Data Lineage components. This applies, for example, when deploying new HTTPS certificates, which means that the certificates can be changed even in the middle of a long-running scan. Manta Data Lineage services are not available to users while Caddy is not running. Although Caddy supports zero-downtime configuration changes, this feature is not yet supported by Manta Data Lineage.

Network integration

All the other HTTP ports previously required by Manta Data Lineage can be closed in the firewall. Ensure that the Artemis port (default 61616) remains open in the firewall. Artemis is not an HTTP-based service, and as such it cannot leverage the HTTPS features provided by Caddy. The Artemis communication with the Agent is secured using mTLS, which is set up automatically.

If Caddy is disabled, all the ports need to be opened again, to allow Manta Data Lineage to function correctly.
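On hosts that use firewalld, the rules described above might be applied as in the following sketch. The choice of firewalld is an assumption, since the source does not name a firewall product, and open_manta_ports and the FIREWALL_CMD override are hypothetical names. The port defaults are the ones given on this page. Closing the previously opened HTTP ports is left out because their numbers depend on the installation.

```shell
#!/bin/sh
# Sketch only, assuming firewalld; open_manta_ports and FIREWALL_CMD are
# hypothetical names.
# Usage: open_manta_ports [CADDY_PORT] [ARTEMIS_PORT]
# Opens the Caddy communication port and the Artemis port. The Caddy
# administration API port (8889 by default) is deliberately not opened.
open_manta_ports() {
    caddy_port=${1:-8888}
    artemis_port=${2:-61616}
    fw=${FIREWALL_CMD:-firewall-cmd}
    $fw --permanent --add-port="${caddy_port}/tcp" &&
    $fw --permanent --add-port="${artemis_port}/tcp" &&
    $fw --reload
}
```

Setting FIREWALL_CMD=echo before calling the function prints the commands instead of running them, which is a convenient way to review the rules first.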

End-Users Browser Bookmarks

When Caddy is enabled, all traffic is directed through it, so all services are available on new URLs. The Caddy integration is set up in such a way that only the port and the protocol change. For example, if you previously accessed Manta Flow Server on the URL http://localhost:8080/manta-flow-viewer, the new URL is https://localhost:8888/manta-flow-viewer: the protocol has changed from http:// to https:// and the port has changed from 8080 to 8888. This example assumes the default ports; if you use different port numbers or a different host name, change the URLs accordingly.
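For a list of existing bookmarks, the protocol and port change can be applied mechanically. The sketch below uses a hypothetical helper, to_caddy_url, that rewrites an http:// URL to https:// and swaps in the Caddy communication port (8888 unless another port is passed). It assumes a sed that supports the -E flag, as GNU and BSD sed do.

```shell
#!/bin/sh
# Sketch only: to_caddy_url is a hypothetical helper name.
# Usage: to_caddy_url OLD_URL [CADDY_PORT]
# Rewrites http://host[:port]/path to https://host:CADDY_PORT/path.
to_caddy_url() {
    old_url=$1
    caddy_port=${2:-8888}
    printf '%s\n' "$old_url" |
        sed -E "s|^http://([^/:]+)(:[0-9]+)?|https://\1:${caddy_port}|"
}
```

With the defaults, `to_caddy_url http://localhost:8080/manta-flow-viewer` prints https://localhost:8888/manta-flow-viewer, matching the example above.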

Using Caddy to hide port numbers from URLs

If the default port numbers are used, all URLs in Manta Data Lineage have the port number attached, for example https://localhost:8888/manta-flow-viewer. A port number must be part of the URL unless it is one of the protocol defaults: port 80 for plain HTTP and port 443 for HTTPS. Those ports can be omitted from the URL.

To run Caddy on port 443, make the following changes in the Manta configuration file ${installdir}/conf/manta.properties:

Default property value                                  Updated property value
manta.caddy.port=8888                                   manta.caddy.port=443
manta.keycloak.public.url=https://localhost:8888/auth   manta.keycloak.public.url=https://localhost/auth
manta.launcher.url=https://localhost:8888               manta.launcher.url=https://localhost

All ports lower than 1024 are restricted on Linux, so only a user with root privileges can use them. This means that Manta Launcher has to be launched with root privileges.
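Whether a given configuration runs into this restriction can be checked from the properties file. The following sketch reads manta.caddy.port and tests it against the 1024 boundary; both helper names are hypothetical.

```shell
#!/bin/sh
# Sketch only: both helper names are hypothetical.
# Print the value of manta.caddy.port from a manta.properties file.
caddy_port_from_properties() {
    sed -n 's/^manta\.caddy\.port=//p' "$1" | tail -n 1 | tr -d '[:space:]'
}

# Succeed when PORT is in the restricted range (below 1024) on Linux.
is_restricted_port() {
    [ "$1" -lt 1024 ]
}
```

For example, `is_restricted_port "$(caddy_port_from_properties "${installdir}/conf/manta.properties")" && echo "root privileges required"` prints the warning only when the configured port is below 1024.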

Working around the port number restrictions on Linux

As explained above, port 443 is restricted on Linux, and only a user with root privileges is allowed to use it. Given the architecture of Manta Data Lineage, this means that the Launcher has to be started under the root account, and as a result the whole application runs as root. From a security perspective, this approach is not preferred. Strictly speaking, only Caddy needs to run with root privileges; the rest of the components can run under a more restricted user.

Option #1

Run Manta Launcher as root. This is the easiest solution, but the least secure.

Option #2

Edit the startup.sh script of each component except Caddy and Launcher, and add the following line:
if [ $UID -eq 0 ]; then exec runuser -u $USER "$0" -- "$@"; fi

like this:

#!/bin/bash

if [ $UID -eq 0 ]; then exec runuser -u $USER  "$0" -- "$@"; fi

echo "Starting Manta Keycloak"
SCRIPT_DIR=$(dirname "$(readlink -f "$0")")

Replace $USER with the actual username under which the component is supposed to run.

Then start Manta Launcher as root.

This way, only Caddy and Launcher run as root; the rest of the product runs under the selected user.

Safer than #1 and more convenient, but Manta Launcher still runs as root.

Option #3

Edit the file $installdir/launcher/manta-launcher-dir/conf/applications.json and change the Keycloak dependencies entry. Switch the "caddy" dependency from STARTUP to RUNTIME.

{
    "id": "keycloak",
    "name": "Keycloak",
...
    "dependencies": [
        {
            "application": "launcher",
            "type": "STARTUP"
        },
        {
            "application": "caddy",
            "type": "RUNTIME"  <---------- WAS STARTUP, CHANGE TO RUNTIME
        }
    ],
...

Start Launcher as a regular user. The Launcher then tries to start Caddy, which fails. This is expected, because a regular user cannot use port 443. Keep Manta Launcher running and start Caddy manually as root, like this:
sudo $installdir/caddy/bin/startup.sh

This is the safest option, because only Caddy runs as root. It is also the least convenient, because the Caddy server has to be started manually, without the help of Manta Launcher.