Caddy proxy
Caddy is a reverse proxy server that simplifies the setup of HTTPS and firewall rules. After enabling Caddy, navigate to https://{hostname}:8888/ and you will be automatically redirected to Manta Launcher.
Introduction
As of R41, Caddy (see caddyserver.com for more details) is a new optional component of IBM Manta Data Lineage that simplifies the setup of HTTPS. You can select the Caddy integration during the installation.
Purpose
HTTPS is a more secure mode of communication over a network because it encrypts the data exchanged between the involved parties. Prior to R41, setting up HTTPS in Manta Data Lineage was a complicated process because the configuration was spread across separate configuration files for each Manta Data Lineage component.
Caddy is placed between the user and the Manta Data Lineage components. It acts as a network gateway that aggregates and routes all traffic between the user's machine and the individual components. The secure connection from the user is terminated at Caddy, so the rest of the components within the Manta Data Lineage installation can be set up without HTTPS. Because Caddy is the only public-facing service, all traffic that leaves the installation is still encrypted. An added benefit of this architecture is that only Caddy's port must be opened in the firewall.
Networking diagrams
[Figure: Network setup without Caddy]
[Figure: Networking diagram with Caddy]
Use case scenarios
Review your overall network configuration to decide whether Caddy should be enabled.
Current setup | Recommended action |
---|---|
Manta Data Lineage installed with HTTP only. No proxy or load balancer is used. | Enable Caddy and use the HTTPS communication provided by Caddy. |
Manta Data Lineage installed with HTTPS. No proxy or load balancer is used. | Disable all HTTPS settings in the individual components. Enable Caddy and use the HTTPS communication provided by Caddy. |
Manta Data Lineage installed with HTTP only. A proxy or load balancer is used. | Leave Caddy disabled. Caddy cannot be used together with another proxy or load balancer. |
Configuration & setup
Ports
During installation, you must configure two new ports for Caddy: a communication port that handles all the traffic, and a second port for the administration API. Only the communication port must be publicly available. The administration API of the Caddy server must be protected by suitable firewall rules. Manta Data Lineage recommends port 8888 (communication port) and port 8889 (administration API port) as the defaults. If you choose port numbers of 1024 or higher, Caddy can and should run without root privileges (ports below 1024 are restricted on Linux).
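The port rule above can be checked with a few lines of shell. This is a minimal sketch: `is_privileged_port` is a hypothetical helper name, and the threshold of 1024 assumes a default Linux configuration.

```shell
# Sketch only: is_privileged_port is a hypothetical helper, not part of Manta or Caddy.
# It succeeds (exit status 0) when the port is below 1024, the range that a
# default Linux configuration reserves for root.
is_privileged_port() {
  [ "$1" -lt 1024 ]
}

# Report on the recommended defaults and on the standard HTTPS port.
for port in 8888 8889 443; do
  if is_privileged_port "$port"; then
    echo "$port: requires root privileges"
  else
    echo "$port: can be used by a regular user"
  fi
done
```

With the recommended defaults, neither Caddy port falls into the restricted range.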
HTTPS
Caddy has HTTPS management enabled out of the box, so all traffic is protected even without any user intervention. Caddy achieves this by generating its own certificate authority (the CA certificate can be found in /data/local/pki/ca.crt), which then signs the HTTPS certificates that are actually deployed.
To deploy custom certificates, reference them in the Caddy configuration file. In ${mantaflow}/caddy/conf/Caddyfile, find the line starting with
# tls <cert file> <key file>
Remove the hash mark at the start and replace the placeholders with the absolute paths to the desired HTTPS certificate and key files. Keep the certificate files in a location that survives an upgrade: in R41, place them outside the mantaflow directory; starting with R42, use the dedicated configuration directory ${installdir}/conf. On Linux installations, the files must have the same owner and group as the rest of the Manta Data Lineage installation so that Caddy can read them.
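As an illustration of the edited line, the sketch below prints a `tls` directive for two given files and rejects relative paths. The helper name `tls_directive` and the /etc/manta-certs directory are assumptions made for this example, and the path check assumes Linux-style paths.

```shell
# Sketch only: tls_directive is a hypothetical helper. It prints the Caddyfile
# line for a custom certificate and rejects paths that are not absolute
# (Linux-style paths assumed).
tls_directive() {
  for path in "$1" "$2"; do
    case "$path" in
      /*) ;;
      *) echo "not an absolute path: $path" >&2; return 1 ;;
    esac
  done
  printf 'tls %s %s\n' "$1" "$2"
}

# /etc/manta-certs is an assumed example directory outside the installation.
tls_directive /etc/manta-certs/manta.crt.pem /etc/manta-certs/manta.key.pem
```

The printed line is what the uncommented `tls` entry in the Caddyfile would look like for those two files.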
If you have already deployed HTTPS keys in the past, you can convert the existing keystore to the format required by Caddy. Caddy is not a Java tool, so it cannot use a Java keystore directly. It requires the key and the certificate in PEM format. A key file in PEM format is a text file that contains the header -----BEGIN PRIVATE KEY-----
A PEM certificate is also a text file, but its header is -----BEGIN CERTIFICATE-----
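The two headers make it easy to tell the files apart from a script. The following is a minimal sketch with a hypothetical helper, `pem_kind`. It searches the whole file instead of only the first line, because openssl can write attribute lines ahead of the header. The file content used in the demonstration is a placeholder, not a real certificate.

```shell
# Sketch only: pem_kind is a hypothetical helper that classifies a file by the
# PEM header it contains. The header may be preceded by other text lines,
# so the whole file is searched.
pem_kind() {
  if grep -q -e '-----BEGIN CERTIFICATE-----' "$1"; then
    echo certificate
  elif grep -q -e '-----BEGIN PRIVATE KEY-----' "$1"; then
    echo private-key
  else
    echo unknown
  fi
}

# Demonstration with a placeholder file (the body is not a real certificate).
demo=$(mktemp)
printf '%s\n' '-----BEGIN CERTIFICATE-----' 'placeholder' '-----END CERTIFICATE-----' > "$demo"
pem_kind "$demo"   # prints: certificate
rm -f "$demo"
```

Running `pem_kind` against the files you plan to reference in the Caddyfile is a quick way to catch a swapped certificate and key.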
To convert an existing JKS (Java KeyStore) to the required PEM format, you need the keytool utility that ships with Java and the openssl command-line tool. On Linux installations, openssl is readily available; on Windows, you have to download and install it separately.
- Convert the JKS keystore to the portable PKCS12 format:
  keytool -importkeystore -srckeystore keystore.jks -destkeystore keystore.p12 -srcstoretype jks -deststoretype pkcs12
- Export the certificate from the keystore:
  openssl pkcs12 -in keystore.p12 -out newfile.crt.pem -clcerts -nokeys
- Export the private key from the keystore:
  openssl pkcs12 -in keystore.p12 -out newfile.key.pem -nocerts -nodes
The resulting files are then referenced in the Caddyfile.
To verify that the certificate has been installed correctly, launch Caddy using its bin/startup.[sh|bat] script and open the URL https://localhost:8888/health in a browser. Adjust the port number according to your configuration. The browser lets you inspect the certificate information.
Caddy configuration
This is a simplified version of the Caddy server integration into the Manta Data Lineage product. For detailed configuration of Caddy itself consult the official documentation on caddyserver.com.
Simplified configuration
Caddy uses the global configuration file ${installdir}/conf/manta.properties. In this file you can also configure the Caddy ports. The values from this configuration file are injected into the actual Caddy configuration file, located in conf/Caddyfile. That file contains the actual configuration in the format documented at https://caddyserver.com/docs/caddyfile. It uses placeholders that are replaced with the values of environment variables. Those variables are set in the setenv_manta.{sh|bat} script and injected into the environment by the startup script.
Integration into Manta Data Lineage
Caddy is part of every installation starting with version R41. In R41 this is a preview feature. By default, Caddy is configured but not enabled.
In the following examples, replace the placeholder ${installdir} with the absolute path to the Manta installation directory, usually /opt/mantaflow or C:\mantaflow, depending on the OS.
Enabling
- Shut down Manta Data Lineage using Manta Launcher
- Navigate to ${installdir}/utility/
- Run the command
  java -jar manta-installer-dep-caddy.jar -m ENABLE -r ${installdir}
- Start Manta Data Lineage using Manta Launcher
Caddy is integrated into the Manta Launcher, and you can see its status in the Launcher dashboard.
Disabling
- Shut down Manta Data Lineage using Manta Launcher
- Navigate to ${installdir}/utility/
- Run the command
  java -jar manta-installer-dep-caddy.jar -m DISABLE -r ${installdir}
- Start Manta Data Lineage using Manta Launcher
Caddy is integrated into the Manta Launcher, and you can see its status in the Launcher dashboard.
Caddy accessible domains
As of R42.4, you can specify which domains Caddy listens on and responds to. By default, Caddy listens on the system hostname provided during installation (the manta.system.hostname property in <installDir>/conf/manta.properties) and on the hostname parsed from manta.keycloak.public.url (also provided during installation and stored in <installDir>/conf/manta.properties). Additional domains can be specified using the manta.caddy.hostnames property in <installDir>/conf/manta.properties.
After any change to the Caddy configuration, it is enough to restart only Caddy. You do not have to restart all Manta Data Lineage components. This applies, for example, to deploying new HTTPS certificates, which means that the certificates can be changed even in the middle of a long-running scan. Manta Data Lineage services are not available to users while Caddy is not running. Although Caddy supports zero-downtime configuration changes, Manta Data Lineage does not support this feature yet.
Network integration
All the other HTTP ports previously required by Manta Data Lineage can be closed in the firewall. Ensure that the Artemis port (default 61616) remains open in the firewall. Artemis is not an HTTP-based service, so it cannot use the HTTPS features provided by Caddy. Artemis communication with the Agent is secured using mTLS, which is set up automatically.
If Caddy is disabled, all the ports must be opened again to allow Manta Data Lineage to function correctly.
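As a rough illustration of the resulting firewall setup, the sketch below prints the commands that would open the Caddy communication port and the Artemis port. It assumes a host that uses firewalld; `firewall_commands` is a hypothetical helper, and it only prints the commands so that they can be reviewed before anything is changed. The administration API port is deliberately left out.

```shell
# Sketch only: firewall_commands is a hypothetical helper that PRINTS firewalld
# commands for review; it does not change any firewall rule itself.
# Arguments: Caddy communication port, Artemis port (defaults: 8888 and 61616).
firewall_commands() {
  caddy_port=${1:-8888}
  artemis_port=${2:-61616}
  for port in "$caddy_port" "$artemis_port"; do
    echo "firewall-cmd --permanent --add-port=${port}/tcp"
  done
  echo "firewall-cmd --reload"
}

firewall_commands 8888 61616
```

Other firewall tools need different commands, but the set of ports to open stays the same.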
End-Users Browser Bookmarks
When Caddy is enabled, all traffic is directed through it, so all the services are available on new URLs. The Caddy integration is set up so that only the protocol and the port change. For example, if you previously accessed Manta Flow Server at http://localhost:8080/manta-flow-viewer, the new URL is https://localhost:8888/manta-flow-viewer: the protocol changes from http:// to https:// and the port changes from 8080 to 8888. This assumes the default ports; if you use different port numbers or a different host name, change the URLs accordingly.
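For users with many bookmarks, the rewrite rule can be expressed as a one-line substitution. This is a minimal sketch: `to_caddy_url` is a hypothetical helper, and it assumes that the old URL starts with http:// and that only the protocol and port change, as described above.

```shell
# Sketch only: to_caddy_url is a hypothetical helper that rewrites a pre-Caddy URL.
# It switches http:// to https:// and replaces (or adds) the port with the
# Caddy communication port given as the second argument (default 8888).
# URLs that do not start with http:// are printed unchanged.
to_caddy_url() {
  caddy_port=${2:-8888}
  printf '%s\n' "$1" | sed -E "s|^http://([^/:]+)(:[0-9]+)?|https://\1:${caddy_port}|"
}

to_caddy_url http://localhost:8080/manta-flow-viewer
# prints: https://localhost:8888/manta-flow-viewer
```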
Using Caddy to hide port numbers from URLs
If the default port numbers are used, all URLs in Manta Data Lineage include the port number, for example https://localhost:8888/manta-flow-viewer. The port number must be part of the URL unless it is the default port of the protocol: 80 for plain HTTP or 443 for HTTPS.
To run Caddy on port 443, the following changes are required in the Manta configuration file ${installdir}/conf/manta.properties:
Default property value | Updated property value |
---|---|
manta.caddy.port=8888 | manta.caddy.port=443 |
manta.keycloak.public.url=https://localhost:8888/auth | manta.keycloak.public.url=https://localhost/auth |
manta.launcher.url=https://localhost:8888 | manta.launcher.url=https://localhost |
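The pattern in the table can be sketched as a small function that derives the three property lines from a host name and a port. `caddy_properties` is a hypothetical helper written for this illustration. It only prints the lines and does not edit manta.properties.

```shell
# Sketch only: caddy_properties is a hypothetical helper that prints the three
# property lines for a given host name and Caddy port. The port is left out of
# the URLs when it is 443, the HTTPS default.
caddy_properties() {
  host=$1
  port=$2
  if [ "$port" -eq 443 ]; then
    base="https://${host}"
  else
    base="https://${host}:${port}"
  fi
  echo "manta.caddy.port=${port}"
  echo "manta.keycloak.public.url=${base}/auth"
  echo "manta.launcher.url=${base}"
}

caddy_properties localhost 443
```

Calling it with `localhost 8888` prints the default values from the left column, and `localhost 443` prints the updated values from the right column.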
Working around the port number restrictions on Linux
As explained above, port 443 is restricted on Linux, and only a user with root privileges is allowed to use it. Given the architecture of Manta Data Lineage, this means that the Launcher has to be started under the root account, and therefore the whole application runs as root. From a security perspective, this approach is not preferred. Strictly speaking, only Caddy needs root privileges; the rest of the components can run under a more restricted user.
Option #1
Run Manta Launcher as root. This is the easiest solution, but the least secure.
Option #2
Edit the startup.sh script of each component except Caddy and Launcher, and add the following line:
if [ $UID -eq 0 ]; then exec runuser -u $USER "$0" -- "$@"; fi
The beginning of the script then looks like this:
#!/bin/bash
if [ $UID -eq 0 ]; then exec runuser -u $USER "$0" -- "$@"; fi
echo "Starting Manta Keycloak"
SCRIPT_DIR=$(dirname "$(readlink -f "$0")")
Replace $USER with the actual username under which the component is supposed to run. Then start Manta Launcher as root.
This way, only Caddy and Launcher run as root; the rest of the product runs under the selected user.
This option is safer than option #1 and more convenient than option #3, but Manta Launcher still runs as root.
Option #3
Edit the file $installdir/launcher/manta-launcher-dir/conf/applications.json and change the dependencies entry for Keycloak. Switch the "caddy" dependency from STARTUP to RUNTIME.
{
  "id": "keycloak",
  "name": "Keycloak",
  ...
  "dependencies": [
    {
      "application": "launcher",
      "type": "STARTUP"
    },
    {
      "application": "caddy",
      "type": "RUNTIME"   <---------- WAS STARTUP, CHANGE TO RUNTIME
    }
  ],
  ...
Start Launcher as a regular user. The Launcher will try to start Caddy, and this attempt will fail. This is expected, because a regular user cannot use port 443. Keep Manta Launcher running and start Caddy manually as root like this:
sudo $installdir/caddy/bin/startup.sh
This is the safest option, because only Caddy runs as root. It is also the least convenient, because the Caddy server has to be started manually, without the help of Manta Launcher.