Enabling high availability for Ceph Management gateway

Ensure continuous access to Ceph management tools, such as the Ceph Dashboard, Prometheus, Grafana, and Alertmanager, by enabling high availability (HA) for the Ceph Management gateway (mgmt-gateway).

Before you begin

Make sure that the following prerequisites are in place:
  • A subset of hosts have the mgmt label set on them. These are the hosts where the daemons are deployed.

    Use the ceph orch host ls command to see which hosts have the mgmt label set.

  • A virtual IP (VIP) address is available.
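The label prerequisite can be checked and, if needed, set from the command line. The following is a minimal sketch, not a definitive procedure: the host names host01 and host02 are hypothetical placeholders, and the run helper is an illustrative wrapper that prints each command and runs it only when the ceph CLI is installed.

```shell
#!/usr/bin/env bash
# Hypothetical host names; replace them with hosts from your own cluster.
MGMT_HOSTS="host01 host02"

# Illustrative helper: print the command, and run it only if ceph is installed.
run() {
  echo "+ $*"
  if command -v ceph >/dev/null 2>&1; then "$@"; fi
}

# Add the mgmt label to each host that should run a mgmt-gateway daemon.
for host in $MGMT_HOSTS; do
  run ceph orch host label add "$host" mgmt
done

# List the hosts and their labels to confirm the result.
run ceph orch host ls
```

On a machine without the ceph CLI, the script only prints the commands, which makes it safe to review before running it on an admin node.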

About this task

When multiple mgmt-gateway instances are deployed in an active/standby setup, the system can automatically switch to a standby instance if the active instance fails. Keepalived handles the failover. The oauth2-proxy service is stateless, and nginx acts as a load balancer that distributes traffic evenly. The following are the key components of HA for mgmt-gateway:
Keepalived
Provides failover support.
OAuth2-Proxy
Manages authentication.
Nginx
Runs as a load balancer and reverse-proxy for Ceph Management stack services.
Virtual IP (VIP)
Helps ensure smooth access by other external and internal services to the mgmt-gateway.

Enable HA to help ensure that tool access is always available, even during a component failure. Enable the high availability service either with the cephadm CLI commands or by using a service specification file.

After the mgmt-gateway service is deployed, direct access to services like Prometheus, Grafana, and Alertmanager is no longer allowed. These services are accessible only through the Ceph Dashboard, by using the links that are provided in Administration > Services.

Note: To enable high availability (HA) for the management gateway, the ingress service must also be configured, as demonstrated in the comprehensive example provided. Without ingress, the virtual IP (VIP) does not function correctly.

Enabling HA for Ceph Management gateway with the command-line interface

Procedure

Deploy the mgmt-gateway service in a high-availability configuration.
Note: To enable Single Sign-On (SSO), configure and deploy an OAuth2-proxy service. For more information about the OAuth2-proxy service, see Using the OAuth2 Proxy (oauth2-proxy) service.
To enable SSO with the oauth2-proxy, the --enable-auth=true parameter is mandatory.
ceph orch apply mgmt-gateway --virtual_ip VIP --enable-auth=true --placement="label:mgmt"
For example,
[ceph: root@host01 /]# ceph orch apply mgmt-gateway --virtual_ip 192.168.100.220 --enable-auth=true --placement="label:mgmt"

What to do next

Verify that the service is deployed as expected.
  1. Run the ceph orch ls command to get the service status.
  2. Run the ceph orch ps command to get the status of the corresponding daemons.
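The two verification steps can be narrowed to the mgmt-gateway service. This is a sketch that assumes the --service_type and --daemon_type filters of the orchestrator CLI are available in your release; the run helper is an illustrative wrapper that prints each command and runs it only when the ceph CLI is installed.

```shell
#!/usr/bin/env bash
SERVICE="mgmt-gateway"

# Illustrative helper: print the command, and run it only if ceph is installed.
run() {
  echo "+ $*"
  if command -v ceph >/dev/null 2>&1; then "$@"; fi
}

# Service-level status for the mgmt-gateway service.
run ceph orch ls --service_type "$SERVICE"

# Daemon-level status of the mgmt-gateway daemons on each host.
run ceph orch ps --daemon_type "$SERVICE"
```

If the filters are not accepted, the unfiltered ceph orch ls and ceph orch ps commands from the steps above return the same information for all services.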

Enabling HA for Ceph Management gateway with a service specification file

Before you begin

Before enabling the Ceph Management gateway, be sure that you have the following:
  • A running IBM Storage Ceph cluster.
  • The port that the gateway service uses, available on each Ceph node that the mgmt-gateway service runs on.
  • (Optional) SSL protocols and ciphers for secure communication.
  • (Optional) SSL certificates and private key data for secure connections.

For more information about SSL protocols, ciphers, certificates, and certificate keys, see Deploying web servers and reverse proxies in the Red Hat Enterprise Linux documentation.

Procedure

  1. Create a YAML file for the mgmt-gateway service.
    For example,
    [root@host01 ~]# touch mgmt-gateway.yaml
  2. Edit the YAML file to include the following details.
    Important: The following fields are optional: ssl_protocols, ssl_ciphers, and ssl_cert. When omitted, the mgmt-gateway service uses a safe and secure configuration. Change these fields with care, as they can compromise the security of your system.
    service_type: mgmt-gateway
    placement:
      hosts:
        - ceph-node-1
    spec:
      enable_auth: true
      virtual_ip: 192.168.100.220
      ssl_protocols:
        - TLSv1.3
      ssl_ciphers:
        - AES128-SHA
        - AES256-SHA
      ssl_cert: |
        -----BEGIN CERTIFICATE-----
        < YOUR CERT DATA HERE >
        -----END CERTIFICATE-----
      ssl_key: |
        -----BEGIN PRIVATE KEY-----
        < YOUR PRIVATE KEY DATA HERE >
        -----END PRIVATE KEY-----
    Important: If you set SSL-related fields such as ssl_protocols, ssl_ciphers, and ssl_cert, ensure that the values are in line with current security best practices. Incorrect values can weaken your system's security.
    Table 1 lists fields that are specific to the mgmt-gateway service section of the spec file (ceph.deployment.service_spec.MgmtGatewaySpec).
    Table 1. mgmt-gateway specific fields in the spec file
    Field                      Description
    disable_https              Flag to disable HTTPS. If True, the server uses unsecure HTTP.
    enable_auth                Flag to enable SSO authentication. Requires oauth2-proxy to be active.
    networks                   A list of network identities that instruct the daemons to bind only on the networks in that list. If the cluster is distributed across multiple networks, you can add multiple networks.
    placement                  Tells the orchestrator where to deploy daemons and how many to deploy. Placement specifications can be passed either as command-line arguments or in a YAML file. For more information, see Managing services.
    virtual_ip                 The high availability virtual IP address on which the server resides.
    server_tokens              Flag to control server tokens in responses: on, off, build, string.
    ssl_cert                   A multi-line string that contains the SSL certificate.
    ssl_key                    A multi-line string that contains the SSL key.
    ssl_ciphers                A list of supported secure SSL ciphers. Changing this list can reduce system security.
    ssl_prefer_server_ciphers  Prefer server ciphers over client ciphers: on, off.
    ssl_protocols              A list of supported SSL protocols (as supported by nginx).
    ssl_session_cache          Type and size of the SSL/TLS session cache: off, none, [builtin[:size]], [shared:name:size].
    ssl_session_tickets        Flag to control session tickets: on, off.
    ssl_session_timeout        The duration for SSL session timeout. Syntax: time (for example, 5m).
    ssl_stapling               Flag to enable or disable SSL stapling: on, off.
    ssl_stapling_verify        Flag to control verification of SSL stapling: on, off.
  3. Apply the specification file.
    [root@host01 ~]# ceph orch apply -i mgmt-gateway.yaml
    Applying the specification file applies the HA configuration, including the virtual IP and any optional SSL settings.
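The steps above can also be sketched as a single script that writes a minimal specification and previews it before it is applied. This sketch rests on several assumptions: it places daemons by the mgmt label, mirroring the --placement="label:mgmt" option of the CLI procedure, it omits all SSL fields so that the defaults apply, and it reuses the example VIP from this topic. Because enable_auth is set to true, it also assumes that the oauth2-proxy service is deployed.

```shell
#!/usr/bin/env bash
# File name and VIP are examples; override SPEC_FILE and edit VIP for your cluster.
SPEC_FILE="${SPEC_FILE:-mgmt-gateway.yaml}"
VIP="192.168.100.220"

# Write a minimal spec: label-based placement, no SSL overrides.
cat > "$SPEC_FILE" <<EOF
service_type: mgmt-gateway
placement:
  label: mgmt
spec:
  enable_auth: true
  virtual_ip: ${VIP}
EOF

if command -v ceph >/dev/null 2>&1; then
  # Preview what the orchestrator would do, without changing the cluster.
  ceph orch apply -i "$SPEC_FILE" --dry-run
else
  # No ceph CLI on this machine: show the generated spec for review instead.
  cat "$SPEC_FILE"
fi
```

If the preview looks right, run ceph orch apply -i mgmt-gateway.yaml as in step 3 to apply the specification.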

What to do next

Verify that the service is deployed as expected.
  1. Run the ceph orch ls command to get the service status.
  2. Run the ceph orch ps command to get the status of the corresponding daemons.
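In addition to the status checks, the applied specification can be exported and compared with the file that was applied. The following sketch assumes that the exported YAML contains the virtual_ip line in the same unquoted form as the spec file; if the export format differs in your release, inspect the exported output manually instead.

```shell
#!/usr/bin/env bash
# Example VIP from this topic; replace it with the address in your spec file.
EXPECTED_VIP="192.168.100.220"

if command -v ceph >/dev/null 2>&1; then
  # Export the stored mgmt-gateway spec and look for the expected VIP.
  if ceph orch ls --service_type mgmt-gateway --export \
      | grep -q "virtual_ip: ${EXPECTED_VIP}"; then
    result="match"
  else
    result="mismatch"
  fi
else
  # Without the ceph CLI there is nothing to compare against.
  result="skipped"
fi

echo "virtual_ip check: ${result}"
```

A mismatch result does not necessarily indicate a failed deployment; it can also mean that the export formats the value differently.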