Terraform Enterprise Active/Active node fails to start with error "local node not active but active cluster node not found"

Problem

In a Terraform Enterprise active-active configuration, a secondary node fails to start. The logs show the error "local node not active but active cluster node not found".

Prerequisites

  • Terraform Enterprise configured for active-active operational mode.
  • A Replicated or Flexible Deployment Options (FDO) installation using Docker or Podman.

Cause

Time drift between the nodes in the cluster can cause this startup failure. If the system clocks are not synchronized, the internal Vault component cannot create a new token and fails to discover other active nodes in the cluster.

Procedure

Follow these steps to diagnose and resolve the issue.

Step 1: Diagnose the Issue

  1. Check the Terraform Enterprise container logs for the specific Vault token creation error. The command varies by version.

    For versions v202308-1 and older:

    $ docker logs tfe-vault

    For versions v202309-1 and newer (note: your container name may differ):

    $ docker exec -it terraform-enterprise-tfe-1 more /var/log/terraform-enterprise/vault.log

    Look for the following error message in the output:

    + Retrying to create vault token
    Error creating token: Error making API request.
    
    URL: POST http://tfe-vault:8200/v1/auth/token/create
    Code: 500. Errors:
    
    * local node not active but active cluster node not found

  2. Check the status of the internal Vault instance to confirm it is in standby mode.

    For versions v202308-1 and older:

    $ docker exec -it tfe-vault vault status

    For versions v202309-1 and newer (note: your container name may differ):

    $ docker exec -it terraform-enterprise bash -c 'VAULT_ADDR=http://127.0.0.1:8200 vault status'

    The output should indicate that the node is in standby mode and has not found an active node.

    ## ...
    HA Enabled true
    HA Cluster n/a
    HA Mode standby
    Active Node Address <none>

  3. Check the system time on all Terraform Enterprise nodes to identify any drift.

    $ date
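Comparing `date` output by eye across several terminals is error-prone, so the check can be scripted. The following is a minimal sketch, not an official tool: it assumes passwordless SSH from the machine running it to every node, the node names passed on the command line are placeholders for your own hosts, and the 5-second default threshold is an arbitrary illustrative value rather than a documented tolerance. SSH round-trip time is included in the measured difference, so treat small values as noise.

```shell
#!/bin/sh
# Sketch: compare each node's clock with the local clock.
# Assumptions (not from the article): passwordless SSH to each node,
# and a hypothetical threshold supplied via MAX_DRIFT_SECONDS.

# Absolute difference between two epoch timestamps, in seconds.
abs_drift() {
  d=$(( $1 - $2 ))
  echo "${d#-}"   # strip a leading minus sign, if any
}

# Print the drift for each node; return non-zero if any node
# is unreachable or exceeds the limit given as the first argument.
check_nodes() {
  limit=$1; shift
  rc=0
  for node in "$@"; do
    local_now=$(date +%s)
    if ! remote_now=$(ssh -o ConnectTimeout=5 "$node" date +%s); then
      echo "$node: unreachable"
      rc=1
      continue
    fi
    drift=$(abs_drift "$local_now" "$remote_now")
    echo "$node: drift ${drift}s"
    [ "$drift" -le "$limit" ] || rc=1
  done
  return "$rc"
}

# Runs only when node names are supplied as arguments.
if [ "$#" -gt 0 ]; then
  check_nodes "${MAX_DRIFT_SECONDS:-5}" "$@"
fi
```

Saved under a name of your choice (for example `check_drift.sh`), it could be invoked as `sh check_drift.sh node-a node-b`, where `node-a` and `node-b` stand in for your Terraform Enterprise hosts.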

Step 2: Resolve the Issue

  1. Correct the time drift on the affected nodes. The specific commands depend on your operating system, but most systems use a Network Time Protocol (NTP) service to synchronize clocks.
  2. Verify that the Vault cluster members can communicate over port 8201. Run these commands from different nodes.

    On an unhealthy node, start a listener on port 8201.

    $ nc -l $PRIVATE_IP_OF_UNHEALTHY_NODE 8201

    From a healthy node, attempt to connect to the unhealthy node's listener.

    $ nc -vz $PRIVATE_IP_OF_UNHEALTHY_NODE 8201

    A successful connection will produce the following output.

    Connection to $PRIVATE_IP_OF_UNHEALTHY_NODE 8201 port [tcp/*] succeeded!

  3. Perform a full restart of the Terraform Enterprise application. Stop the application on all nodes, handling the healthy node first.

    For Replicated deployments:

    ## On the healthy node
    $ tfe-admin node-drain
    $ replicatedctl app stop
    
    ## On the unhealthy node(s)
    $ replicatedctl app stop -f

    For Flexible Deployment Options with Docker:

    $ docker compose -f /path/to/docker-compose.yaml down

    For Flexible Deployment Options with Podman:

    $ podman kube down /path/to/podman_kube.yaml

  4. Start the Terraform Enterprise application, beginning with the healthy node. After starting, confirm the time is synchronized across all nodes using the date command.

    For Replicated deployments:

    $ replicatedctl app start

    For Flexible Deployment Options with Docker:

    $ docker compose -f /path/to/docker-compose.yaml up --detach

    For Flexible Deployment Options with Podman:

    $ podman kube play /path/to/podman_kube.yaml

  5. Repeat the start command on each remaining node, one node at a time.
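The procedure leaves the time-correction commands to the operating system. As one illustration, the sketch below checks whether a systemd-based host reports its clock as NTP-synchronized. It assumes `timedatectl` is available, which is not the case on every distribution, and the `--run` switch is a convention of this sketch only. Hosts that use a different time service need the equivalent check for that service.

```shell
#!/bin/sh
# Sketch for systemd-based hosts only (assumption: timedatectl is installed).

# Interpret the value of timedatectl's NTPSynchronized property.
is_synced() {
  [ "$1" = "yes" ]
}

# Query the local host and print a short verdict.
report_sync() {
  state=$(timedatectl show --property=NTPSynchronized --value 2>/dev/null)
  if is_synced "$state"; then
    echo "clock synchronized"
  else
    echo "clock NOT synchronized; consider: sudo timedatectl set-ntp true"
    return 1
  fi
}

# Runs the check only when invoked with --run.
if [ "${1:-}" = "--run" ]; then
  report_sync
fi
```

Running it on every node before the restart in step 3 helps confirm that the drift has actually been corrected. On hosts that run chrony, `chronyc tracking` reports the current offset from the reference clock.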

Outcome

After resolving the time drift and restarting the application, all nodes in the Terraform Enterprise cluster should start successfully.
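To confirm the outcome, the `vault status` check from Step 1 can be repeated and its output tested automatically. The sketch below is an assumption-laden helper, not part of the product: it only looks for the `Active Node Address <none>` line shown earlier, and the container name passed as an argument is whatever name your deployment uses.

```shell
#!/bin/sh
# Sketch: read `vault status` text on stdin and succeed unless the node
# reports that it knows of no active node.
has_active_node() {
  ! grep -Eq '^Active Node Address[[:space:]]+<none>'
}

# Runs only when a container name is supplied as the first argument.
# The docker exec invocation mirrors the one used in Step 1 for
# v202309-1 and newer, without -it because the output is piped.
if [ "$#" -gt 0 ]; then
  if docker exec "$1" bash -c 'VAULT_ADDR=http://127.0.0.1:8200 vault status' | has_active_node; then
    echo "active node found"
  else
    echo "no active node found"
    exit 1
  fi
fi
```

If the helper still reports that no active node was found after the restart, recheck the clocks and the connectivity on port 8201.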

Additional Information

For more details on Terraform Enterprise architecture, see the official documentation on active-active installations.

Document Location

Worldwide

Historical Number

27888726725011

Document Information

Modified date:
16 March 2026

UID

ibm17265198