Safe Cluster Restart Automation Guidelines
This document provides guidelines for automating the cluster restart procedure. This is useful when a simple restart of the cluster is needed.
Note that these guidelines apply only to restarting a healthy cluster, that is, one in which all servers are up and each stripe has exactly one active server.
Cluster restart automation
- STEP 1. Shut down the clients.
- STEP 2. Restart the passive servers in safe mode.
- STEP 3. Restart the active servers in safe mode.
- STEP 4. Make the previous active servers exit from safe mode.
- STEP 5. Make the previous passive servers exit from safe mode.
The details for these steps are as follows:
STEP 1. Shut down the clients
The Terracotta client will shut down when you shut down your application.
STEP 2. Restart the passive servers in safe mode
Use the stop-tc-server script with the options --stop-if-passive and --restart-in-safe-mode to ensure that a server restarts only if it is in passive mode.
Use a procedure indicated by the following pseudocode to restart all passive servers in safe mode:
for each <server> in <running servers> {
    stop-tc-server --stop-if-passive --restart-in-safe-mode <server> <args>
}
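As a minimal sketch of the loop above, the following Python function builds one stop-tc-server command line per server. The function name build_stop_commands and its parameters are hypothetical. The positional server value and the extra arguments only stand in for the <server> and <args> placeholders of the pseudocode, so adapt them to however your installation identifies a server.

```python
from typing import List, Sequence


def build_stop_commands(
    servers: Sequence[str],
    role: str,
    extra_args: Sequence[str] = (),
) -> List[List[str]]:
    """Return one stop-tc-server argument list per server.

    role must be "passive" or "active" and selects --stop-if-passive or
    --stop-if-active. The server value and extra_args are placeholders for
    <server> and <args> in the pseudocode (an assumption, not a documented
    command-line form).
    """
    if role not in ("passive", "active"):
        raise ValueError(f"role must be 'passive' or 'active', got {role!r}")
    return [
        ["stop-tc-server", f"--stop-if-{role}", "--restart-in-safe-mode",
         server, *extra_args]
        for server in servers
    ]
```

Each returned list could be passed to subprocess.run(command, check=True). Because the conditional option is part of every command, the same list of running servers can be used for both the passive pass and the later active pass.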
Wait for the passive servers to reach SAFE_MODE_STATE. The server state can be determined by using the server-stat script. See the section Server Status (server-stat) in the Administration Guide for related information.
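The wait can be sketched as a polling loop with a timeout. In the Python example below, wait_for_state and the get_state callable are hypothetical names: get_state is assumed to run server-stat for one server and return its current state as a string, and the timeout and interval defaults are arbitrary values chosen for illustration.

```python
import time
from typing import Callable, Iterable


def wait_for_state(
    servers: Iterable[str],
    get_state: Callable[[str], str],
    target: str = "SAFE_MODE_STATE",
    timeout: float = 300.0,
    interval: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> None:
    """Block until every server reports target, or raise TimeoutError."""
    pending = set(servers)
    deadline = clock() + timeout
    while True:
        # Keep only the servers that have not yet reached the target state.
        pending = {s for s in pending if get_state(s) != target}
        if not pending:
            return
        if clock() >= deadline:
            raise TimeoutError(f"servers not in {target}: {sorted(pending)}")
        sleep(interval)
```

Passing sleep and clock as parameters keeps the loop testable without real delays. The same function can be reused later to wait for all servers, not only the passive ones.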
STEP 3. Restart the active servers in safe mode
Use the stop-tc-server script with the options --stop-if-active and --restart-in-safe-mode. This restarts a server only if it is in active mode.
Use a procedure indicated by the following pseudocode to restart all active servers in safe mode:
for each <server> in <running servers> {
    stop-tc-server --stop-if-active --restart-in-safe-mode <server> <args>
}
Wait for all servers to reach SAFE_MODE_STATE. The server state can be determined by using the server-stat script.
STEP 4. Make previous active servers exit from safe mode
Use the exit-safe-mode script to make a server exit from safe mode. See the section Exit Safe Mode (exit-safe-mode) in the Administration Guide for related information.
All previous active servers can be determined using the server-stat script. The server-stat script provides the state of a server prior to shutdown in the initialState field.
Use a procedure indicated by the following pseudocode to make previous active servers exit from safe mode:
<previous active servers> = []
for each <server> in <servers> {
    server-stat -s <server>:<management-port>
    add <server> to <previous active servers> if the initialState is Active state
}
for each <server> in <previous active servers> {
    exit-safe-mode -s <server>:<management-port>
}
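The classification part of this pseudocode can be sketched in Python as follows. The output format of server-stat and the exact initialState value that denotes an active server are not given in this section, so the sketch assumes the initialState values have already been extracted into a mapping, and the caller supplies the active value. The name split_by_initial_state is hypothetical.

```python
from typing import Dict, List, Tuple


def split_by_initial_state(
    initial_states: Dict[str, str],
    active_value: str,
) -> Tuple[List[str], List[str]]:
    """Split servers into (previous_active, previous_passive).

    initial_states maps each server to the initialState value reported by
    server-stat. active_value is the string that marks an active server;
    its exact spelling is an assumption the caller must supply.
    """
    wanted = active_value.strip().lower()
    previous_active: List[str] = []
    previous_passive: List[str] = []
    for server, state in initial_states.items():
        # Compare case-insensitively and ignore surrounding whitespace.
        if state.strip().lower() == wanted:
            previous_active.append(server)
        else:
            previous_passive.append(server)
    return previous_active, previous_passive
```

Returning both lists from one pass means the previous passive servers are already known when they are needed, without querying server-stat a second time.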
STEP 5. Make previous passive servers exit from safe mode
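The previous passive servers are the remaining servers, that is, those not identified as previous active servers in STEP 4. Use a procedure indicated by the following pseudocode to make them exit from safe mode: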
for each <server> in <previous passive servers> {
    exit-safe-mode -s <server>:<management-port>
}
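The ordering of STEP 4 and STEP 5 can be sketched in Python by building the exit-safe-mode command lines with the previous active servers first. The function name build_exit_commands is hypothetical, and the use of a single management port for every server is a simplifying assumption; only the -s <server>:<management-port> form comes from the pseudocode.

```python
from typing import List, Sequence


def build_exit_commands(
    previous_active: Sequence[str],
    previous_passive: Sequence[str],
    management_port: int,
) -> List[List[str]]:
    """Return exit-safe-mode argument lists, previous active servers first."""
    # Previous actives leave safe mode before previous passives.
    ordered = list(previous_active) + list(previous_passive)
    return [
        ["exit-safe-mode", "-s", f"{server}:{management_port}"]
        for server in ordered
    ]
```

Running the returned commands in order, for example with subprocess.run(command, check=True), preserves the sequence described in the steps.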