Considerations for performing your own rolling update of a Native HA queue manager
Any update to the IBM® MQ version or Pod specification for a Native HA queue manager requires you to perform a rolling update of the queue manager instances. The IBM MQ Operator handles this for you automatically, but if you are building your own deployment code, then there are some important considerations.
In Kubernetes, StatefulSet resources are used to manage ordered start-up and rolling updates. Part of the start-up procedure is to start each Pod individually, wait for it to become ready, and then move on to the next Pod. This does not work for Native HA, because all Pods need to be started together so that they can run a leader election. Therefore, the .spec.podManagementPolicy field on the StatefulSet needs to be set to Parallel.
However, this also means that all Pods are updated in parallel, which would take down all instances at once. For this reason, the StatefulSet should also use the OnDelete update strategy, so that Pods are restarted only when they are deleted by your own update code.
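For example, the relevant parts of a StatefulSet manifest might look like the following sketch. The names and image reference are illustrative, not the IBM MQ Operator's actual configuration:

```yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: nativeha-qm                # illustrative name
spec:
  replicas: 3                      # Native HA uses three instances
  podManagementPolicy: Parallel    # start all Pods together so leader election can run
  updateStrategy:
    type: OnDelete                 # no built-in rolling updates; Pods restart only when deleted
  selector:
    matchLabels:
      app: nativeha-qm
  template:
    metadata:
      labels:
        app: nativeha-qm
    spec:
      containers:
        - name: qmgr
          image: registry.example.com/mq:9.4.0   # illustrative image reference
```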
Disabling the built-in StatefulSet rolling update behavior drives a need for custom rolling update code, which should consider the following:
- General rolling update procedure
- Minimizing down time by updating Pods in the best order
- Handling changes in cluster state
- Handling errors
- Handling timing problems
General rolling update procedure
The rolling update code should wait for each instance to show a status of REPLICA from dspmq. This means that the instance has performed some level of start-up (for example, the container is started and MQ processes are running), but it has not necessarily managed to talk to the other instances yet. For example: Pod A is restarted, and as soon as it is in REPLICA state, Pod B is restarted. When Pod B starts with the new configuration, it should be able to talk to Pod A and form quorum, and either A or B then becomes the new active instance.
As part of this, it is useful to have a delay after each Pod has
reached the REPLICA state, to allow for it to connect to its peers and
establish quorum.
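As an illustration, the following Go sketch polls dspmq inside a Pod by shelling out to kubectl exec, and pauses after the REPLICA state is reached. The exact parsing of the dspmq output and the length of the delay are assumptions to adapt to your environment:

```go
// Package rollingupdate contains sketches of custom rolling update logic for
// a Native HA queue manager. Illustrative only; this is not the IBM MQ
// Operator's implementation.
package rollingupdate

import (
	"fmt"
	"os/exec"
	"strings"
	"time"
)

// waitForReplica polls dspmq in the given Pod until the instance reports
// REPLICA status (an instance that has already won the election reports a
// running status instead, which is also acceptable), then pauses so the
// instance can connect to its peers and establish quorum.
func waitForReplica(namespace, pod, qmName string, timeout time.Duration) error {
	deadline := time.Now().Add(timeout)
	for time.Now().Before(deadline) {
		out, err := exec.Command("kubectl", "exec", "-n", namespace, pod,
			"--", "dspmq", "-n", "-m", qmName).CombinedOutput()
		status := strings.ToUpper(string(out))
		if err == nil && (strings.Contains(status, "REPLICA") ||
			strings.Contains(status, "RUNNING")) {
			// Delay before the next Pod is restarted, to give this instance
			// time to re-establish quorum with its peers.
			time.Sleep(30 * time.Second)
			return nil
		}
		time.Sleep(5 * time.Second)
	}
	return fmt.Errorf("timed out waiting for %s to reach REPLICA state", pod)
}
```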
Minimizing down time by updating Pods in the best order
The rolling update code should delete Pods one at a time, starting with Pods which are in a known error state, followed by any Pods that have not successfully started. The active queue manager Pod should generally be updated last.
It is also important to pause the deletion of Pods if the last update resulted in a Pod going into a known error state. This prevents a broken update from being rolled out across all Pods. For example, this can happen if the Pod is updated to use a new container image which is not accessible (for example, because the image name contains a typo).
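A minimal sketch of this ordering and pause logic, using client-go types, might look like the following. The isActive parameter is a hypothetical helper (for example, one that runs dspmq in the Pod and checks for the active role):

```go
package rollingupdate

import (
	"sort"

	corev1 "k8s.io/api/core/v1"
)

// updatePriority returns a lower value for Pods that should be deleted sooner.
func updatePriority(pod corev1.Pod, isActive func(corev1.Pod) bool) int {
	switch {
	case pod.Status.Phase == corev1.PodFailed:
		return 0 // known error state: restart first
	case pod.Status.Phase != corev1.PodRunning:
		return 1 // not successfully started yet
	case isActive(pod):
		return 3 // active instance: update last to avoid needless failovers
	default:
		return 2 // healthy replica instances in between
	}
}

// orderForUpdate sorts Pods into the order in which they should be deleted.
func orderForUpdate(pods []corev1.Pod, isActive func(corev1.Pod) bool) []corev1.Pod {
	sort.SliceStable(pods, func(i, j int) bool {
		return updatePriority(pods[i], isActive) < updatePriority(pods[j], isActive)
	})
	return pods
}

// shouldPause reports whether the roll-out should stop because a Pod that has
// already been updated is in a known error state, such as a container image
// that cannot be pulled.
func shouldPause(pods []corev1.Pod, updateRevision string) bool {
	for _, pod := range pods {
		if pod.Labels["controller-revision-hash"] != updateRevision {
			continue // not yet updated; ignore
		}
		for _, cs := range pod.Status.ContainerStatuses {
			if w := cs.State.Waiting; w != nil &&
				(w.Reason == "ImagePullBackOff" || w.Reason == "CrashLoopBackOff") {
				return true
			}
		}
	}
	return false
}
```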
Handling changes in cluster state
The rolling update code needs to react appropriately to real-time changes in cluster state. For example, one of the queue manager's Pods might be evicted because of a Node reboot or Node pressure. It is possible that an evicted Pod is not immediately rescheduled if the cluster is busy, in which case the rolling update code needs to wait until that Pod is scheduled and running again before it restarts any other Pods.
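For example, a guard along these lines (assuming the three Pods share a label selector) could block the update until enough instances are running that quorum survives the next deletion:

```go
package rollingupdate

import (
	"context"
	"fmt"
	"time"

	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
)

// waitForStableCluster blocks until all three Native HA Pods exist and at
// least two are running, so that deleting the next Pod cannot remove quorum.
func waitForStableCluster(ctx context.Context, c kubernetes.Interface,
	namespace, selector string) error {
	for {
		pods, err := c.CoreV1().Pods(namespace).List(ctx,
			metav1.ListOptions{LabelSelector: selector})
		if err == nil {
			running := 0
			for _, pod := range pods.Items {
				if pod.Status.Phase == corev1.PodRunning {
					running++
				}
			}
			if len(pods.Items) == 3 && running >= 2 {
				return nil
			}
		}
		select {
		case <-ctx.Done():
			return fmt.Errorf("cluster did not stabilize: %w", ctx.Err())
		case <-time.After(10 * time.Second):
		}
	}
}
```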
Handling errors
The rolling update code needs to be robust to failures when calling the Kubernetes API, and to other unexpected cluster behaviour.
In addition, the rolling update code itself needs to be tolerant of being restarted: a rolling update can be long-running, and the code might be restarted part way through. Progress should therefore be derived from the observed cluster state, rather than held in memory, so that a restarted updater can resume safely.
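As a sketch, transient API failures might be retried with exponential backoff; which errors count as transient is an assumption to tune for your cluster:

```go
package rollingupdate

import (
	"time"

	apierrors "k8s.io/apimachinery/pkg/api/errors"
)

// withRetry retries fn on transient Kubernetes API errors (server timeouts,
// throttling, unavailability), doubling the delay between attempts.
// Non-transient errors are surfaced immediately.
func withRetry(maxAttempts int, fn func() error) error {
	delay := time.Second
	var err error
	for attempt := 0; attempt < maxAttempts; attempt++ {
		if err = fn(); err == nil {
			return nil
		}
		if !apierrors.IsServerTimeout(err) && !apierrors.IsTooManyRequests(err) &&
			!apierrors.IsServiceUnavailable(err) {
			return err
		}
		time.Sleep(delay)
		delay *= 2
	}
	return err
}
```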
Handling timing problems
The rolling update code needs to check the update revision of each Pod, so that it can be sure that the Pod has actually restarted. This avoids timing problems where a Pod may indicate that it is "Started" when, in fact, the old Pod has not yet terminated.
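In Kubernetes, the StatefulSet controller stamps each Pod with a controller-revision-hash label, which can be compared with the StatefulSet's status.updateRevision to confirm that the Pod was recreated at the new revision. A sketch:

```go
package rollingupdate

import (
	"context"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
)

// podAtLatestRevision reports whether the named Pod exists and was created
// from the StatefulSet's current update revision. A Pod still carrying the
// old revision label has not yet been replaced, even if it reports as
// started.
func podAtLatestRevision(ctx context.Context, c kubernetes.Interface,
	namespace, stsName, podName string) (bool, error) {
	sts, err := c.AppsV1().StatefulSets(namespace).Get(ctx, stsName, metav1.GetOptions{})
	if err != nil {
		return false, err
	}
	pod, err := c.CoreV1().Pods(namespace).Get(ctx, podName, metav1.GetOptions{})
	if err != nil {
		return false, err
	}
	return pod.Labels["controller-revision-hash"] == sts.Status.UpdateRevision, nil
}
```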