Resolving a problem with a concurrent hardware upgrade

Follow these steps to resolve most problems that might occur when you complete a non-disruptive canister upgrade.

About this task

The following steps suggest solutions to common problems.

Procedure

A hardware fault occurs in one or more replacement canisters. The faulty canister boots into Service state and does not join the cluster.

  1. Remove the faulty canister and reinstall the original canister.
    When you reboot, the system rolls back to the previous known good state.
  2. Alternatively, if you have a replacement canister available, remove the faulty canister and replace it with the new one.
  3. Return the faulty canister for replacement or repair.

A software assert occurs during the upgrade.

  1. Remove the new canister or canisters and reinstall the originals.
    When you reboot, the system rolls back to the previous known good state.

A complete system failure occurs while the system is in a mixed state (containing both old and new canisters) and a T3 recovery is required.

  1. Remove the new canister or canisters and reinstall the originals. Do not attempt a T3 recovery while the system is in a mixed state.
  2. Proceed with the T3 recovery process.

A replacement canister boots with error code 503, "Incorrect enclosure type". This error occurs when the replacement canister was shipped with code earlier than version 7.8 and the partner node failed to upgrade the software on the new canister.

  1. Manually trigger a node rescue.
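
The symptom-to-action mapping above can be summarized as a small lookup table, for example in a runbook script. The following Python sketch is illustrative only: the names Symptom, RECOVERY_STEPS, recovery_steps, and t3_recovery_allowed are hypothetical and are not part of any product CLI or API. It records the steps from this procedure and the rule that a T3 recovery must not be attempted in a mixed state.

```python
from enum import Enum


class Symptom(Enum):
    """Hypothetical labels for the four problems described in this procedure."""
    HARDWARE_FAULT = "hardware fault in a replacement canister"
    SOFTWARE_ASSERT = "software assert during the upgrade"
    SYSTEM_FAILURE_MIXED = "complete system failure in a mixed state"
    ERROR_503 = "error code 503, incorrect enclosure type"


# Steps copied from the procedure text; the table itself is an assumption
# about how an operator might choose to record them.
RECOVERY_STEPS = {
    Symptom.HARDWARE_FAULT: [
        "Remove the faulty canister and reinstall the original canister, "
        "or install another replacement canister if one is available.",
        "Return the faulty canister for replacement or repair.",
    ],
    Symptom.SOFTWARE_ASSERT: [
        "Remove the new canister or canisters and reinstall the originals.",
    ],
    Symptom.SYSTEM_FAILURE_MIXED: [
        "Remove the new canister or canisters and reinstall the originals.",
        "Proceed with the T3 recovery process.",
    ],
    Symptom.ERROR_503: [
        "Manually trigger a node rescue.",
    ],
}


def recovery_steps(symptom):
    """Return a copy of the ordered steps for the given symptom."""
    return list(RECOVERY_STEPS[symptom])


def t3_recovery_allowed(canister_generations):
    """Return True only if every canister is of the same generation.

    canister_generations is a hypothetical list such as ["old", "old"];
    a mix of old and new canisters means the system is in a mixed state.
    """
    return len(set(canister_generations)) == 1
```

For example, recovery_steps(Symptom.ERROR_503) returns the single node rescue step, and t3_recovery_allowed(["old", "new"]) returns False because the system is still in a mixed state.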

What to do next

If the replacement canister does not boot, see Resolving a problem with failure to boot.