System maintenance
As you plan for system maintenance updates, review information about the supported paths, the availability of virtual machine instances during the updates, and the time required for the maintenance procedure to complete.
To review the maintenance paths that are supported for your current product version, see System updates.
Duration of the system update process
The non-leader Platform System Manager is updated first. The leader role is then switched to the newly updated Platform System Manager, and the remaining update proceeds with the automated process. During the switchover, the console is unavailable for approximately 60 minutes. Existing instances remain available for use, but new instances cannot be deployed.
When compute node updates are included in the fix pack, an outage can occur during the compute node updates if an instance evacuation is required and the cloud group does not have enough physical capacity for the evacuation to occur. In that case, the cloud group is not highly available, and the updates cannot complete without affecting the instances. Some instances are stopped and cannot be restarted until the updates are complete and the cloud group resources are fully restored.
Instance availability
- Redundant hardware, such as networking, storage, and power supplies.
- No single points of failure for cloud groups with active high availability containing two or more compute nodes.
- Virtual machine instances remain available during system maintenance updates or hardware failures, leveraging reserved capacity and mobility actions within the system.
- Additional capacity can be added and used with no service interruption.
For a system to be highly available, all of its components must be highly available. Currently, the cloud group is the only component whose high availability mode you can control. When high availability is active for a cloud group, physical capacity is reserved so that, even during peak utilization, the overall functionality and state of the system remain healthy while virtual machine instances are evacuated, during both system failures and updates. The amount of reserved physical capacity is determined by the cloud group type: dedicated or average.
Type | Physical CPU per vCPU | Physical memory per 1 MB of virtual memory |
---|---|---|
Dedicated | 0.9 pCPU | 1 MB |
Average | 0.1125 pCPU | 1 MB |
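As an illustration of the ratios in the table, the following Python sketch converts the virtual resources of an instance into physical resources. It reads the table as the physical resources that back each vCPU and each MB of virtual memory. The function name `physical_demand` and the sample instance size are hypothetical and are not part of the product.

```python
# Physical CPU backing each vCPU, taken from the table above.
PCPU_PER_VCPU = {"dedicated": 0.9, "average": 0.1125}
# Physical MB backing each MB of virtual memory (the same for both types).
PHYSICAL_MB_PER_VIRTUAL_MB = 1


def physical_demand(group_type, vcpus, virtual_mb):
    """Return (physical CPUs, physical MB) for an instance.

    Hypothetical helper: it only applies the ratios from the table and
    does not query the system.
    """
    ratio = PCPU_PER_VCPU[group_type.lower()]
    return vcpus * ratio, virtual_mb * PHYSICAL_MB_PER_VIRTUAL_MB


if __name__ == "__main__":
    # Sample instance size chosen for illustration only.
    for group_type in PCPU_PER_VCPU:
        pcpu, mb = physical_demand(group_type, vcpus=8, virtual_mb=16384)
        print(f"{group_type}: {pcpu:g} pCPU, {mb} MB")
```

Under these ratios, the same 8 vCPU instance maps to far less physical CPU in an average cloud group than in a dedicated one, while the memory mapping is identical.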
Optionally, a cloud group can be set to reserve resources for high availability. This option reserves resources (CPU and memory) within the cloud group equivalent to one compute node. The reserved capacity in a cloud group containing N compute nodes is 1 / N of the resources (CPU and memory) on each compute node.
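The 1 / N rule can be worked through with a short Python sketch. It assumes that every compute node in the cloud group has the same size. The function name `reserved_per_node` and the node sizes in the example are hypothetical.

```python
from fractions import Fraction


def reserved_per_node(node_count, cores_per_node, memory_gb_per_node):
    """Return (cores, memory in GB) reserved on each compute node.

    Each of the N nodes sets aside 1/N of its CPU and memory, so the
    group as a whole reserves the equivalent of one compute node.
    Assumes all nodes are the same size.
    """
    if node_count < 1:
        raise ValueError("a cloud group needs at least one compute node")
    share = Fraction(1, node_count)
    return cores_per_node * share, memory_gb_per_node * share


if __name__ == "__main__":
    # Illustrative values only: 4 nodes with 32 cores and 256 GB each.
    nodes = 4
    cores, memory_gb = reserved_per_node(nodes, 32, 256)
    print(f"per node: {cores} cores, {memory_gb} GB")
    print(f"group total: {cores * nodes} cores, {memory_gb * nodes} GB")
```

With four nodes of 32 cores and 256 GB each, every node sets aside 8 cores and 64 GB, which adds up to the capacity of one full node across the group.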
If the Reserve resources for availability option is enabled, the evacuation of virtual machine instances from one compute node to another, if required, will always complete successfully without impacting the virtual machine instances because the required resources within the cloud group have been set aside in advance.
- When an evacuation is not required, the updates can complete without moving virtual machines off their existing compute nodes.
- When an evacuation is required and the cloud group has enough physical capacity for the evacuation to occur, as determined by the cloud group type, the updates can complete without impacting the running virtual machine instances.
- When an evacuation is required and the cloud group does not have enough physical capacity for the evacuation to occur, the updates cannot complete without affecting the virtual machine instances.
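The three cases above can be summarized as a small decision function. This is a minimal sketch, not the system's actual logic: it reduces capacity to a single number (for example, physical CPUs), and the names `UpdateImpact` and `update_impact` are hypothetical.

```python
from enum import Enum


class UpdateImpact(Enum):
    NO_MOVEMENT = "instances stay on their existing compute nodes"
    EVACUATED = "instances are moved without being impacted"
    AFFECTED = "updates cannot complete without affecting instances"


def update_impact(evacuation_required, free_capacity, needed_capacity):
    """Classify the effect of an update on running instances.

    free_capacity: physical capacity available elsewhere in the cloud group.
    needed_capacity: physical capacity required by the instances to move.
    Both are simplified to one number for illustration.
    """
    if not evacuation_required:
        return UpdateImpact.NO_MOVEMENT
    if free_capacity >= needed_capacity:
        return UpdateImpact.EVACUATED
    return UpdateImpact.AFFECTED


if __name__ == "__main__":
    # Illustrative capacity values only.
    print(update_impact(False, 0, 10).value)
    print(update_impact(True, 16, 10).value)
    print(update_impact(True, 4, 10).value)
```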