Updating the system software

A system update involves updating your entire system environment.

Planning considerations

Consult with your Technical Advisor (if applicable) and the support web pages that are listed in Table 1 to determine whether an update is warranted.
Take a week to plan your update tasks before running the update on your system environment. The update procedure is divided into a pre-upgrade checklist, which is shown in Table 1, and the general update tasks, which are shown in Table 2.
Table 1. Pre-upgrade checklist
Sequence Pre-upgrade checklist
1

To obtain a new release of code for a system update, access the following site:

http://www.ibm.com/support

You can also get support on choosing the correct release from the following site:

https://www.ibm.com/support/pages/node/690527

2 Ensure that all errors in the event log are fixed and marked as fixed, and that the system date and time are correctly set.
3 Ensure that all the node canisters have service IP addresses assigned and that Ethernet cables are plugged into port 1 of each node.
4 Run the upgrade test utility before the scheduled upgrade date.
5 Ensure that all the host multipathing drivers are at the supported levels and correctly configured. Review the timeout settings at the host and application levels.
6 Ensure that your system meets the disk space and memory requirements that are needed to install the upgrade package. You can free disk space and memory by using the following methods:
  • Delete the old upgrade package through the management GUI.
  • Run the following command to clear the upgrade directory on each system:
    cleardumps -prefix /home/admin/upgrade
Table 2. Updating tasks
Sequence Update task
1 Before you update, become familiar with the prerequisites and tasks involved, and decide whether you want to update automatically or manually. During an automatic update procedure, the system updates each of the nodes systematically. The automatic method is the preferred procedure for updating software on nodes. However, you can also update each node manually.
2 Ensure that CIM object manager (CIMOM) clients are working correctly. When necessary, update these clients so that they can support the new version of system code.
3 Ensure that the multipathing drivers in the environment are fully redundant.
4 To upload the support package (snap type) automatically, see Uploading support packages automatically.
5 Update your system. The system update includes component firmware updates. The drive firmware update is a separate process.
6 Update other devices in the system environment. Examples might include updating hosts and switches to the correct levels.
Note: The time that is required varies with the amount of preparation work and the size of the environment. As a general guideline, allow approximately 1 hour per node for an update. A manual update takes longer.
Attention: If you experience failover issues with multipathing driver support, resolve these issues before you start normal operations.

Firmware and software for the system and its attached adapters are tested and released as a single package. The package number increases each time that a new release is made.

Some code levels support updates only from specific previous levels, or the code can be installed only on certain hardware types. If you update to more than one level higher than your current level, you might be required to install an intermediate level. For example, if you are updating from level 1 to level 3, you might need to install level 2 before you can install level 3. For more information about the prerequisites for each code level, see this website:

www.ibm.com/support
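
The intermediate-level rule can be pictured as a shortest-path search over supported update steps. The following Python sketch is illustrative only: the SUPPORTED_FROM map and the update_path function are hypothetical names, and the level numbers are invented to match the level 1 to level 3 example. Real prerequisites must always be taken from the support website.

```python
from collections import deque

# Hypothetical prerequisite map (not real product data):
# target level -> set of levels it can be installed from directly.
SUPPORTED_FROM = {
    2: {1},
    3: {2},
    4: {2, 3},
}

def update_path(current, target, supported_from=SUPPORTED_FROM):
    """Return the shortest ordered list of levels to install, or None if no path exists."""
    if current == target:
        return []
    queue = deque([(current, [])])
    seen = {current}
    while queue:
        level, path = queue.popleft()
        for next_level, sources in supported_from.items():
            if level in sources and next_level not in seen:
                new_path = path + [next_level]
                if next_level == target:
                    return new_path
                seen.add(next_level)
                queue.append((next_level, new_path))
    return None

# Level 3 cannot be installed directly from level 1 in this made-up map,
# so the path includes the intermediate level 2.
print(update_path(1, 3))
```

With this map, a request to go from level 1 to level 3 returns both levels in installation order, and a request with no supported route returns None.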

The update process

During the automatic update process, the nodes in a system are updated one at a time, and the new code is staged on the nodes. While each node restarts, the maximum I/O rate that the system can sustain might be degraded. After all the nodes in the system are successfully restarted with the new code level, the new level is automatically committed. During the commit, there can be a short impact on performance.

During an automatic code update, each node of a working pair is updated sequentially. The node that is being updated is temporarily not available and all I/O operations to that node fail. As a result, the I/O error counts increase and the failed I/O operations are directed to the partner node of the working pair. Applications do not see any I/O failures. When new nodes are added to the system, the update package is automatically downloaded to the new nodes from the system.

The following table shows the estimated time that is required to update a four-node system. The minimum and maximum times are cumulative from the start of the update. In this example, nodes A1 and A2 are in I/O group 1 and nodes B3 and B4 are in I/O group 2:
Table 3. Upgrade time for a four-node cluster
Step                                              Step time      Cumulative minimum   Cumulative maximum
Get ready                                         1 minute       1 minute             1 minute
Upgrade node A2                                   9-24 minutes   10 minutes           25 minutes
Upgrade node B4                                   9-24 minutes   19 minutes           49 minutes
Wait for 30 minutes (for multipathing recovery)   30 minutes     49 minutes           1 hour 19 minutes
Upgrade node A1                                   9-24 minutes   58 minutes           1 hour 43 minutes
Upgrade node B3                                   9-24 minutes   1 hour 7 minutes     2 hours 7 minutes
Total                                             -              1 hour 7 minutes     2 hours 7 minutes
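
The minimum and maximum values in Table 3 are running totals of the per-step times. The following Python sketch reproduces that arithmetic from the step durations in the table. The names STEPS, cumulative, and fmt are chosen for illustration only, and each "9-24 minutes" step is modeled as a minimum of 9 and a maximum of 24 minutes.

```python
# Step durations from Table 3 as (step, minimum minutes, maximum minutes).
STEPS = [
    ("Get ready", 1, 1),
    ("Upgrade node A2", 9, 24),
    ("Upgrade node B4", 9, 24),
    ("Wait for multipathing recovery", 30, 30),
    ("Upgrade node A1", 9, 24),
    ("Upgrade node B3", 9, 24),
]

def cumulative(steps):
    """Return (step, elapsed minimum, elapsed maximum) after each step."""
    rows = []
    low = high = 0
    for name, step_min, step_max in steps:
        low += step_min
        high += step_max
        rows.append((name, low, high))
    return rows

def fmt(minutes):
    """Format a minute count in the style used by the table, for example '1 hour 7 minutes'."""
    hours, mins = divmod(minutes, 60)
    parts = []
    if hours:
        parts.append(f"{hours} hour" + ("s" if hours != 1 else ""))
    if mins or not hours:
        parts.append(f"{mins} minute" + ("s" if mins != 1 else ""))
    return " ".join(parts)

for name, low, high in cumulative(STEPS):
    print(f"{name}: {fmt(low)} to {fmt(high)}")
```

The last row that is printed corresponds to the Total row of the table: 1 hour 7 minutes to 2 hours 7 minutes. To estimate a larger system under the same assumptions, add one upgrade step per additional node.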

The update can normally be done concurrently with normal user I/O operations. However, performance might be impacted. If any restrictions apply to the operations that can be done during the update, these restrictions are documented on the product website that you use to download the update packages. During the update procedure, most of the configuration commands are not available. Only the following commands are operational from the time the update process starts to the time that the new code level is committed, or until the process is backed out:

  • All information commands
  • The rmnodecanister command

The management GUI notifies you when the update process completes. If you are using the command-line interface, use the lsupdate command to display the status of the update.

Because of the operational limitations that occur during the update process, the code update is a user task. If there is a problem with the update, the update enters a stalled state. Contact customer support immediately to diagnose the issue. Depending on the issue, with support help the update can be either ended, backed off, or resumed. Do not try to troubleshoot update problems without technical assistance. For more information, see the topic about how to get information, help, and technical assistance.
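
If you monitor the update from a script, the status that the lsupdate command reports can be checked for a stalled state so that support is contacted promptly. The following Python sketch only parses text that was already captured from the command. It assumes that the output consists of one "field value" pair per line and that a stalled update is reported with a status value that contains the word "stalled". Both are assumptions, and the SAMPLE text is hypothetical, so verify the format against the real output at your code level.

```python
def parse_lsupdate(text):
    """Parse 'field value' lines into a dict; a field with no value maps to an empty string."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        key, _, value = line.partition(" ")
        fields[key] = value.strip()
    return fields

def needs_support(fields):
    """Return True when the status field suggests a stalled update (assumed naming)."""
    return "stalled" in fields.get("status", "")

# Hypothetical captured output, for illustration only.
SAMPLE = """\
status system_stalled
progress 50
"""

fields = parse_lsupdate(SAMPLE)
if needs_support(fields):
    print("Update appears stalled; contact customer support.")
```

The sketch deliberately takes no corrective action. It only flags the condition, because update problems are to be diagnosed with technical assistance.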

Multipathing driver

Before you update, ensure that the multipathing driver is fully redundant with every path available and online. During the update, you might see errors that are related to paths going away (failing over), and the error count might increase. When the paths to the updated nodes are restored, the nodes fail back and the system becomes fully redundant again. After the 30-minute delay, the paths to the other node in each I/O group go down as that node is updated.

Metro Mirror and Global Mirror relationships

When you update software on a system that has primary or secondary volumes of running Metro Mirror or Global Mirror relationships, write performance might be degraded on the primary volumes. Global Mirror relationships can also be stopped automatically with one or more errors with error code 1920. To avoid the write performance degradation, consider stopping such relationships, their consistency groups, or the partnership before you update the software, and restart the relationships after the update completes.