Use this information to learn the tasks required for installing a new cluster.
The Installation coordination worksheet includes a sample worksheet to help you coordinate tasks among installation teams and members.
This information provides a high-level outline of the general tasks required to install a new cluster. If you understand the full installation flow of a new cluster, you can identify the tasks that apply when you expand your InfiniBand cluster network. Expansion tasks such as adding InfiniBand hardware to an existing cluster, adding host channel adapters (HCAs) to an existing InfiniBand network, and adding a subnet to an existing network are also described. A cluster installation cannot be completed until all devices and units are available.
By breaking down the installation by major subsystem, you can see how to install the units in parallel, or how you might be able to perform some installation tasks for on-site units while waiting for other units to be delivered.
Recognize the key points in the installation where an installation task in one subsystem cannot proceed until installation tasks in another subsystem are complete. These points are called merge points and are shown by the inverted triangle symbol in Figure 1.
The task indexes are listed before each of the following major subsystem installation items:
U1: Set up the site for power and cooling, including proper floor cutouts for cable routing.
M1, S1, W1: Place units and frames in their correct positions on the data center floor. This includes, but is not limited to, HMCs, CSM Management Servers, fabric management servers, cluster servers (with HCAs, I/O devices, and storage devices), and InfiniBand switches. You can physically place units on the floor as they arrive. However, do not apply power or cable units to the service VLAN or to the InfiniBand network until instructed to do so.
Management console installation steps M2 - M4 have multiple tasks associated with each of them. Review the details in Installing and configuring the management subsystem to determine where you can assign different people to those tasks that can be performed simultaneously.
M3: Connect server hardware control points to the service VLAN as instructed by server installation documentation. The location of the connection is dependent on the server model and might involve a connection to the bulk power controllers (BPCs) or might be directly attached to the service processor. Do not attach switches to the cluster VLAN at this time.
M4: Complete the portion of the final management console installation and configuration that involves assigning or acquiring servers to their managing HMCs and authenticating frames and servers through Cluster-Ready Hardware Server (CRHS). This action is required only when you are using CSM and CRHS.
Set up remote logging and remote command processing and verify these operations.
When M4 is complete, the bulk power assemblies (BPAs) and cluster service processors must be at power standby state. To be at the power standby state, the power cables for each server must be connected to the appropriate power source. Prerequisites for M4 are M3, S2, and W3; the corequisite for M4 is S3.
The following server installation and configuration operations (S2 - S7) are performed in sequence after step M3 is complete.
| Task | Description |
| --- | --- |
| M3 | This is in the management subsystem installation flow (left column of Figure 1), but the tasks are associated with the servers. Attach the cluster server service processors and BPAs to the service VLAN. This task must be done before connecting power to the servers, and after the management consoles are configured, so that the cluster servers can be discovered correctly. |
| S2 | To bring the cluster servers to the power standby state, connect the servers in the cluster to their appropriate power sources. Prerequisites for S2 are M3 and S1. |
| S3 | Verify that the management consoles discovered the cluster servers. |
| S4 | Update the system firmware. |
| S5 | Verify the system operation. Use the server installation information to verify that the system is operational. |
| S6 | Customize logical partition and HCA configurations. |
| S7 | Load and update the operating system. |
Complete the following switch installation and configuration tasks W2 - W6.
| Task | Description |
| --- | --- |
| W2 | Power on and configure the IP address of the switch Ethernet connections. This must be done before attaching the switch to the service VLAN. |
| W3 | Connect switches to the cluster VLAN. If there is more than one VLAN, all switches must be attached to a single cluster VLAN, and all redundant switch Ethernet connections must be attached to the same network. Prerequisites for W3 are M3 and W2. |
| W4 | Verify discovery of the switches. |
| W5 | Update the switch software. |
| W6 | Customize InfiniBand network configuration. |
Complete C1 - C4 for cabling the InfiniBand network.
| Task | Description |
| --- | --- |
| C1 | Route the cables and attach the cable ends to the switch ports. Apply labels at this time. |
| C2 | If 12X HCAs are connecting to 4X switches and the links are being configured to run at 12X instead of 4X, the switch ports must be configured in groups of three 4X ports to act as a single 12X link. If you are configuring links at 12X, go to C3. Otherwise, go to C4. Prerequisites for C2 are W2 and C1. |
| C3 | Configure 12X groupings on switches. This must be done before attaching HCA ports. Ensure that the switches remain powered on before attaching HCA ports. The prerequisite is a Yes at decision point C2. |
| C4 | Attach the InfiniBand cable ends to the HCA ports. The prerequisite is a No at decision point C2, or, if the decision at C2 was Yes, completion of C3. |
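The C2 decision and the grouping of 4X ports into 12X links can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, and the assumption that each 12X link uses three consecutive ports in list order is not stated in this document. Consult the switch documentation for the actual grouping rules.

```python
def cabling_steps(links_at_12x):
    """Return the cabling tasks in order, following the C2 decision point."""
    steps = ["C1", "C2"]
    if links_at_12x:
        # Yes at C2: 12X groupings must be configured before attaching HCA ports.
        steps.append("C3")
    steps.append("C4")
    return steps


def group_ports_for_12x(ports):
    """Split 4X switch ports into groups of three, one group per 12X link.

    Assumption: ports are grouped consecutively in the order given.
    """
    if len(ports) % 3 != 0:
        raise ValueError("12X links need 4X ports in groups of three")
    return [tuple(ports[i:i + 3]) for i in range(0, len(ports), 3)]


if __name__ == "__main__":
    print(cabling_steps(links_at_12x=True))
    print(group_ports_for_12x([1, 2, 3, 4, 5, 6]))
```

A port list whose length is not a multiple of three raises an error rather than producing a partial group, because a partial group cannot act as a 12X link.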
Complete V1 - V3 to verify the cluster networking topology and operation.
| Task | Description |
| --- | --- |
| V1 | Check the topology by using QLogic Fast Fabric tools. Alternative methods for checking the topology might exist. Prerequisites for V1 are M4, S7, W6, and C4. |
| V2 | Check for serviceable events reported to the HMC. An all-to-all ping is also suggested to test the InfiniBand network before putting the cluster into operation. A vendor might have an alternative method for verifying network operation; even so, consult the HMC and resolve any open serviceable events. If a vendor has discovered and resolved a serviceable event, the serviceable event must be closed. The prerequisite for V2 is V1. |
| V3 | You might have to contact service numbers to resolve problems after service representatives leave the site. |
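To help plan which tasks different teams can work on in parallel, the prerequisites stated in this section can be written down as a dependency graph. The following Python sketch uses the standard library `graphlib` module (Python 3.9 or later). The prerequisites for S2, W3, M4, C2, C3, C4, V1, and V2 come from the text above. The ordering of W4 through W6 is an assumption based on task numbering, and the corequisite S3 for M4 is not modeled. The function names are hypothetical, and the computed merge points might not match every inverted triangle in Figure 1.

```python
from graphlib import TopologicalSorter


def build_prereqs(links_at_12x):
    """Map each task to the set of tasks that must be complete before it."""
    prereqs = {
        "S2": {"M3", "S1"},
        "W3": {"M3", "W2"},
        "M4": {"M3", "S2", "W3"},  # corequisite S3 is not modeled here
        "C2": {"W2", "C1"},
        "V1": {"M4", "S7", "W6", "C4"},
        "V2": {"V1"},
    }
    # S3 - S7 are performed in sequence after S2.
    for n in range(3, 8):
        prereqs[f"S{n}"] = {f"S{n - 1}"}
    # Assumption: W4 - W6 follow W3 in numbered order.
    for n in range(4, 7):
        prereqs[f"W{n}"] = {f"W{n - 1}"}
    # C4 follows C2 directly, or C3 when links are configured at 12X.
    if links_at_12x:
        prereqs["C3"] = {"C2"}
        prereqs["C4"] = {"C3"}
    else:
        prereqs["C4"] = {"C2"}
    return prereqs


def parallel_batches(prereqs):
    """Group tasks into batches; tasks within a batch can run in parallel."""
    sorter = TopologicalSorter(prereqs)
    sorter.prepare()
    batches = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())
        batches.append(ready)
        sorter.done(*ready)
    return batches


def merge_points(prereqs):
    """Tasks with at least one prerequisite in a different subsystem."""
    return sorted(
        task for task, deps in prereqs.items()
        if any(dep[0] != task[0] for dep in deps)
    )


if __name__ == "__main__":
    graph = build_prereqs(links_at_12x=False)
    for number, batch in enumerate(parallel_batches(graph), start=1):
        print(number, batch)
    print("Merge points:", merge_points(graph))
```

Each batch contains only tasks whose prerequisites are all in earlier batches, so a batch is a rough picture of what separate teams could work on at the same time.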