POWER6 information

Order of installation

Use this information to learn the tasks required for installing a new cluster.

The Installation coordination work sheet includes a sample work sheet to help you coordinate tasks among installation teams and members.

This information provides a high-level outline of the general tasks required to install a new cluster. If you understand the full installation flow of a new cluster, you can identify the tasks that can be performed when you expand your InfiniBand cluster network. Tasks such as adding InfiniBand hardware to an existing cluster, adding host channel adapters (HCAs) to an existing InfiniBand network, and adding a subnet to an existing network are described. To complete a cluster installation, all devices and units must be available before you begin installing the cluster.

The following are the fundamental tasks that are required for installing a cluster:
  1. The site is set up with power, cooling, floor space requirements, and floor load requirements.
  2. The switches and processing units are installed and configured.
  3. The management subsystem is installed and configured.
  4. The units are cabled and connected to the service virtual local area network (VLAN).
  5. The units can be verified and discovered on the service VLAN.
  6. The basic unit operation is verified.
  7. The cabling for the InfiniBand network is connected.
  8. The InfiniBand network topology and operation are verified.
Figure 1 shows a breakdown of the tasks by major subsystem. The following list illustrates the preferred order of installation by major subsystem. This order reduces the likelihood of having to perform recovery operations as you install, and also minimizes the number of device reboots during the installation.
  1. Management consoles and the service VLAN
    Note: Management consoles include the Hardware Management Console (HMC) and the servers running Cluster Systems Management (CSM), as well as a fabric management server.
  2. Servers in the cluster
  3. Switches
  4. Switch cable installation

By breaking down the installation by major subsystem, you can see how to install the units in parallel, or how you might be able to perform some installation tasks for on-site units while waiting for other units to be delivered.

It is important that you recognize the key points in the installation where you cannot proceed with one subsystem's installation task before completing the installation tasks in the other subsystem. These are called merge points, and are illustrated by using the inverted triangle symbol in Figure 1.

The following items are some of the key merge points.
  1. The management consoles must be installed and configured before starting to cable the service VLAN. This allows proper Dynamic Host Configuration Protocol (DHCP) management of the IP addressing on the service VLAN. Otherwise, the addressing might be compromised. This is not as critical for the fabric management server. However, the fabric management server must be operational before the switches are started on the network.
  2. You must power on the InfiniBand switches and configure their IP addresses before connecting them to the service VLAN. If this is not done, you must power them on individually and change their addresses by logging in to each of them by using their default address.
  3. If you have 12X host channel adapters (HCAs) connected to 4X switches, you must power on switches and cable them to their ports and configure the 12X groupings before attaching cables to HCAs in servers that have been powered on to standby mode or beyond. This action allows automatic negotiation to the 12X adapters by the HMCs to occur smoothly. When powering up the switches, it is not guaranteed that the ports will become operational in an order that makes the link appear as 12X to the HCA. Therefore, you must be sure that the switch is correctly cabled, configured, and ready to negotiate to the 12X adapters before starting the adapters.
  4. To fully verify the InfiniBand network, the servers must be fully installed in order to send data and run tools that are required to verify the network. The servers must be powered on to standby mode for topology verification.
    Note: With QLogic switches, you can use the Fast Fabric Toolset to verify topology. Alternatively, you can use the Chassis Viewer and Fabric Viewer.
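The merge points above amount to ordering constraints between installation steps. As a minimal sketch, a planned sequence can be checked against them before work starts. The step names below are hypothetical labels invented for this example; they are not identifiers from any IBM or QLogic tool.

```python
# Hypothetical step names; they are labels for this sketch, not IBM identifiers.
MERGE_POINTS = [
    # (step that must come first, step that must come later)
    ("configure_management_consoles", "cable_service_vlan"),
    ("fabric_management_server_operational", "start_switches_on_network"),
    ("set_switch_ip_addresses", "connect_switches_to_service_vlan"),
    ("configure_12x_groupings", "attach_cables_to_hcas"),
    ("servers_at_standby", "verify_topology"),
]


def violated_merge_points(plan):
    """Return the (first, later) pairs that the plan orders incorrectly.

    Pairs whose steps do not both appear in the plan are skipped.
    """
    position = {step: index for index, step in enumerate(plan)}
    return [
        (first, later)
        for first, later in MERGE_POINTS
        if first in position and later in position
        and position[first] > position[later]
    ]
```

An empty result means that the plan respects every merge point listed in the table. Any pair that is returned names a step that was scheduled too early.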
Figure 1. High-level cluster installation flow
Important: In each task box of Figure 1, there is also an index letter and number. These indexes indicate the major subsystem installation tasks, and you can use them to cross-reference to the following descriptions.

The task indexes are listed before each of the following major subsystem installation items:

U1: Set up the site for power and cooling, including proper floor cutouts for cable routing.

M1, S1, W1: Place units and frames in their correct positions on the data center floor. This includes, but is not limited to, HMCs, CSM Management Servers, fabric management servers, cluster servers (with HCAs, I/O devices, and storage devices), and InfiniBand switches. You can physically place units on the floor as they arrive. However, do not apply power or cable units to the service VLAN or to the InfiniBand network until instructed to do so.

Management console installation steps M2 - M4 have multiple tasks associated with each of them. Review the details in Installing and configuring the management subsystem to determine where you can assign different people to those tasks that can be performed simultaneously.

M2: Perform the initial management console installation and configuration. This includes HMCs, CSM, fabric management server, and DHCP service for the service VLAN.
Important:
  • If these devices and associated services are not set up correctly before applying power to the base servers and devices, you might not be able to correctly configure and control cluster devices. Furthermore, if this setup is done out of sequence, the recovery procedures for doing this part of the cluster installation can be lengthy.
  • When a cluster requires multiple HMCs, CSM is required to help manage device discovery. In this case, the setup of CSM and the peer domains on the Cluster-Ready Hardware Server are critical to achieving correct cluster device discovery. It is also important to have a central DHCP server, which must be on the same server as CSM.

M3: Connect server hardware control points to the service VLAN as instructed by server installation documentation. The location of the connection is dependent on the server model and might involve a connection to the bulk power controllers (BPCs) or might be directly attached to the service processor. Do not attach switches to the cluster VLAN at this time.

Also, attach the management consoles to the service and cluster VLANs.
Note: Switch IP addressing must be static. Each switch comes up with the same default address; therefore, you must set the switch address before it is added to the service VLAN. Otherwise, you must bring the switches one at a time onto the service VLAN and assign a new IP address before bringing the next switch onto the service VLAN.
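Because every switch starts with the same default address, it helps to decide the static addresses before any switch is cabled. The following sketch builds such a plan with the Python standard `ipaddress` module. The subnet `10.0.100.0/24`, the switch names, and the starting host offset are placeholders assumed for illustration; substitute the values for your own service VLAN.

```python
import ipaddress


def plan_switch_addresses(switch_names, subnet, first_host=10):
    """Map each switch name to a unique static address in the subnet.

    first_host is the offset of the first address to hand out; it is an
    arbitrary choice for this sketch, not a value from the documentation.
    """
    network = ipaddress.ip_network(subnet)
    hosts = list(network.hosts())[first_host - 1:]
    if len(switch_names) > len(hosts):
        raise ValueError("subnet is too small for the number of switches")
    return {name: str(addr) for name, addr in zip(switch_names, hosts)}


# Placeholder names and subnet, assumed for illustration only.
plan = plan_switch_addresses(["ibsw01", "ibsw02", "ibsw03"], "10.0.100.0/24")
for name, address in plan.items():
    print(name, address)
```

Keeping the plan in one place makes it easier to confirm that no two switches share an address before they are attached to the VLAN.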

M4: Do the portion of final management console installation and configuration that involves assigning or acquiring servers to their managing HMCs and authenticating frames and servers through Cluster-Ready Hardware Server (CRHS). This action is only required when you are using CSM and CRHS.

Note: The double arrow between M4 and S3 indicates that these two tasks cannot be completed independently. The management console configuration can be completed only as the server installation portion of the flow is completed.

Set up remote logging and remote command processing and verify these operations.

When M4 is complete, the bulk power assemblies (BPAs) and cluster service processors must be at power standby state. To be at the power standby state, the power cables for each server must be connected to the appropriate power source. Prerequisites for M4 are M3, S2, and W3; the corequisite for M4 is S3.

The following server installation and configuration operations (S2 - S7) can be performed sequentially after step M3 is performed.

M3

This task is in the management subsystem installation flow (left column of Figure 1), but it is associated with the servers. Attach the cluster server service processors and BPAs to the service VLAN. This task must be done before connecting power to the servers, and after the management consoles are configured, so that the cluster servers can be discovered correctly.

S2

To bring the cluster servers to the power standby state, connect the servers in the cluster to their appropriate power sources. Prerequisites for S2 are M3 and S1.

S3

Verify that the management consoles discovered the cluster servers.

S4

Update the system firmware.

S5

Verify the system operation. Use the server installation information to verify that the system is operational.

S6

Customize logical partition and HCA configurations.

S7

Load and update the operating system.

Complete the following switch installation and configuration tasks W2 - W6.

W2

Power on and configure the IP address of the switch Ethernet connections. This must be done before attaching the switch to the service VLAN.

W3

Connect switches to the cluster VLAN. If there is more than one VLAN, all switches must be attached to a single cluster VLAN, and all redundant switch Ethernet connections must be attached to the same network. Prerequisites for W3 are M3 and W2.

W4

Verify discovery of the switches.

W5

Update the switch software.

W6

Customize InfiniBand network configuration.

Complete C1 - C4 for cabling the InfiniBand network.

Note: It is possible to cable and start networks other than the InfiniBand networks before cabling and starting the InfiniBand network.
Important: When you attach InfiniBand cables between switches and HCAs, connect the cable to the switch end first.
C1

Route cables and attach cable ends to the switch ports. Apply labels at this time.

C2

If 12X HCAs are connecting to 4X switches and the links are being configured to run at 12X instead of 4X, the switch ports must be configured in groups of three 4X ports to act as a single 12X link. If you are configuring links at 12X, go to C3. Otherwise, go to C4.

Prerequisites for C2 are W2 and C1.
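Grouping three 4X ports into one 12X link is simple arithmetic (3 x 4X = 12X). The following sketch partitions a list of port numbers into consecutive groups of three. Treating adjacent port numbers as a valid group is an assumption made for illustration; which ports can actually be grouped depends on the switch model, so check the switch vendor documentation.

```python
PORTS_PER_12X_LINK = 3  # three 4X ports act as a single 12X link


def group_ports_for_12x(ports):
    """Split port numbers into consecutive groups of three.

    Raises ValueError if the ports cannot be divided into full groups.
    """
    if len(ports) % PORTS_PER_12X_LINK != 0:
        raise ValueError("port count must be a multiple of three")
    return [
        tuple(ports[i:i + PORTS_PER_12X_LINK])
        for i in range(0, len(ports), PORTS_PER_12X_LINK)
    ]


print(group_ports_for_12x([1, 2, 3, 4, 5, 6]))  # [(1, 2, 3), (4, 5, 6)]
```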

C3

Configure 12X groupings on switches. This must be done before attaching HCA ports. Ensure that the switches remain powered on before attaching HCA ports.

The prerequisite is a Yes to decision point C2.

C4

Attach the InfiniBand cable ends to the HCA ports.

The prerequisite is either a No decision in C2 or, if the decision in C2 was Yes, completion of C3.

Complete V1 - V3 to verify the cluster networking topology and operation.

V1

This task involves checking the topology by using QLogic Fast Fabric tools. There might be alternative methods for checking the topology. Prerequisites for V1 are M4, S7, W6, and C4.

V2

Check for serviceable events reported to the HMC. An all-to-all ping is also suggested to test the InfiniBand network before putting the cluster into operation. A vendor might have an alternative method for verifying network operation. Even so, consult the HMC and resolve any open serviceable events. If a vendor has discovered and resolved a serviceable event, then the serviceable event must be closed. The prerequisite for V2 is V1.
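An all-to-all ping means that every node pings every other node, so n nodes produce n * (n - 1) ordered source and destination pairs. The following sketch only enumerates those pairs. How each ping is actually run (which tool, which interface) is deliberately left out, and the host names are placeholders assumed for illustration.

```python
from itertools import permutations


def all_to_all_pairs(hosts):
    """Return every ordered (source, destination) pair of distinct hosts."""
    return list(permutations(hosts, 2))


# Placeholder host names, assumed for illustration only.
pairs = all_to_all_pairs(["node01", "node02", "node03"])
print(len(pairs))  # 6
```

Ordered pairs are used so that each link is exercised in both directions.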

V3

You might have to contact service numbers to resolve problems after service representatives leave the site.
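The prerequisites stated in the task descriptions above can be collected into a dependency graph and sorted to obtain one valid installation order. The following sketch uses `graphlib.TopologicalSorter` from the Python standard library (Python 3.9 or later). The explicit prerequisites come from the task descriptions. The sequential chains within each subsystem, including the C2, C3, C4 path that applies when links are configured at 12X, are assumptions made for this sketch.

```python
from graphlib import TopologicalSorter  # Python 3.9 or later

# Task index -> prerequisites stated in the task descriptions.
PREREQUISITES = {
    "S2": {"M3", "S1"},
    "W3": {"M3", "W2"},
    "M4": {"M3", "S2", "W3"},
    "C2": {"W2", "C1"},
    "V1": {"M4", "S7", "W6", "C4"},
    "V2": {"V1"},
}

# Assumption for this sketch: numbered tasks within a subsystem run in
# sequence. The C chain models the 12X path (C3 is skipped for 4X links).
SEQUENTIAL_CHAINS = [
    ["M1", "M2", "M3", "M4"],
    ["S1", "S2", "S3", "S4", "S5", "S6", "S7"],
    ["W1", "W2", "W3", "W4", "W5", "W6"],
    ["C1", "C2", "C3", "C4"],
    ["V1", "V2", "V3"],
]


def installation_order():
    """Return one task order that satisfies every prerequisite."""
    graph = {task: set(deps) for task, deps in PREREQUISITES.items()}
    for chain in SEQUENTIAL_CHAINS:
        for earlier, later in zip(chain, chain[1:]):
            graph.setdefault(later, set()).add(earlier)
    return list(TopologicalSorter(graph).static_order())


print(installation_order())
```

The result is one of several valid orders. Tasks that do not depend on each other, such as S4 and W4, can still be performed in parallel by different teams.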



Last updated: Tue, February 08, 2011