

Blades and external storage: Set up a fault-tolerant environment

How to set up an IBM BladeCenter HS21/LS21 server to boot from external DS3200 SAS storage


This article shows you how to boot from SAS storage (in this case, the IBM System Storage DS3200) using the IBM BladeCenter-H and x86-based BladeCenter HS21 and LS21 servers. (For BladeCenter-E, check the blade compatibility matrix listed below in the Related topics section to see if your environment is supported.) You will also see how to enable multipathing in Linux. Multipathing, or multipath I/O, is a fault-tolerance and performance enhancement technique where more than one physical path exists between the CPU and its mass storage devices. Simple examples are a SCSI disk connected to two SCSI controllers on the same computer or a disk connected to two Fibre Channel ports.

The major advantage of using only external storage is improved server availability. Even though you can configure several different RAID levels using the local disks within the blade, you must shut down the system to replace a defective drive. Using external storage and hot-swap drives, you can replace defective drives without interrupting services. IBM recently announced blade servers with support for hot-swap SAS drives, the BladeCenter HS12 and JS12, but both are uniprocessor machines.

Here are the six steps:

  1. Prepare the BladeCenter server and collect information on SAS modules and the DS3200.
  2. Prepare the storage subsystem.
  3. Configure the SAS BIOS.
  4. Install the operating system.
  5. Enable multipathing.
  6. Extend the file system with additional LUNs.

Step 1: Prepare BladeCenter server and collect information

Make sure that the SAS Switching Modules are installed correctly and connected to both DS3200 controllers. This setup assumes that controller A (on the left when looking from the back) is connected to the SAS switch in bay 3 and controller B is connected to the SAS switch in bay 4. It doesn't matter which ports the cables are plugged into; as long as you don't change the default zoning in the BladeCenter Management Module, each blade has access to all external ports.

By default, all external ports are disabled. You can enable them in two ways -- via Web or command-line interface.

Via Web interface. Log in to the Web interface of your Management Module and click Admin/Power/Restart in the I/O Module section on the left. Select the third I/O module, enable the external ports, and click the Save button on the right.

Figure 1. Enable external ports

Via command-line interface. Log in to your Management Module via telnet or SSH and issue the following commands

Listing 1. Enable external switch ports
system> ifconfig -ep enabled -T switch[3]
system> ifconfig -ep enabled -T switch[4]

to enable the external ports on switch bays 3 and 4, the two SAS switches.

To map a storage partition to the blade, you need the WWN (world wide name) of the SAS daughter card. It uniquely defines the interface in a SAS domain. Collect the WWN either through the Web or the command line.

Via Web interface. In the Management Module Web interface, click Hardware VPD in the left pane.

Figure 2. Collecting the SAS WWN

Via command-line interface. Log in to your Management Module via telnet or SSH. You can list the available blades with list -l 2. To include the installed daughter cards in your list, issue:

Listing 2. Get detailed inventory information
system> list -l 3

Collect the WWN with the info command on the daughter card of your blade slot (in this example slot 4).

Listing 3. Get SAS WWN
system> info -T blade[4]:exp[1]

Manufacturer: LSI (Not Available)
Manufacturer ID: 20301
Product ID: 118
Mach type/model: SAS Expansion Option
Mach serial number: Not Available
Manuf date: 02/08
Hardware rev: 3
Part no.: 39Y9187
FRU no.: 39Y9188
FRU serial no.: YK105481E006
CLEI: Not Available
SAS ID 1: 50:00:62:b0:00:0b:26:24
SAS ID 2: 50:00:62:b0:00:0b:26:25
SAS ID 3: Not Available
SAS ID 4: Not Available
SAS ID 5: Not Available
SAS ID 6: Not Available
SAS ID 7: Not Available
SAS ID 8: Not Available
MAC Address 1: Not Available
MAC Address 2: Not Available
MAC Address 3: Not Available
MAC Address 4: Not Available
MAC Address 5: Not Available
MAC Address 6: Not Available
MAC Address 7: Not Available
MAC Address 8: Not Available

SAS ID 1 and SAS ID 2 show the WWN of port 1 (mapped to switch bay 3) and port 2 (mapped to switch bay 4).

Step 2: Prepare storage subsystem

Define storage units and map them as LUNs to the blade. (A logical unit number is simply the number assigned to a logical unit; a logical unit is the SCSI protocol entity that is actually addressed by I/O operations.) You can use either the DS3000 Storage Manager or the DS4000 Storage Manager software. Initial configuration is done via the Configure tab; later changes are made via the Modify tab.

Figure 3. DS3000 Storage Manager

In the Storage Manager, Configure Host Access (Manual) lets you define the blade and associate the two WWNs of the SAS daughter card with the blade. Make sure to select LNXCLVMWARE as the host type. This disables AVT (Automatic Volume Transfer) and is required when using the RDAC driver under Linux.

Now allocate some storage for your blade with the Create Logical Drives menu. After that, map the logical drive to the blade with the aptly named Create Host-to-Logical Drive Mappings menu. Make sure to map it as LUN 0 (and don't introduce any "holes" in the numbering -- some environments stop scanning after finding an unused number).

Step 3: Configure SAS BIOS

To allow for a clean install to and boot from external SAS storage, you need to ensure the correct settings in the blade BIOS as well as in the SAS daughter card BIOS. Recent releases of x86 blades (later than mid 2008) work correctly right out of the box; earlier ones might need some changes.

Verify in the LSI configuration utility that the daughter card is enabled and included in the boot list, preferably as the first boot device. SAS1064 is the daughter card, SAS1064E the onboard controller for this blade.

Figure 4. SAS BIOS settings

Also, make sure that the daughter card properties have a Boot Support value of Enabled BIOS & OS and that the BIOS scans for all LUNs. You can verify this by choosing Advanced Adapter Properties then Advanced Device Properties. If you're unsure, restore the default settings.

In the blade BIOS, make sure you haven't disabled the daughter card slot.

Figure 5. Blade BIOS settings

Depending on the preferred path of your logical unit (controller A or B), you set either hd0 or hd1 as the blade's first boot device. If you set it from the Management Module (like this)

Listing 4. Set boot list for blade in slot X
system> bootseq hd0 hd1 -T blade[X]

and you get the error I9990301 Disk failure or disk reset failed, switch the boot sequence and make hd1 the first boot device. In our setup, controller A connects to I/O bay 3, which maps to hd0; controller B connects to I/O bay 4, which maps to hd1.

Step 4: Install OS

When installing the Red Hat Enterprise Linux 5.2 operating system, the installer sees two devices, /dev/sda and /dev/sdb. Even though both represent the same disk space, the installer isn't aware of the redundant paths and therefore shows the same LUN twice. This doesn't prevent your system from working correctly, since you install to /dev/sda only. You will see error messages for /dev/sdb, which you can safely dismiss by clicking Cancel.

Figure 6. OS installation error

After the installation finishes, you will get I/O errors for /dev/sdb. Check dmesg or look into /var/log/messages (a short command sketch follows Listing 5). The only way to get rid of those error messages is to install a multipath driver.

Listing 5. I/O errors due to missing multipath driver
Jul  1 19:47:26 localhost kernel: Buffer I/O error on device sdb, logical block 13107184
Jul  1 19:47:26 localhost kernel: end_request: I/O error, dev sdb, sector 104857472
Jul  1 19:47:26 localhost kernel: Buffer I/O error on device sdb, logical block 13107184
Jul  1 19:47:27 localhost kernel: end_request: I/O error, dev sdb, sector 0
Jul  1 19:47:27 localhost kernel: Buffer I/O error on device sdb, logical block 0
Jul  1 19:47:27 localhost kernel: Buffer I/O error on device sdb, logical block 1
Jul  1 19:47:27 localhost kernel: Buffer I/O error on device sdb, logical block 2
Jul  1 19:47:27 localhost kernel: Buffer I/O error on device sdb, logical block 3
Jul  1 19:47:27 localhost kernel: end_request: I/O error, dev sdb, sector 0
Jul  1 19:47:28 localhost kernel: end_request: I/O error, dev sdb, sector 2
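
If you want to check for these messages yourself, a couple of standard commands are enough. This is a minimal sketch using only stock RHEL tools:

# Show kernel messages concerning the second (redundant) path device
dmesg | grep sdb

# Pull the I/O errors out of the system log
grep -E 'Buffer I/O error|end_request: I/O error' /var/log/messages | tail -n 20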

Step 5: Enable multipathing

There are two ways to enable multipathing under Linux -- using the official RDAC driver from IBM/LSI or the open source device-mapper multipath tools. This example uses the RDAC driver, which is available on the LSI Web site.

Make sure the following additional packages are installed (a sample yum command follows the list):

  • gcc,
  • glibc-devel,
  • kernel-headers,
  • glibc-headers,
  • libgomp, and
  • kernel-devel or kernel-xen-devel if using the Xenified kernel.
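
On a registered RHEL 5.2 system you can typically pull these packages in with yum; the following is a sketch (use kernel-xen-devel instead of kernel-devel if you run the Xen kernel):

# Install the build prerequisites for the RDAC driver
yum install gcc glibc-devel glibc-headers kernel-headers libgomp kernel-devel

# Verify that everything is in place
rpm -q gcc glibc-devel glibc-headers kernel-headers libgomp kernel-devel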

Download the RDAC source from the LSI Web site and compile in /usr/src with the usual make && make install.
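
The exact archive and directory names depend on the driver version you download; the names below are only an example of what the sequence typically looks like:

# Unpack the RDAC/MPP source into /usr/src (archive name is an example)
cd /usr/src
tar -xzf /tmp/rdac-LINUX-09.01.C5.19-source.tar.gz
cd linuxrdac-09.01.C5.19

# Build and install the driver; the install step also creates the MPP initrd image
make clean
make
make install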

At the end of the installation, you will see messages like those in Listing 6.

Listing 6. LSI RDAC driver installation
Checking Host Adapter Configuration...
Detected 2 LSI Host Adapter Port(s) on the system
Please wait while we modify the system configuration files.
Your kernel version is 2.6.18-92.el5xen
Preparing to install MPP driver against this kernel version...
Generating module dependencies...
Creating new MPP initrd image...
        You must now edit your boot loader configuration file, /boot/grub/menu.lst, to 
        add a new boot menu, which uses mpp-2.6.18-92.el5xen.img as the initrd image.
        Now Reboot the system for MPP to take effect.
        The new boot menu entry should look something like this (note that it may 
        vary with different system configuration):

        ...
        
                title Red Hat Linux (2.6.18-92.el5xen) with MPP support
                root (hd0,5)
                kernel /vmlinuz-2.6.18-92.el5xen ro root=LABEL=RH9
                initrd /mpp-2.6.18-92.el5xen.img
        ...
MPP driver package has been successfully installed on your system.

As the message says, edit your /boot/grub/grub.conf (on RHEL, the /boot/grub/menu.lst mentioned in the message is a symlink to this file) to use the new ramdisk. For a RHEL 5.2 system running the Xen-enabled kernel, change it to the following:

Listing 7. grub.conf using the multipath ramdisk
default=1
timeout=5
splashimage=(hd0,0)/grub/splash.xpm.gz
hiddenmenu
title Red Hat Enterprise Linux Server (2.6.18-92.el5xen)
        root (hd0,0)
        kernel /xen.gz-2.6.18-92.el5
        module /vmlinuz-2.6.18-92.el5xen ro root=/dev/VolGroup00/LogVol00
        module /initrd-2.6.18-92.el5xen.img

title Red Hat Enterprise Linux Server with MPP support (2.6.18-92.el5xen)
        root (hd0,0)
        kernel /xen.gz-2.6.18-92.el5
        module /vmlinuz-2.6.18-92.el5xen ro root=/dev/VolGroup00/LogVol00
        module /mpp-2.6.18-92.el5xen.img

mppUtil -a (Listing 8) shows all discovered arrays, in this case only one, named Infra-sas2.

Listing 8. Basic multipath information
[root@localhost ~]# mppUtil -a
Hostname    = localhost
Domainname  = (none)
Time        = GMT 07/01/2008 19:56:01 

---------------------------------------------------------------
Info of Array Module's seen by this Host. 
---------------------------------------------------------------
ID              WWN                               Name         
---------------------------------------------------------------
 0      600a0b80002f746e0000000047d02718        Infra-sas2     
---------------------------------------------------------------

If you want basic LUN mapping information, issue /opt/mpp/lsvdev:

Listing 9. Basic LUN mapping information
[root@localhost ~]# /opt/mpp/lsvdev 
        Array Name      Lun    sd device
        -------------------------------------
        Infra-sas2      0     -> /dev/sda

mppUtil -a [array name] shows details about this array:

Listing 10. Detailed multipath information
[root@localhost ~]# mppUtil -a Infra-sas2
Hostname    = localhost
Domainname  = (none)
Time        = GMT 07/01/2008 19:56:27 

MPP Information:
----------------
      ModuleName: Infra-sas2                               SingleController: N
 VirtualTargetID: 0x000                                       ScanTriggered: N
     ObjectCount: 0x000                                          AVTEnabled: N
             WWN: 600a0b80002f746e0000000047d02718               RestoreCfg: N
    ModuleHandle: none                                        Page2CSubPage: Y
 FirmwareVersion: 6.30.1.xx                                   
   ScanTaskState: 0x00000000


Controller 'A' Status:
-----------------------
ControllerHandle: none                                    ControllerPresent: Y
    UTMLunExists: N                                                  Failed: N
   NumberOfPaths: 1                                          FailoverInProg: N
                                                                ServiceMode: N

    Path #1
    ---------
 DirectoryVertex: present                                           Present: Y
       PathState: OPTIMAL              
 hostId: 1, targetId: 0, channelId: 0
     

Controller 'B' Status:
-----------------------
ControllerHandle: none                                    ControllerPresent: Y
    UTMLunExists: N                                                  Failed: N
   NumberOfPaths: 1                                          FailoverInProg: N
                                                                ServiceMode: N

    Path #1
    ---------
 DirectoryVertex: present                                           Present: Y
       PathState: OPTIMAL              
 hostId: 1, targetId: 1, channelId: 0
     

Lun Information
---------------
    Lun #0 - WWN: 600a0b8000369d3f00000775481870af
    ----------------
       LunObject: present                                 CurrentOwningPath: A
  RemoveEligible: N                                          BootOwningPath: A
   NotConfigured: N                                           PreferredPath: A
        DevState: OPTIMAL                                   ReportedPresent: Y
                                                            ReportedMissing: N
                                                      NeedsReservationCheck: N
                                                                  TASBitSet: Y
                                                                   NotReady: N
                                                                       Busy: N
                                                                  Quiescent: N

    Controller 'A' Path
    --------------------
   NumLunObjects: 1                                         RoundRobinIndex: 1
         Path #1: LunPathDevice: present           
                        IoCount: 0
                       DevState: OPTIMAL              
                    RemoveState: 0x0  StartState: 0x1  PowerState: 0x0

    Controller 'B' Path
    --------------------
   NumLunObjects: 1                                         RoundRobinIndex: 0
         Path #1: LunPathDevice: present           
                        IoCount: 0
                       DevState: OPTIMAL              
                    RemoveState: 0x0  StartState: 0x1  PowerState: 0x0

The last two sections show that controller A is currently handling the traffic (RoundRobinIndex: 1). Both paths are present and in an optimal state (LunPathDevice present, DevState OPTIMAL).

Another way to display multipath information about all available arrays and LUNs is to issue ls -lR /proc/mpp -- it lists the available controllers and all available LUNs for each array. You can obtain details by displaying the contents of the respective LUN proc entries. The intermediate directory mptsas_h1c0t0 reflects the Linux SCSI address (host, channel, target):

Listing 11. Multipath information from /proc
[root@localhost ~]# cat /proc/mpp/Infra-sas2/controllerA/mptsas_h1c0t0/LUN0 
Linux MPP driver. Version:09.01.C5.19 Build:Tue Apr  1 13:30:42 CDT 2008
Lun WWN:600a0b8000369d3f00000775481870af
Physical HBA driver: mptsas
Device Scsi Address: host_no:1 channel:0 target:0 Lun:0
Queue Depth = 64
I/O Statistics:
        Number of IOs:8846
        Longest trip of all I/Os:1
        Shortest trip of all I/Os:0
        Number of occurrences of IO failed events:0

MPP is configured through /etc/mpp.conf. If you make any changes, you have to run mppUpdate to rebuild the ramdisk with the new configuration file; a reboot is required to activate the changes.
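
For example, applying a configuration change typically looks like this:

# Edit the multipath driver settings
vi /etc/mpp.conf

# Rebuild the MPP ramdisk so the new settings are included at boot
mppUpdate

# Activate the changes
reboot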

Step 6: Extend the file system with additional LUNs

If you need to assign more disk space to the operating system, you can easily do this by assigning more storage units to the blade, adding them to LVM, and extending the file system over the new disk space.

But before we go on to defining new units and using them, a word on the Logical Volume Manager (LVM). LVM provides an abstraction layer on top of the underlying storage. It groups one or more physical volumes (disks) into a volume group. This volume group serves as a logical container for one or more logical volumes, which show up as block devices in Linux (as you can see in Figure 7).

Figure 7. Logical Volume Manager
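
If you want to see this stack on your own system, the standard LVM2 reporting commands give a quick overview:

# Physical volumes (disks or partitions handed over to LVM)
pvs

# Volume groups built from those physical volumes
vgs

# Logical volumes carved out of the volume groups
lvs

# The logical volumes appear as device-mapper block devices
ls -l /dev/mapper/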

Now let's get on with it.

Define new units in storage manager

Create new storage units with the Storage Manager tool (see step 2).

Use them in Linux with LVM

Using the new storage unit(s) in Linux is a two-step process: you have to add them to LVM before you can extend the actual Linux file system.

Start a rescan of the SAS bus with the hot_add utility (it is a link to mppBusRescan) to discover newly mapped storage units:

Listing 12. Discover new LUNs
[root@localhost ~]# hot_add 
Starting new devices re-scan...
scan mptsas HBA host /sys/class/scsi_host/host1...
        found 1:0:0:1 
        found 1:0:1:1 
scan mptsas HBA host /sys/class/scsi_host/host0...
        no new device found
run /usr/sbin/mppUtil -s busscan...
scan mpp virtual host /sys/class/scsi_host/host3...
        found 3:0:0:1->/dev/sdb 
/usr/sbin/hot_add is completed.

Linux has found one new unit and mapped it to /dev/sdb. Note that the new unit is automatically managed by the multipath driver.
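
You can double-check that the new unit is under MPP control with the same tools used in step 5, for example:

# The new LUN should now show up next to LUN 0
/opt/mpp/lsvdev

# Both controller paths should list the new mapping under /proc/mpp
ls -lR /proc/mpp/Infra-sas2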

To prepare the unit for LVM, create a primary LVM partition (partition type 8e) spanning the whole disk:

Listing 13. Prepare the new logical unit
[root@localhost ~]# fdisk /dev/sdb
Device contains neither a valid DOS partition table, nor Sun, SGI or OSF disklabel
Building a new DOS disklabel. Changes will remain in memory only,
until you decide to write them. After that, of course, the previous
content won't be recoverable.


The number of cylinders for this disk is set to 51200.
There is nothing wrong with that, but this is larger than 1024,
and could in certain setups cause problems with:
1) software that runs at boot time (e.g., old versions of LILO)
2) booting and partitioning software from other OSs
   (e.g., DOS FDISK, OS/2 FDISK)
Warning: invalid flag 0x0000 of partition table 4 will be corrected by w(rite)

Command (m for help): n
Command action
   e   extended
   p   primary partition (1-4)
p
Partition number (1-4): 1
First cylinder (1-51200, default 1): 
Using default value 1
Last cylinder or +size or +sizeM or +sizeK (1-51200, default 51200): 
Using default value 51200

Command (m for help): t
Selected partition 1
Hex code (type L to list codes): 8e
Changed system type of partition 1 to 8e (Linux LVM)

Command (m for help): w
The partition table has been altered!

Calling ioctl() to re-read partition table.
Syncing disks.

Now integrate the new unit into the existing volume group. Listing 14 shows how you can verify with vgdisplay -s before and after the expansion:

Listing 14. Extend LVM volume group
[root@localhost ~]# vgdisplay -s
  "VolGroupBoot" 49.88 GB  [49.88 GB  used / 0    free]

[root@localhost ~]# pvcreate /dev/sdb1
  Physical volume "/dev/sdb1" successfully created   

[root@localhost ~]# vgextend VolGroupBoot /dev/sdb1
  Volume group "VolGroupBoot" successfully extended

[root@localhost ~]# vgdisplay -s
  "VolGroupBoot" 49.88 GB  [49.88 GB  used / 49.97    free]

Next, increase the logical volume within this volume group. You can verify with lvdisplay by comparing the LV Size row:

Listing 15. Extend LVM logical volume
[root@localhost ~]# lvdisplay /dev/VolGroupBoot/LogVolSlash
  --- Logical volume ---
  LV Name                /dev/VolGroupBoot/LogVolSlash
  VG Name                VolGroupBoot
  LV UUID                dIanEg-J6Mf-60Ec-eUKb-rgoJ-dOM0-QmjQ3Q
  LV Write Access        read/write
  LV Status              available
  # open                 1
  LV Size                47.94 GB
  Current LE             1534
  Segments               1
  Allocation             inherit
  Read ahead sectors     auto
  - currently set to     256
  Block device           253:0

[root@localhost ~]# lvextend -l +100%FREE /dev/VolGroupBoot/LogVolSlash /dev/sdb1
  Extending logical volume LogVolSlash to 97.91 GB
  Logical volume LogVolSlash successfully resized

[root@localhost ~]# lvdisplay /dev/VolGroupBoot/LogVolSlash
  --- Logical volume ---
  LV Name                /dev/VolGroupBoot/LogVolSlash
  VG Name                VolGroupBoot
  LV UUID                dIanEg-J6Mf-60Ec-eUKb-rgoJ-dOM0-QmjQ3Q
  LV Write Access        read/write
  LV Status              available
  # open                 1
  LV Size                97.91 GB
  Current LE             3133
  Segments               2
  Allocation             inherit
  Read ahead sectors     auto
  - currently set to     256
  Block device           253:0

Now you are ready to resize your file system (this takes about two minutes for 50GB on the SAS drives). If you are resizing your root file system, note that only ext3 is supported for online resizing (which means you don't have to schedule a maintenance window for the system).
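
If you are not sure which file system the root volume uses, a quick check before resizing avoids surprises:

# Show the file system type of the root volume
df -T /

# Alternatively, check the mount table
mount | grep ' on / type '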

Listing 16. Resize Linux file system
[root@localhost ~]# df -h
Filesystem            Size  Used Avail Use% Mounted on
/dev/mapper/VolGroupBoot-LogVolSlash      47G  2.3G   42G   6% /

[root@localhost ~]# resize2fs -p /dev/VolGroupBoot/LogVolSlash
resize2fs 1.39 (29-May-2006)
Filesystem at /dev/VolGroupBoot/LogVolSlash is mounted on /; on-line resizing required
Performing an on-line resize of /dev/VolGroupBoot/LogVolSlash to 25665536 (4k) blocks.
The filesystem on /dev/VolGroupBoot/LogVolSlash is now 25665536 blocks long.

[root@localhost ~]# df -h
Filesystem            Size  Used Avail Use% Mounted on
/dev/mapper/VolGroupBoot-LogVolSlash
                       95G  2.3G   88G   3% /

And that's it!

Conclusion

I've shown you how to set up a BladeCenter server to boot from the SAS-attached IBM BladeCenter Boot Disk System (DS3200), and you've also learned

  • How to enable multipathing to ensure efficient fault tolerance.
  • How to extend the file system to keep up with growing storage demands using the hot_add utility and LVM.

Remember, using only external storage offers you improved availability of your blade server; in this article we focused on the benefit you get from not having to interrupt your server to replace or upgrade drives.




Related topics

