Heartbeat
Added by mperzl, last edited by mperzl on Dec 15, 2005

In most environments, customers and users are very concerned about system uptime, especially regarding running their daily applications. In this case study, we discuss the high availability solution that is bundled along with SUSE Linux Enterprise Server (SLES) 8 and the IBM Data Management Solution, DB2®. While this setup is focused on how to implement database failover, it is applicable to, and similar for, other services.

Hardware and software components

In this case study, we have two logical partitions (LPARs), each with two processors (POWER4+ at 1.45 GHz), 4 GB of memory, and one 36.4 GB internal disk drive. In addition, the following adapters and peripherals are assigned to the system:

  • 2 network interfaces (FC-4962 and FC-5700)
  • RJ45 UTP cross-cable
  • SCSI controller (FC-6203)
  • JBOD SCSI disk drawer (2104-DU3)

The system is installed with SLES 8 SP1, running the 64-bit kernel 2.4.21-83. The software stack that we used in the setup is:

  • Heartbeat-1.0.4-0 clustering stack
  • IBM DB2 8.1

Heartbeat is an open source clustering solution written by Alan Robertson that provides basic clustering functions. Heartbeat monitors the cluster nodes over network or serial links. It is also bundled with scripts to create cluster IP addresses and to manage Linux Virtual Server (LVS) and other applications.

In this case study, we set up a cluster with a DB2 database in the external disk and cluster IP address for remote connection. Figure 8-3 shows the cluster that we will be setting up.

[Figure 8-3: Heartbeat setup (image missing)]

Preparing the nodes for the cluster

As with most high availability clusters, we have here external storage for disk takeover, network adapters, a dedicated LAN for heartbeats, and so on. In our setup, we call the servers that we are attaching to the disk lpar1 and lpar3; lpar1 is the primary server, and lpar3 is the failover server.

Network Setup

First we assign IP addresses to the systems. We assign eth0 to be on the public LAN, and eth1 becomes the heartbeat LAN. In this setup, we are limited by the available adapters.

We recommend that you have a serial connection between the two servers as well. This ensures that you also have a non-IP-based heartbeat.

  • lpar1
    • eth0 - 192.168.100.77
    • eth1 - 10.10.10.77
  • lpar3
    • eth0 - 192.168.100.79
    • eth1 - 10.10.10.79

Disk Setup

The data disk is the most critical component in almost any clustering solution. Because both nodes attach to the same disk, the application reads the latest data from storage regardless of which node it runs on.

In our cluster, we use the IBM SCSI-based external disk storage solution, the 2104-DU3, with IBM Ultra3 SCSI adapters. The 2104-DU3 can be configured as either a single-bus or a twin-bus enclosure. A cluster setup requires the single-bus configuration. Figure 8-4 shows the single-bus, dual-host configuration.

[Figure 8-4: single-bus, dual-host SCSI configuration (image missing)]

For information on how to change the storage to a single bus, please refer to the 2104-DU3 Installation Guide, GA33-3311.

After setting up the storage for single bus, we need to change the SCSI ID of the adapters in the servers. Each SCSI adapter defaults to SCSI ID 7, which creates a conflict if both servers are booted at the same time. Therefore, we change the lpar1 SCSI ID to 5 and the lpar3 SCSI ID to 6. The SCSI driver for the IBM FC-6203 SCSI adapter is "sym53c8xx". This driver is compiled natively into the SLES 8 Linux kernel.

We try out the new SCSI ID by entering the following string at the yaboot prompt:

yaboot : linux root=/dev/sda2 sym53c8xx=hostid:5

After the system has booted, we change the SCSI ID of the server permanently by embedding the above parameters in the kernel image, so that they are always passed to the kernel at boot time.

Obtain the ''mkzimage_command'' from the /ppc/netboot/ directory on your SLES 8 CD:

# cp /boot/vmlinuz .
# mkzimage_command -c ./vmlinuz
# mkzimage_command -a 1 -s "root=/dev/sda2 sym53c8xx=hostid:5" ./vmlinuz
# cp vmlinuz /boot/vmlinuz.051103

Next, update the /etc/lilo.conf file with the new image, as shown in Example 8-15.

Example 8-15 Updating LILO configuration file with new kernel

# Generated by YaST2

default=test
timeout=100
boot=/dev/sda1
activate

image = /boot/vmlinuz
        label = linux
        root = /dev/sda2
        append = ""
image = /boot/vmlinuz.051103
        label = test
        root = /dev/sda2
        append = ""

We then run the ''lilo'' command to activate the new configuration and reboot the system. We do the same for lpar3, using SCSI ID 6.

With both lpar1 and lpar3 able to see the storage, we now create the disk partition for storing our data. To create the new partition, we use the ''fdisk'' command.

In Example 8-16, we create a 10 GB disk partition on the external storage. Once the partition is created, it is visible to lpar3 as well. We run the command ''fdisk -l'' to check.

Example 8-16 Creating a new 10 GB partition on the newly added disk

leecy@lpar1:~ # fdisk /dev/sdc
The number of cylinders for this disk is set to 34715.
There is nothing wrong with that, but this is larger than 1024,
and could in certain setups cause problems with:
1) software that runs at boot time (e.g., old versions of LILO)
2) booting and partitioning software from other OSs
   (e.g., DOS FDISK, OS/2 FDISK)

Command (m for help): p

Disk /dev/sdc: 64 heads, 32 sectors, 34715 cylinders
Units = cylinders of 2048 * 512 bytes

   Device Boot    Start       End    Blocks   Id  System

Command (m for help): n
Command action
   e extended
   p primary partition (1-4)
p
Partition number (1-4): 1
First cylinder (1-34715, default 1):
Using default value 1
Last cylinder or +size or +sizeM or +sizeK (1-34715, default 34715): +10GB

Command (m for help): w
The partition table has been altered!

Calling ioctl() to re-read partition table.
Syncing disks.
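
The geometry that fdisk reports in Example 8-16 can be used to sanity-check sizes: one cylinder is 2048 sectors of 512 bytes, which is exactly 1 MiB. The following is a minimal sketch of that arithmetic; the function names are hypothetical helpers introduced here, and the sketch assumes a shell with 64-bit integer arithmetic.

```shell
#!/bin/sh
# Geometry as printed by fdisk in Example 8-16.
CYLINDERS=34715
SECTORS_PER_CYLINDER=2048   # 64 heads * 32 sectors
BYTES_PER_SECTOR=512

# bytes_per_cylinder: size of one fdisk unit in bytes (hypothetical helper).
bytes_per_cylinder() {
    echo $((SECTORS_PER_CYLINDER * BYTES_PER_SECTOR))
}

# disk_bytes: total capacity implied by the reported geometry.
disk_bytes() {
    echo $((CYLINDERS * $(bytes_per_cylinder)))
}

# cylinders_for_mib N: cylinders needed to hold N MiB, rounded up.
cylinders_for_mib() {
    unit=$(bytes_per_cylinder)
    echo $(( ($1 * 1024 * 1024 + unit - 1) / unit ))
}

echo "bytes per cylinder: $(bytes_per_cylinder)"
echo "disk size in bytes: $(disk_bytes)"
echo "cylinders for 10240 MiB: $(cylinders_for_mib 10240)"
```

With this geometry the disk holds 36,401,315,840 bytes, and a partition of 10240 MiB occupies 10240 cylinders, which leaves most of the disk free for later use.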

Now that we have the disk up and running, we create the file system and directories for our application:

# mkdir -p /data/IBM/db2inst1
# mkfs.reiserfs /dev/sdc1
# mount /dev/sdc1 /data/IBM/db2inst1

Note: We are using IBM DB2 as our application.

Once we mount the file system, we can proceed to install our application.
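
Before installing anything onto the shared disk, it can help to confirm that the file system really is mounted where we expect it. The sketch below is one way to do that by reading the kernel mount table; the function name is_mounted and its optional third parameter are assumptions made for illustration, and the sample table stands in for /proc/mounts.

```shell
#!/bin/sh
# is_mounted DEVICE MOUNTPOINT [TABLE]: succeed if DEVICE is mounted on MOUNTPOINT.
# TABLE defaults to /proc/mounts; the parameter exists so that the check can be
# tried against a sample file (hypothetical helper, not part of heartbeat).
is_mounted() {
    awk -v dev="$1" -v mnt="$2" '
        $1 == dev && $2 == mnt { found = 1 }
        END { exit !found }
    ' "${3:-/proc/mounts}"
}

# Sample mount table containing the file system created in this section.
table=$(mktemp)
cat > "$table" <<'EOF'
/dev/sda2 / reiserfs rw 0 0
/dev/sdc1 /data/IBM/db2inst1 reiserfs rw 0 0
EOF

if is_mounted /dev/sdc1 /data/IBM/db2inst1 "$table"; then
    echo "shared file system is mounted"
else
    echo "shared file system is NOT mounted" >&2
fi
```

On a live system, the third argument would be omitted so that the function reads /proc/mounts.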

Application Installation

Prior to installing IBM DB2, we run through the hardware and software requirements:

  • 650 MB of disk space for full installation
  • IBM JDK 1.3.1 for DB2 Control Center
  • Flex-2.5.4a-39
  • Web browser for online help

At the following site, you can check the tested kernel against the DB2 release that you are going to install:
http://www.ibm.com/db2/linux/validate

Once we have all the correct software and hardware requirements, we proceed to the installation of DB2.

DB2 Installation

After mounting the CD in the CD-ROM drive, we run the command ''db2setup'' in the root of the CD-ROM directory. A screen appears, as shown in Figure 8-5.

[Figure 8-5 (image missing)]

We select '''Install Products'''. This presents the products that can be installed on the server. Figure 8-6 shows the installation choices available.

[Figure 8-6 (image missing)]

We select '''DB2 UDB Enterprise Server Edition''' and click Next. In the next screen, we are asked to accept the IBM DB2 user license agreement, then we click Next.

Then we are prompted to select the type of DB2 UDB ESE installation we want (Typical, Compact, Custom), as shown in Figure 8-7. We select Typical and proceed.

[Figure 8-7 (image missing)]

Next, when prompted, we create the necessary DB2 IDs. At this stage, we are still using the internal disk to store DB2. We will move the database to the external storage once we have both nodes installed.

Once we have made our selections, the installation starts. After it finishes, we check the post-installation report to make sure that all components are installed. Figure 8-8 shows the sample post-installation report.

[Figure 8-8 (image missing)]

Now that DB2 is installed, we need to create a database to use. We plan to use the sample database bundled with DB2. To load it, we run the ''db2sampl'' command as the db2inst1 user.

Next, we disable DB2 from starting automatically when the system is booted up. We comment out the DB2 entry inside the /etc/inittab file. We want the clustering solution to automatically bring up DB2 for us, instead.

Now that DB2 is properly set up, we test the sample database that we loaded. We connect to the database locally and do a simple query. Example 8-17 shows a successful connection to the sample database.

Example 8-17 Testing the connection to the DB2 sample database

lpar1:/home # su - db2inst1
db2inst1@lpar1:~> db2 connect to sample
Database Connection Information
Database server = DB2/LINUXPPC 8.1.2
SQL authorization ID = DB2INST1
Local database alias = SAMPLE

Next, we do a simple query to make sure that we can query the database. Example 8-18 shows a simple query of the local database.

Example 8-18 Querying the DB2 sample database

db2inst1@lpar1:~> db2 "select * from ORG"
DEPTNUMB DEPTNAME       MANAGER DIVISION   LOCATION
-------- -------------- ------- ---------- -------------
      10 Head Office        160 Corporate  New York
      15 New England         50 Eastern    Boston
      20 Mid Atlantic        10 Eastern    Washington
      38 South Atlantic      30 Eastern    Atlanta
      42 Great Lakes        100 Midwest    Chicago
      51 Plains             140 Midwest    Dallas
      66 Pacific            270 Western    San Francisco
      84 Mountain           290 Western    Denver
8 record(s) selected.

We now do a similar installation in lpar3. (Select the same directory as in lpar1 to store the user directories, because this will ensure that DB2 sets up the profile and directories properly.)

After installing DB2 on both nodes, we move the database that we created on lpar1 to the external disk. We mount the disk from external storage, copy the entire DB2 instance directory to the external disk, and replace the original directory with a soft link, so that the DB2 instance (db2inst1) uses the external disk when it is loaded:

# mount /dev/sdc1 /IBM/opt/db2inst1
# cp -a /home/db2inst1 /IBM/opt/db2inst1
# mv /home/db2inst1 /home/db2inst1.orig
# ln -s /IBM/opt/db2inst1/db2inst1 /home/db2inst1
# chown db2inst1.db2grp1 /home/db2inst1
# umount /IBM/opt/db2inst1

With lpar1 working, we now create a similar soft link in lpar3:

# mv /home/db2inst1 /home/db2inst1.orig
# mkdir /IBM/data/db2inst1
# mount /dev/sdc1 /IBM/data/db2inst1
# ln -s /IBM/data/db2inst1 /home/db2inst1
# chown db2inst1.db2grp1 /home/db2inst1

Heartbeat clustering software installation

With DB2 installed, we now install and configure the heartbeat clustering solution to manage our storage, application, and IP address. We install the following packages into our two systems (lpar1 and lpar3). The packages are available in the SLES 8 CD:

# rpm -ivh heartbeat-ldirectord-1.0.4-0.rpm
# rpm -ivh heartbeat-1.0.4-0.rpm
# rpm -ivh heartbeat-stonith-1.0.4-0.rpm

In the heartbeat clustering, there are three major configuration files:

  • /etc/ha.d/authkeys
  • /etc/ha.d/ha.cf
  • /etc/ha.d/haresources

In the following sections, we describe each file.

/etc/ha.d/authkeys

The authkeys configuration file specifies the secret authentication key, which must be identical on both nodes in the cluster. Heartbeat offers several authentication methods (crc, md5, and sha1). In our setup, we use md5. The authkeys file is shown in Example 8-19.

Example 8-19 /etc/ha.d/authkeys

lpar1:/etc/ha.d # cat authkeys
# key for the cluster is linuxforp
auth 3
3 md5 linuxforp

The authkeys file must be readable and writable only by the root user; otherwise, the heartbeat software fails right away:

# chmod 600 /etc/ha.d/authkeys
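
A fixed, guessable key such as the one in Example 8-19 is fine for a test setup, but a random key is a safer choice. The following is a minimal sketch of generating an authkeys file with a random md5 key and restrictive permissions. The function name write_authkeys and the use of key index 1 are assumptions made for this illustration; the target directory is a parameter so that the sketch can be tried outside /etc/ha.d.

```shell
#!/bin/sh
# write_authkeys DIR: create DIR/authkeys with a random md5 key, mode 600.
# (Hypothetical helper; heartbeat itself only reads the resulting file.)
write_authkeys() {
    dir=$1
    # 16 random bytes rendered as 32 hexadecimal characters.
    key=$(od -An -tx1 -N16 /dev/urandom | tr -d ' \n')
    (
        umask 077    # the file must never be readable by other users
        printf 'auth 1\n1 md5 %s\n' "$key" > "$dir/authkeys"
    )
    chmod 600 "$dir/authkeys"
}

demo_dir=$(mktemp -d)        # stand-in for /etc/ha.d
write_authkeys "$demo_dir"
ls -l "$demo_dir/authkeys"
```

The generated file then has to be copied unchanged to the other node, because both nodes must present the same key.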

/etc/ha.d/ha.cf

The ha.cf file is the core configuration file that defines the nodes which are part of the clustering. In this file, we also define which link we use for the heartbeat, and the sequence of the heartbeat. Example 8-20 shows the ha.cf configuration file that we use in our configuration.

Example 8-20 /etc/ha.d/ha.cf

# Logs definition
debugfile /var/log/ha-debug
logfile /var/log/ha-log
logfacility local10
# HeartBeat Packets Configuration
keepalive 2 # time between each heartbeat
deadtime 30 # how long to declare dead
bcast eth1 # heartbeat communication link
# Resource Configuration
nice_failback on # this will turn on the feature cascading without fall-back
# Node Definition
node lpar1
node lpar3
/etc/ha.d/haresources

The haresources file lists the resources that you want the cluster to manage. Heartbeat looks in the /etc/ha.d/resource.d/ and /etc/init.d/ directories for the scripts that start the applications you specify in the haresources file. In our setup, we require a cluster IP address, automatic mounting of the external storage, and then the start of DB2.

Example 8-21 /etc/ha.d/haresources

lpar1 192.168.100.85 Filesystem::/dev/sdc1::/data/IBM/db2inst1::reiserfs db2::db2inst1

Based on the logic of how the application is started, we create the haresources configuration file. Example 8-21 shows the contents of the file we have. It tells heartbeat to make lpar1 the primary node with cluster IP address 192.168.100.85, then mount the file system at the mount point /data/IBM/db2inst1, and then start DB2 with the instance ID db2inst1.
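
To make the ordering in the haresources line concrete, the following sketch splits the line from Example 8-21 into the resource scripts and arguments that heartbeat runs. Resources are acquired from left to right and released in the opposite order. The function names start_order and stop_order are hypothetical; the mapping of a bare IP address to the IPaddr script follows what the ha-log output in this case study shows.

```shell
#!/bin/sh
# The resource line from Example 8-21.
line='lpar1 192.168.100.85 Filesystem::/dev/sdc1::/data/IBM/db2inst1::reiserfs db2::db2inst1'

# start_order LINE: print each resource as "script arg arg ..." in start order.
start_order() {
    set -- $1            # split the line on whitespace
    shift                # drop the preferred node name
    for res in "$@"; do
        case $res in
            *::*) echo "$res" | sed 's/::/ /g' ;;   # script::arg::arg
            *)    echo "IPaddr $res" ;;             # a bare address is handled by IPaddr
        esac
    done
}

# stop_order LINE: the same list reversed, because resources are released right to left.
stop_order() {
    start_order "$1" | awk '{ a[NR] = $0 } END { for (i = NR; i > 0; i--) print a[i] }'
}

echo "start:"; start_order "$line"
echo "stop:";  stop_order "$line"
```

The order matters for this setup: DB2 can only start once the file system that holds the instance is mounted, and it has to be stopped before that file system is unmounted.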

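After this is done, we customize the DB2 script located in /etc/ha.d/resource.d/ for DB2 8 ESE. Because DB2 ESE can run as a parallel database, it reads its node configuration from db2nodes.cfg, which names the host that the instance runs on. We therefore add lines to the script that copy the node-specific file into place before DB2 starts, as shown in Example 8-22.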

Example 8-22 Extract of /etc/ha.d/resource.d/db2

:
:
:
db2_start() {
    #### included for DB2 8.1 EEE
    # Put the db2nodes.cfg that belongs to this host in place before starting DB2
    NODENAME=`hostname`
    cp /home/db2inst1/sqllib/db2nodes.cfg.$NODENAME \
       /home/db2inst1/sqllib/db2nodes.cfg
    #### included for DB2 8.1 EEE
    if
        output=`runasdb2 $db2adm/db2start`
    then
        : Hurray! DB2 started OK
        ha_log "info: DB2 UDB instance $1 started: $output"
    else
        case $output in
            SQL1026N*|*"is already active"*)
                ha_log "info: DB2 UDB instance $1 already running: $output";;
            *)  ha_log "ERROR: $output"; return 1;;
        esac
    fi
:
:
:

We need to create the necessary db2nodes.cfg files for lpar1 and lpar3 inside /home/db2inst1/sqllib/:

  • /home/db2inst1/sqllib/db2nodes.cfg.lpar1 contains:
    - 0 lpar1 0
    
  • /home/db2inst1/sqllib/db2nodes.cfg.lpar3 contains:
    - 0 lpar3 0
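
The two node files can be generated rather than typed by hand. The sketch below writes one file per host and then performs the same copy that the modified db2_start function does. It assumes that the leading dash in the list above is only a list marker, so each file holds a single line with the partition number, the host name, and the logical port. The function names are hypothetical, and a temporary directory stands in for /home/db2inst1/sqllib.

```shell
#!/bin/sh
# write_db2nodes SQLLIB HOST...: create SQLLIB/db2nodes.cfg.HOST for each host.
write_db2nodes() {
    sqllib=$1
    shift
    for host in "$@"; do
        # partition number 0, host name, logical port 0
        printf '0 %s 0\n' "$host" > "$sqllib/db2nodes.cfg.$host"
    done
}

# activate_db2nodes SQLLIB HOST: the copy that the lines added to db2_start perform.
activate_db2nodes() {
    cp "$1/db2nodes.cfg.$2" "$1/db2nodes.cfg"
}

sqllib=$(mktemp -d)          # stand-in for /home/db2inst1/sqllib
write_db2nodes "$sqllib" lpar1 lpar3
activate_db2nodes "$sqllib" lpar3
cat "$sqllib/db2nodes.cfg"
```

Because the instance directory lives on the shared disk, both files are visible to whichever node currently holds the resources.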
    

Now that we have all the configuration files ready, we copy the configuration to lpar3:

# cd /etc/ha.d
# scp authkeys ha.cf haresources root@lpar3:/etc/ha.d/
# scp resource.d/db2 root@lpar3:/etc/ha.d/resource.d/

Next, we do a basic check on the cluster setup by using the BasicSanityCheck command. This command is found in the /usr/lib/heartbeat directory. It performs basic checks and writes any errors to the /tmp/linux-ha.testlog file.

Example 8-23 shows the BasicSanityCheck command that we ran during the creation of the configuration file for our cluster.

Example 8-23 BasicSanityCheck on the heartbeat cluster

leecy@lpar1:/usr/lib/heartbeat # ./BasicSanityCheck
Starting heartbeat
Starting High-Availability services
done

Reloading heartbeat

Reloading heartbeat
Stopping heartbeat
Stopping High-Availability services
done
Checking STONITH basic sanity.
Performing apphbd success case tests
Performing apphbd failure case tests
Starting IPC tests
1 errors. Log file is stored in /tmp/linux-ha.testlog

Testing the cluster

Once done, we test the cluster. We start the cluster by using the command ''/etc/init.d/heartbeat start'', as shown in Example 8-24.

Example 8-24 Starting the heartbeat cluster

lpar1:~ # /etc/init.d/heartbeat start
Starting High-Availability services
done
lpar1:~ #

We start the cluster on both nodes. The cluster IP address is automatically created on the primary node; at the same time, the file system is mounted and the application is started.

During this process, the /var/log/ha-log shows details of what is happening in the background. This log is also very useful for debugging if the resource fails to start. Example 8-25 shows the ha-log of our cluster with the resource successfully started.

Example 8-25 /var/log/ha-log output when cluster is up

# tail -f /var/log/ha-log
heartbeat: 2003/11/10_17:53:34 info: **************************
heartbeat: 2003/11/10_17:53:34 info: Configuration validated. Starting heartbeat 1.0.4
heartbeat: 2003/11/10_17:53:34 info: nice_failback is in effect.
heartbeat: 2003/11/10_17:53:34 info: heartbeat: version 1.0.4
heartbeat: 2003/11/10_17:53:34 info: Heartbeat generation: 30
heartbeat: 2003/11/10_17:53:34 info: UDP Broadcast heartbeat started on port 694 (694) interface eth1
heartbeat: 2003/11/10_17:53:34 info: pid 2795 locked in memory.
heartbeat: 2003/11/10_17:53:35 info: pid 2797 locked in memory.
heartbeat: 2003/11/10_17:53:35 info: pid 2799 locked in memory.
heartbeat: 2003/11/10_17:53:35 info: pid 2798 locked in memory.
heartbeat: 2003/11/10_17:53:35 info: Local status now set to: 'up'
heartbeat: 2003/11/10_17:53:36 info: Link lpar1:eth1 up.
heartbeat: 2003/11/10_17:54:05 WARN: node lpar3: is dead
heartbeat: 2003/11/10_17:54:05 WARN: No STONITH device configured.
heartbeat: 2003/11/10_17:54:05 WARN: Shared resources (storage!) are not protected!
heartbeat: 2003/11/10_17:54:05 info: Resources being acquired from lpar3.
heartbeat: 2003/11/10_17:54:05 info: Local status now set to: 'active'
heartbeat: 2003/11/10_17:54:05 info: Running /etc/ha.d/rc.d/status status
heartbeat: 2003/11/10_17:54:05 info: /usr/lib/heartbeat/mach_down: nice_failback: acquiring foreign resources
heartbeat: 2003/11/10_17:54:05 info: mach_down takeover complete.
heartbeat: 2003/11/10_17:54:05 info: mach_down takeover complete for node lpar3.
heartbeat: 2003/11/10_17:54:05 info: Resource acquisition completed.
heartbeat: 2003/11/10_17:54:05 info: Running /etc/ha.d/rc.d/ip-request-resp ip-request-resp
heartbeat: 2003/11/10_17:54:05 received ip-request-resp 192.168.100.85 OK yes
heartbeat: 2003/11/10_17:54:05 info: Acquiring resource group: lpar1 192.168.100.85 filesystem::/dev/sdb1::/data/IBM/db2inst1::reiserfs db2::db2inst1
heartbeat: 2003/11/10_17:54:05 info: Running /etc/ha.d/resource.d/IPaddr 192.168.100.85 start
heartbeat: 2003/11/10_17:54:05 info: /sbin/ifconfig eth0:0 192.168.100.85 netmask 255.255.255.0 broadcast 192.168.100.255
heartbeat: 2003/11/10_17:54:05 info: Sending Gratuitous Arp for 192.168.100.85 on eth0:0 [eth0]
heartbeat: 2003/11/10_17:54:05 /usr/lib/heartbeat/send_arp eth0 192.168.100.85 0002553A068C 192.168.100.85 ffffffffffff
heartbeat: 2003/11/10_17:54:05 info: Running /etc/ha.d/resource.d/Filesystem /dev/sdb1 /data/IBM/db2inst1 reiserfs start
heartbeat: 2003/11/10_17:54:06 info: Running /etc/ha.d/resource.d/db2 db2inst1 start
heartbeat: 2003/11/10_17:54:07 /usr/lib/heartbeat/send_arp eth0 192.168.100.85 0002553A068C 192.168.100.85 ffffffffffff
heartbeat: 2003/11/10_17:54:08 info: DB2 UDB instance db2inst1 started: 11-10-2003 17:54:08 0 0 SQL1063N DB2START processing was successful.
SQL1063N DB2START processing was successful.
heartbeat: 2003/11/10_17:54:09 /usr/lib/heartbeat/send_arp eth0 192.168.100.85 0002553A068C 192.168.100.85 ffffffffffff
heartbeat: 2003/11/10_17:54:11 /usr/lib/heartbeat/send_arp eth0 192.168.100.85 0002553A068C 192.168.100.85 ffffffffffff
heartbeat: 2003/11/10_17:54:13 /usr/lib/heartbeat/send_arp eth0 192.168.100.85 0002553A068C 192.168.100.85 ffffffffffff
heartbeat: 2003/11/10_17:54:17 info: Local Resource acquisition completed.
(none)
heartbeat: 2003/11/10_17:54:17 info: local resource transition completed.
heartbeat: 2003/11/10_17:54:28 info: Link lpar3:eth1 up.
heartbeat: 2003/11/10_17:54:28 info: Status update for node lpar3: status up
heartbeat: 2003/11/10_17:54:28 info: Running /etc/ha.d/rc.d/status status
heartbeat: 2003/11/10_17:54:29 info: Status update for node lpar3: status active
heartbeat: 2003/11/10_17:54:29 info: Running /etc/ha.d/rc.d/status status

Next, we power off lpar1. Once lpar3 declares lpar1 dead, the resources fail over to lpar3 and the cluster IP address is created there as well, as shown in Figure 8-9.

[Figure 8-9 (image missing)]

The entries in /var/log/ha-log show that lpar1 has failed and that the resources fail over to lpar3. Example 8-26 shows /var/log/ha-log during our test.

heartbeat: 2003/11/10_19:32:16 WARN: node lpar1: is dead
heartbeat: 2003/11/10_19:32:16 WARN: No STONITH device configured.
heartbeat: 2003/11/10_19:32:16 WARN: Shared resources (storage!) are not protected!
heartbeat: 2003/11/10_19:32:16 info: Resources being acquired from lpar1.
heartbeat: 2003/11/10_19:32:16 info: Link lpar1:eth1 dead.
heartbeat: 2003/11/10_19:32:16 info: Running /etc/ha.d/rc.d/status status
heartbeat: 2003/11/10_19:32:16 info: No local resources [/usr/lib/heartbeat/ResourceManager listkeys lpar3]
heartbeat: 2003/11/10_19:32:16 info: Resource acquisition completed.
heartbeat: 2003/11/10_19:32:16 info: Taking over resource group 192.168.100.85
heartbeat: 2003/11/10_19:32:16 info: Acquiring resource group: lpar1 192.168.100.85 Filesystem::/dev/sdb1::/data/IBM/db2inst1::reiserfs db2::db2inst1
heartbeat: 2003/11/10_19:32:16 info: Running /etc/ha.d/resource.d/IPaddr 192.168.100.85 start
heartbeat: 2003/11/10_19:32:17 info: /sbin/ifconfig eth0:0 192.168.100.85 netmask 255.255.255.0 broadcast 192.168.100.255
heartbeat: 2003/11/10_19:32:17 info: Sending Gratuitous Arp for 192.168.100.85 on eth0:0 [eth0]
heartbeat: 2003/11/10_19:32:17 /usr/lib/heartbeat/send_arp eth0 192.168.100.85 0002553A0619 192.168.100.85 ffffffffffff
heartbeat: 2003/11/10_19:32:17 info: Running /etc/ha.d/resource.d/Filesystem /dev/sdb1 /data/IBM/db2inst1 reiserfs start
heartbeat: 2003/11/10_19:32:18 info: Running /etc/ha.d/resource.d/db2 db2inst1 start
heartbeat: 2003/11/10_19:32:19 /usr/lib/heartbeat/send_arp eth0 192.168.100.85 0002553A0619 192.168.100.85 ffffffffffff
heartbeat: 2003/11/10_19:32:21 /usr/lib/heartbeat/send_arp eth0 192.168.100.85 0002553A0619 192.168.100.85 ffffffffffff
heartbeat: 2003/11/10_19:32:21 info: DB2 UDB instance db2inst1 started: 11-10-2003 19:32:21 0 0 SQL1063N DB2START processing was successful.
SQL1063N DB2START processing was successful.
heartbeat: 2003/11/10_19:32:23 /usr/lib/heartbeat/send_arp eth0 192.168.100.85 0002553A0619 192.168.100.85 ffffffffffff
heartbeat: 2003/11/10_19:32:25 /usr/lib/heartbeat/send_arp eth0 192.168.100.85 0002553A0619 192.168.100.85 ffffffffffff
heartbeat: 2003/11/10_19:32:27 info: mach_down takeover complete for node lpar1.
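
The timestamps in the log make it possible to measure how long the takeover itself takes. The following sketch computes the number of seconds between the "is dead" message and the "mach_down takeover complete" message for a given node. The function name takeover_seconds is hypothetical, the sample input is a three-line excerpt of the log above, and the calculation assumes that both events fall on the same day.

```shell
#!/bin/sh
# takeover_seconds NODE: read ha-log lines on stdin and print the seconds between
# "node NODE: is dead" and "mach_down takeover complete for node NODE".
takeover_seconds() {
    awk -v node="$1" '
        # stamp looks like 2003/11/10_19:32:16; convert the time part to seconds
        function secs(stamp,   t) {
            split(substr(stamp, index(stamp, "_") + 1), t, ":")
            return t[1] * 3600 + t[2] * 60 + t[3]
        }
        index($0, "node " node ": is dead")                     { dead = secs($2) }
        index($0, "mach_down takeover complete for node " node) { finished = secs($2) }
        END { if (dead != "" && finished != "") print finished - dead }
    '
}

takeover_seconds lpar1 <<'EOF'
heartbeat: 2003/11/10_19:32:16 WARN: node lpar1: is dead
heartbeat: 2003/11/10_19:32:21 info: DB2 UDB instance db2inst1 started:
heartbeat: 2003/11/10_19:32:27 info: mach_down takeover complete for node lpar1.
EOF
```

For the log above this gives 11 seconds. The figure covers only the takeover after the node has been declared dead; the deadtime configured in ha.cf elapses before that first message is written.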

Now we are confident that our cluster is working as designed, so we add heartbeat to the startup scripts on both nodes with the following command:

# chkconfig heartbeat 3

Extending the cluster with Apache and PHP for front-end

Using LVS to load balance Web servers

Final landscape of the setup

In the final setup, we have two servers running ldirectord that load balance two Web servers, and two clustered database servers. This is a fully redundant solution in which both Web servers actively talk to the database servers. The final setup looks like the diagram in Figure 8-15.

[Figure 8-15 (image missing)]

Alternate solutions to improve the cluster

In this section, we discuss alternate solutions to improve the cluster.

Other disk solution

One critical improvement that we would like to see in the cluster is the use of Fibre Channel disks instead of SCSI disks. The IBM Fibre Channel adapter (F/C 6228) is compatible with the lpfcdd driver.

We load it by running the command:

# modprobe lpfcdd

After it is loaded, we can create the disk partitions and file systems using steps similar to those used in the SCSI solution; refer to "Disk Setup".

Other software stack

Besides using Apache and PHP, we can also use WebSphere as the middleware. WebSphere is a J2EE application server supported on Linux for pSeries, and it offers a whole suite of applications and portlets.

Other clustering solutions

  • Distributed Replicated Block Device (DRBD)
    DRBD is designed to mirror a whole block device over a network. It takes the data written on one node and ships it across the network to the other node, where it is written to disk. This is essentially network RAID-1 for storage. It is well suited to Web servers, FTP servers, and database servers that need data reliability. Using DRBD with a clustering solution gives the standby node up-to-date data through real-time mirroring.
  • General Parallel File System (GPFS)
    GPFS is a shared-disk file system that provides data access for GPFS clients. The data can be spread across multiple GPFS nodes, and files can be accessed concurrently from multiple nodes. GPFS provides high performance and availability through logging and replication. It also offers options that can be configured for failover from both disk and server malfunctions.
  • Other clustering solution
    While heartbeat is good for a small cluster, it lacks many of the enterprise features needed to handle disk quorum, different types of clusters, and so on. Enterprise customers can also try Tivoli System Automation.
    Heartbeat is the cluster solution bundled with SLES 8. For any other open source clustering solution, you might need to compile the software manually to run it on SLES 8.


 