In most environments, customers and users are concerned about system uptime, especially for the applications they run every day. In this case study, we discuss the high availability solution that is bundled with SUSE Linux Enterprise Server (SLES) 8 and the IBM Data Management Solution, DB2®. While this setup focuses on database failover, the same approach applies to other services.
Hardware and software components
In this case study, we have two logical partitions (LPARs), each with two processors (POWER4+ at 1.45 GHz), 4 GB of memory, and one 36.4 GB internal disk drive. In addition, the following adapters and peripherals are assigned to the systems:
- 2 network interfaces (FC-4962 and FC-5700)
- RJ45 UTP cross-cable
- SCSI controller (FC-6203)
- JBOD SCSI disk drawer (2104-DU3)
Each system is installed with SLES8 SP1, using the 64-bit 2.4.21-83 kernel.
The software stack used in this setup is:
- Heartbeat-1.0.4-0 clustering stack
- IBM DB2 8.1
Heartbeat is an open source clustering solution written by Alan Robertson, which provides basic clustering functions. Heartbeat monitors the cluster over either network or serial connections. It is also bundled with scripts to create cluster IP addresses and to manage Linux Virtual Server (LVS) and other applications.
In this case study, we set up a cluster with a DB2 database on the external disk and a cluster IP address for remote connections. Figure 8-3 shows the cluster that we will be setting up.
HEARTBEAT SETUP PICTURE MISSING !!!
Preparing the nodes for the cluster
As with most high availability clusters, our setup has external storage for disk takeover, network adapters, a dedicated LAN for heartbeats, and so on. We call the servers that we are attaching to the disk lpar1 and lpar3; lpar1 is the primary server, and lpar3 is the failover server.
Network Setup
First we assign IP addresses to the systems. We assign eth0 to be on the public LAN, and eth1 becomes the heartbeat LAN. In this setup, we are limited by the available adapters.
We recommend that you have a serial connection between the two servers as well. This ensures that you will have a non-IP-based heartbeat.
- lpar1
  - eth0 - 192.168.100.77
  - eth1 - 10.10.10.77
- lpar3
  - eth0 - 192.168.100.79
  - eth1 - 10.10.10.79
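As a minimal sketch of the address plan above, the following shell function prints the `ifconfig` commands for each node. The /24 netmask and the function name `node_net_commands` are assumptions and not part of the original setup; addresses set this way also do not persist across a reboot.

```shell
# Print the ifconfig commands for one node of the cluster.
# Addresses come from the list above; the netmask is an assumption.
node_net_commands() {
    case "$1" in
        lpar1) pub=192.168.100.77; hb=10.10.10.77 ;;
        lpar3) pub=192.168.100.79; hb=10.10.10.79 ;;
        *) echo "unknown node: $1" >&2; return 1 ;;
    esac
    echo "ifconfig eth0 $pub netmask 255.255.255.0 up"   # public LAN
    echo "ifconfig eth1 $hb netmask 255.255.255.0 up"    # heartbeat LAN
}

node_net_commands lpar1
```

Piping the output to `sh` as root would apply the settings on that node.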
Disk Setup
The data disk is the most critical component in almost any clustering solution. Shared data storage ensures that the application reads the latest data, whichever node it runs on.
In our cluster, we are using the IBM SCSI-based external disk storage solution, also known as the 2104-DU3, with the IBM Ultra3 SCSI adapters. The 2104-DU3 can be set up in either a single-bus or a twin-bus configuration. A cluster setup requires the single-bus configuration. Figure 8-4 explains the single bus, dual-host configuration.
SCSI PICTURE MISSING, page 393
For information on how to change the storage to a single bus, please refer to the 2104-DU3 Installation Guide, GA33-3311.
After setting up the storage for single bus, we need to change the SCSI ID of the adapters in the servers. Each SCSI adapter defaults to SCSI ID 7, which creates a conflict on the shared bus if both servers are booted at the same time. Therefore, we change the lpar1 SCSI ID to 5 and the lpar3 SCSI ID to 6. The SCSI driver for the IBM FC-6203 SCSI adapter is "sym53c8xx". This driver is compiled natively into the SLES8 Linux kernel.
We try out the new SCSI ID by entering the following string into the yaboot prompt:
After the system boots, we make the SCSI ID change permanent by appending the above parameters to the kernel image, so that they are always applied when the kernel loads.
Obtain the ''mkzimage_command'' from the /ppc/netboot/ directory inside your SLES 8 CD:
Next, update the /etc/lilo.conf file with the new image, as shown in Example 8-15.
Example 8-15 Updating LILO configuration file with new kernel
Then we run the ''lilo'' command to install the new kernel image and reboot the system, and we repeat the procedure on lpar3 with SCSI ID 6.
With both lpar1 and lpar3 capable of seeing the storage, we now create the disk partitions for storing our data. To create the new partition, we use the command ''fdisk''.
In Example 8-16, we create a 10 GB disk partition on the external storage. Once the partition is created, it is instantly visible to lpar3 as well. We run the command ''fdisk -l'' to check.
Example 8-16 Creating a new 10 GB partition in the newly added disk
Now that we have the disk up and running, we create the file system and directories for our application:
Note: We are using IBM DB2 as our application.
Once we mount the file system, we can proceed to install our application.
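The file system preparation can be sketched as follows. The device `/dev/sdc1`, the mount point `/data/IBM/db2inst1`, and the reiserfs type are taken from the haresources entry used later in this case study; the helper names `run` and `prepare_data_fs` and the `APPLY` switch are hypothetical. By default the commands are only printed, because `mkreiserfs` destroys any data on the partition.

```shell
# Device and mount point as used in the haresources entry of this setup.
DEV=/dev/sdc1
MNT=/data/IBM/db2inst1

# Print each command; execute it only when APPLY=1 is set.
run() {
    echo "+ $*"
    if [ "${APPLY:-0}" = "1" ]; then "$@"; fi
}

prepare_data_fs() {
    run mkreiserfs "$DEV"                  # create the file system (destructive)
    run mkdir -p "$MNT"                    # create the mount point
    run mount -t reiserfs "$DEV" "$MNT"    # mount the external disk
}

prepare_data_fs
```

The file system is created once, from one node only; the mount point directory has to exist on both nodes.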
Application Installation
Prior to installing IBM DB2, we run through the hardware and software requirements:
- 650 MB of disk space for full installation
- IBM JDK 1.3.1 for DB2 Control Center
- Flex-2.5.4a-39
- Web browser for online help
At the following site, you can check the tested kernel against the DB2 release that you are going to install:
http://www.ibm.com/db2/linux/validate
Once all the software and hardware requirements are met, we proceed to the installation of DB2.
DB2 Installation
After mounting the CD into the CD-ROM drive, we run the command ''db2setup'' in the root of the CD-ROM directory. A screen appears, as shown in Figure 8-5 on page 381.
DB2 INSTALL PICTURE MISSING
We select '''Install Products'''. This presents us with the products that we can install on the server. Figure 8-6 on page 382 shows the types of installation choices available.
DB2 INSTALL PICTURE MISSING
We select '''DB2 UDB Enterprise Server Edition''' and click Next. In the next screen, we are asked to accept the IBM DB2 user license agreement, then we click Next.
Then we are prompted to select the type of DB2 UDB ESE we want to install (Typical, Compact, Custom), as shown in Figure 8-7 on page 383. We select Typical and proceed.
DB2 INSTALL PICTURE MISSING
Next, when prompted, we create the necessary DB2 IDs. At this stage, we are still using the internal disk to store DB2. We will move the database to the external storage once we have both nodes installed.
Once we have made our selections, the installation starts. After it finishes, we check the post-installation report to make sure all components are installed. Figure 8-8 on page 384 shows the sample post-installation report.
DB2 INSTALL PICTURE MISSING
Now that DB2 is installed, we need to create a database to use. We plan to use the sample database bundled with DB2. To load the sample database, we run the ''db2sampl'' command as the db2inst1 user.
Next, we disable DB2 from starting automatically when the system is booted up. We comment out the DB2 entry inside the /etc/inittab file. We want the clustering solution to automatically bring up DB2 for us, instead.
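A minimal sketch of this step, assuming that the DB2 entry in /etc/inittab is the fault monitor line that invokes `db2fmcd` (check the actual entry on your system before using it). The function name `comment_out_db2` is hypothetical; it takes the file path as an argument so that it can be tried on a copy first, and it keeps a backup with a `.bak` suffix.

```shell
# Comment out every inittab line that starts the DB2 fault monitor.
# Lines that are already commented out are left untouched.
comment_out_db2() {
    sed -i.bak '/db2fmcd/ s/^[^#]/#&/' "$1"
}

# Usage (as root), followed by asking init to re-read its configuration:
#   comment_out_db2 /etc/inittab
#   telinit q
```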
Now that DB2 is properly set up, we test the sample database that we loaded. We connect to the database locally and do a simple query. Example 8-17 shows a successful connection to the sample database.
Example 8-17 Testing the connection to the DB2 sample database
Next, we do a simple query to make sure that we can query the database. Example 8-18 shows a simple query of the local database.
Example 8-18 Querying the DB2 sample database
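The connection and query test can be sketched with the DB2 command line processor as follows. The function name `smoke_test_sample` is hypothetical, and the query against the STAFF table of the SAMPLE database is only one possible check, not necessarily the query used in this case study.

```shell
# Connect to the SAMPLE database, run one query, and disconnect.
# Intended to be run as the instance owner (db2inst1 in this setup).
smoke_test_sample() {
    db2 connect to sample || return 1
    db2 "select count(*) from staff"
    rc=$?
    db2 connect reset
    return $rc
}

# Usage, as db2inst1:
#   smoke_test_sample
```

Running the same function again after a failover, on the node that has taken over, gives a quick check that the database is usable there as well.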
We now do a similar installation in lpar3. (Select the same directory as in lpar1 to store the user directories, because this will ensure that DB2 sets up the profile and directories properly.)
After the installation of DB2 on both nodes, we move the database that we created in lpar1 to the external disk. We mount the disk from external storage, copy the entire DB2 instance folder onto the external disk, and create a soft link. When the DB2 instance (db2inst1) is then loaded, it uses the external disk:
With lpar1 working, we now create a similar soft link in lpar3:
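The relocation on both nodes can be sketched as two small functions. The function names are hypothetical, and the paths in the usage comments (`/home/db2inst1` as the instance directory, `/data/IBM/db2inst1` as the mount point of the external disk) are assumptions based on paths that appear elsewhere in this case study. DB2 should be stopped while the directory is moved.

```shell
# Copy a directory to its new location, keep the original as a backup,
# and leave a soft link at the old path.
relocate_with_link() {
    src=$1    # directory on the internal disk
    dst=$2    # existing directory on the mounted external disk
    cp -a "$src/." "$dst/" || return 1
    mv "$src" "$src.orig" || return 1
    ln -s "$dst" "$src"
}

# On the second node the data is already on the shared disk,
# so only the link is needed.
link_only() {
    src=$1
    dst=$2
    mv "$src" "$src.orig" || return 1
    ln -s "$dst" "$src"
}

# Usage (as root):
#   lpar1: relocate_with_link /home/db2inst1 /data/IBM/db2inst1
#   lpar3: link_only /home/db2inst1 /data/IBM/db2inst1
```

On lpar3 the link points to a directory that is empty until the external disk is mounted there, which is the expected state of a standby node.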
Heartbeat clustering software installation
With DB2 installed, we now install and configure the heartbeat clustering solution to manage our storage, application, and IP address. We install the following packages into our two systems (lpar1 and lpar3). The packages are available in the SLES 8 CD:
Heartbeat has three major configuration files:
- /etc/ha.d/authkeys
- /etc/ha.d/ha.cf
- /etc/ha.d/haresources
In the following sections, we describe each file.
/etc/ha.d/authkeys
The authkeys configuration file specifies the secret authentication keys, which must be identical on both nodes in the cluster. Several authentication methods are available (crc, md5, and sha1). In our setup, we use md5. The authkeys file is shown in Example 8-19.
Example 8-19 /etc/ha.d/authkeys
The authkeys file must be readable only by the root user; otherwise, the heartbeat software will fail right away:
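As an illustration of the file format and the permission requirement, the sketch below writes an md5 authkeys file with a randomly generated key and restricts it to mode 600. The function name `write_authkeys` and the way the key is generated are assumptions; the key used in this case study is not reproduced here.

```shell
# Write a Heartbeat authkeys file that uses md5 with a random key.
# Format: "auth <index>" selects the key, "<index> <method> <key>" defines it.
write_authkeys() {
    target=$1
    key=$(dd if=/dev/urandom bs=16 count=1 2>/dev/null | md5sum | cut -d' ' -f1)
    # Create the file with a restrictive umask so the key is never world-readable.
    ( umask 077; printf 'auth 1\n1 md5 %s\n' "$key" > "$target" )
    chmod 600 "$target"
}

# Usage (as root):
#   write_authkeys /etc/ha.d/authkeys
```

The file has to be generated once and then copied to the other node, because both nodes need the same key.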
/etc/ha.d/ha.cf
The ha.cf file is the core configuration file that defines the nodes which are part of the cluster. In this file, we also define which link we use for the heartbeat, and the timing of the heartbeat. Example 8-20 shows the ha.cf configuration file that we use in our configuration.
Example 8-20 /etc/ha.d/ha.cf
# Logs definition
debugfile /var/log/ha-debug
logfile /var/log/ha-log
logfacility local0
# HeartBeat packets configuration
keepalive 2 # seconds between heartbeats
deadtime 30 # seconds of silence before a node is declared dead
bcast eth1 # heartbeat communication link
# Resource configuration
nice_failback on # this will turn on the feature cascading without fall-back
# Node definition
node lpar1
node lpar3
/etc/ha.d/haresources
The haresources file lists the resources that you want to be part of the cluster. Heartbeat looks in the /etc/ha.d/resource.d directory and the /etc/init.d/ directory for the scripts that start the applications you specify in the haresources file. In our setup, we require a cluster IP address, the external storage to be mounted automatically, and then DB2 to be started.
Example 8-21 /etc/ha.d/haresources
lpar1 192.168.100.85 Filesystem::/dev/sdc1::/data/IBM/db2inst1::reiserfs db2::db2inst1
Based on the logic of how the application is started, we create the haresources configuration file. Example 8-21 shows the contents of the file we have. It tells heartbeat to make lpar1 the primary node with cluster IP address 192.168.100.85, then mount the file system at the mount point /data/IBM/db2inst1, and then start DB2 with the instance ID db2inst1.
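To make the structure of the haresources entry explicit, the following sketch splits it into the preferred node and the ordered list of resource scripts with their arguments. The function name `describe_haresources` is hypothetical; the sketch only illustrates the syntax and is not part of heartbeat.

```shell
# Break a haresources entry into the node name and the ordered
# list of resource scripts with their arguments.
describe_haresources() {
    # Word-split the entry: the first word is the preferred node,
    # the remaining words are resources, started left to right.
    set -- $1
    echo "node=$1"
    shift
    for res in "$@"; do
        case "$res" in
            # script::arg1::arg2 form
            *::*) script=${res%%::*}; args=${res#*::} ;;
            # A bare IP address is shorthand for the IPaddr script.
            [0-9]*.[0-9]*.[0-9]*.[0-9]*) script=IPaddr; args=$res ;;
            # Any other bare word is a script without arguments.
            *) script=$res; args= ;;
        esac
        echo "script=$script args=$(printf '%s' "$args" | sed 's/::/ /g')"
    done
}

describe_haresources 'lpar1 192.168.100.85 Filesystem::/dev/sdc1::/data/IBM/db2inst1::reiserfs db2::db2inst1'
```

Heartbeat starts the resources in the order listed and stops them in the reverse order, which is why the IP address and the file system come before DB2.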
After this is done, we customize the DB2 script located inside /etc/ha.d/resource.d/ for DB2 8 ESE. Because DB2 ESE supports partitioned (parallel) databases, we add the lines shown in Example 8-22 to the script.
Example 8-22 Extract of /etc/ha.d/resource.d/db2
We need to create the necessary db2nodes.cfg files for lpar1 and lpar3 inside /home/db2inst1/sqllib/:
- /home/db2inst1/sqllib/db2nodes.cfg.lpar1 contains:
- /home/db2inst1/sqllib/db2nodes.cfg.lpar3 contains:
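One plausible way for a resource script to use these per-host files is to copy the file that matches the local host name over db2nodes.cfg before DB2 starts. The sketch below is an assumption about what such logic needs to do, not the extract referred to as Example 8-22, and the function name `select_db2nodes` is hypothetical. Each db2nodes.cfg line generally has the form partition number, host name, logical port.

```shell
# Activate the per-host db2nodes.cfg before DB2 starts.
# The sqllib directory and the host name are parameters so that the
# function can be tried against a scratch directory.
select_db2nodes() {
    sqllib=$1
    host=${2:-$(uname -n)}
    src="$sqllib/db2nodes.cfg.$host"
    if [ ! -f "$src" ]; then
        echo "missing $src" >&2
        return 1
    fi
    cp "$src" "$sqllib/db2nodes.cfg"
}

# Usage, as the instance owner on the node taking over:
#   select_db2nodes /home/db2inst1/sqllib
```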
Now that we have all the configuration files ready, we copy the configuration to lpar3:
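The copy step can be sketched as below. Using `scp` as root is an assumption (any file transfer method works), and the function name `ha_sync_commands` is hypothetical. The function only prints the commands so that they can be reviewed first.

```shell
# Print the commands that copy the heartbeat configuration and the
# customized DB2 resource script to the peer node.
ha_sync_commands() {
    peer=$1
    for f in authkeys ha.cf haresources resource.d/db2; do
        # -p preserves the file mode, so authkeys stays readable by root only.
        echo "scp -p /etc/ha.d/$f root@$peer:/etc/ha.d/$f"
    done
}

ha_sync_commands lpar3
```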
Next, we do a basic check on the cluster setup by using the command BasicSanityCheck. This command is found in the directory /usr/lib/heartbeat. It performs basic checks and writes any errors to the /tmp/linux-ha.testlog file.
Example 8-23 on page 389 shows the BasicSanityCheck command that we ran during the creation of the configuration file for our cluster.
Example 8-23 BasicSanityCheck on the heartbeat cluster
Testing the cluster
Once done, we test the cluster. We start the cluster by using the command ''/etc/init.d/heartbeat start'', as shown in Example 8-24.
Example 8-24 Starting the heartbeat cluster
We start the cluster on both nodes and see that the cluster IP address is automatically created on the primary node. At the same time, the file system is mounted and the application is started.
During this process, the /var/log/ha-log shows details of what is happening in the background. This log is also very useful for debugging if the resource fails to start. Example 8-25 shows the ha-log of our cluster with the resource successfully started.
Example 8-25 /var/log/ha-log output when cluster is up
Next, we power off lpar1. Once lpar3 declares lpar1 dead, the resources fail over to lpar3 and the cluster IP address is created there as well, as shown in Figure 8-9.
DB2 FAILOVER PICTURE MISSING
The entries in /var/log/ha-log show that lpar1 has failed and that the resources fail over to lpar3. Example 8-26 on page 392 shows /var/log/ha-log during our test.
Now we are confident that our cluster is working as designed, so we add heartbeat to the startup scripts on both nodes with the following command:
Extending the cluster with Apache and PHP for front-end
Using LVS to load balance Web servers
Final landscape of the setup
In the final setup, we have two servers running ldirectord that load balance two Web servers, plus two clustered database servers. This is a fully redundant solution where both Web servers are actively talking to the database servers. The final setup looks like the diagram in Figure 8-15 on page 405.
MISSING FIGURE 8-15
Alternate solutions to improve the cluster
In this section, we discuss alternate solutions to improve the cluster.
Other disk solution
One critical improvement that we would like to see in the cluster is the use of Fibre Channel disks instead of SCSI disks. The IBM Fibre Channel adapter (F/C 6228) is compatible with the lpfcdd driver.
We load it by running the command:
After it is loaded, we can create the disk partitions and file systems using steps that are similar to those used in the SCSI solution; refer to "Disk setup" on page 377.
Other software stack
Besides using Apache and PHP, we can also use WebSphere as the middleware. WebSphere is a J2EE application server supported on Linux for pSeries, and it offers a whole suite of applications and portlets.
Other clustering solutions
- Distributed Replicated Block Device (DRBD)
DRBD is designed to mirror a whole block device over a network. It takes the data and ships it across the network to the other node, where it is written to disk. This is essentially network RAID-1 for storage. It is ideal for Web servers, FTP servers, and database servers that need data reliability. Using DRBD with a clustering solution gives you the most recent data on both nodes through real-time mirroring.
- General Parallel File System (GPFS)
GPFS is a shared-disk file system that provides data access for GPFS clients. The data can be spanned across multiple GPFS nodes and the files can be accessed concurrently from multiple nodes. GPFS provides great performance and availability through logging and replication. It also offers options which can be configured for failover from both disk and server malfunctions.
- Other clustering solution
While heartbeat is good for a small cluster, it lacks many of the enterprise features needed to handle disk quorum, different types of clusters, and so on. Enterprise customers can also consider Tivoli System Automation.
Heartbeat is the cluster solution bundled with SLES 8. Other open source clustering solutions might have to be compiled manually to run on SLES 8.