Dynamic server provisioning with xCAT and TORQUE
This article describes a solution for building a dynamically provisioned high-performance computing (HPC) cluster using the Extreme Cloud Administration Toolkit (xCAT) and the Terascale Open-source Resource and QUEue Manager (TORQUE) open source packages. xCAT is a leading solution for dynamically provisioning compute, storage, and network resources. TORQUE is a workload and resource-management system that manages batch jobs and compute nodes and schedules the execution of those jobs.
We build a cluster where nodes are provisioned with xCAT and on which batch jobs are managed and executed by TORQUE. On top of xCAT and TORQUE, we build a provisioning agent that makes the cluster adaptive, meaning that the compute nodes of the cluster are dynamically provisioned with the execution environment that the jobs require.
Architecture of the adaptive cluster
The architecture of the dynamic cluster we are building is shown in Figure 1, where the xCAT cluster consists of a management node and several compute nodes. The compute nodes are provisioned by the xCAT server running on the management node. The management node also runs the TORQUE server and scheduler daemons, as well as several services required to manage the compute nodes with xCAT, including DNS, DHCP, TFTP, and NFS.
The compute nodes run jobs that the TORQUE server dispatches and the TORQUE job-execution daemon running on each compute node starts. The provisioning agent examines the workload and the node configuration, and decides which nodes need to be provisioned to provide the execution environment that the jobs require.
Figure 1. The adaptive cluster
For a small cluster, a single management node can provide the bandwidth required to provision all the compute nodes. For larger clusters, a hierarchical approach is needed, with the management node being connected to two or more service nodes and the compute nodes being provisioned from the service nodes.
For the purpose of this article, consider a small cluster that has one management node —xcat1— and two compute nodes —xcat2 and xcat3— that are connected to xcat1 via the Ethernet switch xcat-switch, as shown in Figure 2. The servers used each have a dual-processor Intel® Xeon® x86_64 architecture, 2 GB of memory, 73 GB of disk capacity, and Ethernet interfaces that support Preboot eXecution Environment (PXE) booting. The management node runs CentOS release 5.4.
Figure 2. Cluster components and networking
The diskless method of provisioning the compute nodes is used, whereby the nodes boot from the management node. Specifically, PXE-based network booting is used.
Configure the management node
Before installing xCAT, configure the management node so xCAT is installed correctly and fetches the correct information about the cluster. This section shows the configuration actions you perform on the management node before installing xCAT.
Set up networking and host definitions
The management node xcat1 is connected to the public network 192.168.17.0 (called extnet) and to the cluster network 192.168.112.0 (called cluster). We use static IP addresses for both interfaces: The public network interface eth0 has the IP address 192.168.17.201; the cluster network interface eth1 has the IP address 192.168.112.1. Listing 1 shows the configured network interfaces, where virbr0 is useful for virtualization, but not for the setup discussed in this article.
Listing 1. Network interfaces
xcat1 # ifconfig -a | egrep -A1 '^[a-z]' | grep -v "\--"
eth0      Link encap:Ethernet  HWaddr 00:11:43:DF:0E:A8
          inet addr:192.168.17.201  Bcast:192.168.17.255  Mask:255.255.255.0
eth1      Link encap:Ethernet  HWaddr 00:11:43:DF:0E:A9
          inet addr:192.168.112.1  Bcast:192.168.112.255  Mask:255.255.255.0
lo        Link encap:Local Loopback
          inet addr:127.0.0.1  Mask:255.0.0.0
sit0      Link encap:IPv6-in-IPv4
          NOARP  MTU:1480  Metric:1
virbr0    Link encap:Ethernet  HWaddr 00:00:00:00:00:00
          inet addr:192.168.122.1  Bcast:192.168.122.255  Mask:255.255.255.0
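For reference, the static address of the cluster-side interface is kept in the usual Red Hat-style network-scripts file. A minimal version of that file might look like the following (the exact contents on your system may differ):

xcat1 # more /etc/sysconfig/network-scripts/ifcfg-eth1
DEVICE=eth1
BOOTPROTO=static
IPADDR=192.168.112.1
NETMASK=255.255.255.0
ONBOOT=yes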
The file /etc/sysconfig/network, which defines the host name, and the file /etc/hosts, which defines the local host lookup table, are shown in Listing 2, where the Intelligent Platform Management Interfaces (IPMIs) of the compute nodes xcat2 and xcat3 are xcat2i and xcat3i, respectively. xCAT uses the IPMI to power-cycle and boot the compute nodes. Check the local host short and full name using the hostname command, as shown in the listing.
Listing 2. Host definitions
xcat1 # more /etc/sysconfig/network
NETWORKING=yes
NETWORKING_IPV6=no
HOSTNAME=xcat1
xcat1 # more /etc/hosts
127.0.0.1       localhost.localdomain localhost
::1             localhost6.localdomain6 localhost6
192.168.17.201  xcat1.extnet
192.168.17.202  xcat1i.extnet
192.168.112.1   xcat1.cluster xcat1
192.168.112.100 xcat-switch
192.168.112.102 xcat2
192.168.112.103 xcat3
192.168.112.202 xcat2i
192.168.112.203 xcat3i
xcat1 # hostname -s
xcat1
xcat1 # hostname -f
xcat1.cluster
Set up DNS
The resolver configuration file /etc/resolv.conf, shown in Listing 3, defines as the primary server the management node 192.168.112.1 (xcat1) and an external server as a secondary server. (We set up the name server 192.168.112.1 using xCAT in the "Configure DNS and DHCP" section.) The default setup of the named service on CentOS V5.4 is to use bind-chroot. Because xCAT expects the named service not to chroot, remove the bind-chroot package, as shown in the listing.
Listing 3. DNS setup
xcat1 # more /etc/resolv.conf
search cluster extnet
nameserver 192.168.112.1
nameserver 126.96.36.199
xcat1 # rpm -q bind-chroot
bind-chroot-9.3.6-4.P1.el5_4.2
xcat1 # rpm -e bind-chroot
Additional settings
The Security-enhanced Linux® (SELinux) functionality should be disabled. Additionally, if the tftp-server package is installed, remove it, because xCAT requires the atftp package, which conflicts with tftp-server. To run TORQUE jobs, create a regular user, then NFS-export that user's home directory. Listing 4 shows how to perform these actions.
Listing 4. Additional settings
xcat1 # rpm -q tftp-server
tftp-server-0.49-2.el5.centos
xcat1 # rpm -e tftp-server
xcat1 # grep SELINUX= /etc/sysconfig/selinux | grep -v ^#
SELINUX=disabled
xcat1 # useradd -m -s /bin/bash -d /home/gabriel -u 101 -g users gabriel
xcat1 # grep /home /etc/exports
/home *(rw,no_root_squash,sync)
xcat1 # exportfs -a
xcat1 # showmount -e | grep /home
/home *
Install and configure xCAT
To make installation easy, xCAT provides RPMs for the common Linux distributions. Because xCAT is a database-driven package, you configure its actions by setting the tables in the xCAT database.
Install xCAT with yum
To install xCAT, install the yum configuration files for the xCAT
repository, then run
yum to install the xCAT packages and
their dependencies, as shown in Listing 5. Two repositories are
configured: the core repository and the platform-specific repository.
Listing 5. Install xCAT with yum
xcat1 # cd /etc/yum.repos.d
xcat1 # wget http://xcat.sourceforge.net/yum/xcat-core/xCAT-core.repo
xcat1 # wget http://xcat.sourceforge.net/yum/xcat-dep/rh5/x86_64/xCAT-dep.repo
xcat1 # yum clean metadata
xcat1 # yum install xCAT.x86_64
[...]
Complete!
xcat1 # source /etc/profile.d/xcat.sh
When the installation is complete, the xCAT and the TFTP services should be configured and running. Verify that by running the commands shown in Listing 6.
Listing 6. The xCAT service
xcat1 # chkconfig --list xcatd
xcatd           0:off   1:off   2:off   3:on    4:on    5:on    6:off
xcat1 # service xcatd status
xCAT service is running
xcat1 # chkconfig --list tftpd
tftpd           0:off   1:off   2:off   3:on    4:on    5:on    6:off
xcat1 # service tftpd status
atftpd service is running
Install the downloaded tools
Extract the ZIP file you downloaded and run the included install.sh script, which installs the following programs referred to in the remainder of this article:
- The image configuration tools gen_config_torque (installed in /opt/provisioner/bin) and install_torque-mom (installed in /install/postscripts), which you need to tune the image for TORQUE
- The setup_torque_config script, used to set up the TORQUE queue and execution nodes (installed in /opt/provisioner/bin)
- The setup_users script, which you'll run as an xCAT postscript (installed in /install/postscripts)
- The provisioning agent prov_agent, which performs dynamic provisioning (installed in /opt/provisioner/bin)
Set up the xCAT database tables
xCAT stores the cluster configuration in a database (the default database type is SQLite). As part of the installation process, the tables in the xCAT database are created and initialized. To change them, use the chtab command; to view them, use the tabdump command. This section describes the important settings in the xCAT tables. The complete contents of the tables are available in the Download section.
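For example, you can list the existing tables and inspect any one of them:

xcat1 # tabdump        # with no arguments, lists the table names
xcat1 # tabdump site   # dumps the site table in CSV format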
In the site table, set the dhcpinterfaces attribute to eth1, and make sure the domain, forwarders, master, and nameservers attributes are set as shown in Listing 7.
Listing 7. The site table
xcat1 # chtab key=dhcpinterfaces site.value=eth1
xcat1 # tabdump site | egrep "dhcp|domain|master|server|forward" | sort
"dhcpinterfaces","eth1",,
"domain","cluster",,
"forwarders","188.8.131.52",,
"master","192.168.112.1",,
"nameservers","192.168.112.1",,
Define the compute node group in the noderes table, and define the compute profile — a profile defines the packages included in a root image — in the nodetype table, as shown in Listing 8.
Listing 8. The noderes and nodetype tables
xcat1 # chtab node="compute" \ noderes.netboot=pxe \ noderes.tftpserver=192.168.112.1 \ noderes.nfsserver=192.168.112.1 \ noderes.installnic=eth0 \ noderes.primarynic=eth0 \ noderes.discoverynics=eth0 xcat1 # chtab node=compute \ nodetype.os=centos5 \ nodetype.arch=x86_64 \ nodetype.profile=compute \ nodetype.nodetype=osi xcat1 # chtab node=xcat2 nodetype.os=centos5 xcat1 # chtab node=xcat3 nodetype.os=centos5
Define the MAC addresses of the nodes in the mac table, then define the nodes in the nodelist table. Configure the hardware management of the nodes in the nodehm table, as shown in Listing 9.
Listing 9. The mac, nodelist, and nodehm tables
xcat1 # chtab node=xcat2 mac.interface=eth1 mac.mac="00:11:43:df:0f:09"
xcat1 # chtab node=xcat2i mac.interface=eth1 mac.mac="00:11:43:df:0f:0b"
xcat1 # chtab node=xcat-switch mac.interface=eth1 mac.mac="00:10:83:8d:5a:42"
xcat1 # chtab node=xcat3 mac.interface=eth1 mac.mac="00:11:43:df:0d:1d"
xcat1 # chtab node=xcat3i mac.interface=eth1 mac.mac="00:11:43:df:0d:1f"
xcat1 # chtab node=xcat2 nodelist.groups="compute,all"
xcat1 # chtab node=xcat3 nodelist.groups="compute,all"
xcat1 # chtab node=xcat2i nodelist.groups="bmc,all"
xcat1 # chtab node=xcat3i nodelist.groups="bmc,all"
xcat1 # chtab node=xcat-switch nodelist.groups="switch,all"
xcat1 # chtab node=xcat2 nodehm.power=ipmi nodehm.mgt=ipmi
xcat1 # chtab node=xcat3 nodehm.power=ipmi nodehm.mgt=ipmi
In the networks table, set the netname, gateway, and dhcpserver attributes for the networks 192.168.112.0 and 192.168.17.0, as shown in Listing 10, and remove the row for the 192.168.122.0 network.
Listing 10. The networks table
xcat1 # chtab net=192.168.112.0 networks.netname=cluster
xcat1 # chtab net=192.168.112.0 networks.gateway=192.168.112.1
xcat1 # chtab net=192.168.112.0 networks.dhcpserver=192.168.112.1
xcat1 # chtab net=192.168.17.0 networks.netname=extnet
xcat1 # chtab net=192.168.17.0 networks.gateway=192.168.17.1
xcat1 # chtab net=192.168.17.0 networks.tftpserver=""
xcat1 # tabdump networks | grep -v "192.168.122.0" > networks.csv
xcat1 # tabrestore networks.csv
Add the setup_users script to the
postscripts table, as shown
in Listing 11. The script sets up the users database on the compute nodes
and will be executed after the nodes are booted.
Listing 11. The postscripts table
xcat1 # chtab node=compute postscripts.postscripts=setup_users
xcat1 # tabdump postscripts
#node,postscripts,postbootscripts,comments,disable
"xcatdefaults","syslog,remoteshell,syncfiles","otherpkgs",,
"service","servicenode",,,
"compute","setup_users",,,
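The setup_users script itself comes with the download; as an illustration of what such a postscript does, a minimal sketch might simply recreate the job-submission user on each node (the account details match the user created in Listing 4; the provided script may do more):

#!/bin/sh
# setup_users: xCAT postscript run on each compute node after boot.
# Recreate the TORQUE job-submission user. No home directory is created
# here because /home is NFS-mounted from the management node.
grep -q '^gabriel:' /etc/passwd || \
    useradd -M -s /bin/bash -d /home/gabriel -u 101 -g users gabriel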
Configure DNS and DHCP
Generate the DNS and DHCP configuration using the xCAT makedns and makedhcp commands. Because the makedns command requires that the management node be part of only one domain, temporarily modify the /etc/hosts file, then run makedns. Restore the /etc/hosts file, as shown in Listing 12. In addition, make sure the /install directory is NFS-exported, as shown in the listing.
Listing 12. The DNS, DHCP and NFS configurations
xcat1 # diff /etc/hosts /etc/hosts.orig
7,8c7,8
< #192.168.17.201 xcat1.extnet
< #192.168.17.202 xcat1i.extnet
---
> 192.168.17.201 xcat1.extnet
> 192.168.17.202 xcat1i.extnet
xcat1 # makedns
xcat1 # service dhcpd stop
xcat1 # rm /var/lib/dhcpd/dhcpd.leases
xcat1 # makedhcp -n
xcat1 # makedhcp -a
xcat1 # cp -p /etc/hosts.orig /etc/hosts
xcat1 # host xcat2
xcat2.cluster has address 192.168.112.102
xcat2.cluster mail is handled by 10 xcat2.cluster.
xcat1 # grep /install /etc/exports
/install *(rw,no_root_squash,sync)
xcat1 # showmount -e | grep /install
/install *
Set up TORQUE
You can install TORQUE either from binary RPM packages or by building it from source. In this article, the management node runs CentOS V5, and the compute nodes run CentOS V5 or Fedora V10. Because the binary packages for CentOS V5 and Fedora V10 are for different, incompatible versions of TORQUE, you'll need to do a bit of extra work to build TORQUE from source on Fedora V10.
Components of TORQUE
The TORQUE system includes three daemon processes: the server, the scheduler, and the job executor. The server process, called pbs_server, creates and manages jobs and queues and dispatches jobs for execution to compute nodes, where the job executor — called pbs_mom — starts the jobs. To determine when and where to dispatch the jobs, in each scheduling cycle the server examines the pending jobs eligible for execution and communicates with the scheduler process, pbs_sched, to find out which jobs are to be dispatched to which compute nodes.
In this article, the management node xcat1 runs all the TORQUE daemons, while the compute nodes run only the job executor daemon.
Install TORQUE on the management node
The Extra Packages for Enterprise Linux (EPEL) project provides the TORQUE packages. To install the EPEL TORQUE packages, first install the EPEL repository configuration file epel.repo, as shown in Listing 13.
Listing 13. Install the EPEL repository configuration
xcat1 # rpm -Uvh \
  http://download.fedora.redhat.com/pub/epel/5/i386/epel-release-5-3.noarch.rpm
xcat1 # rpm -qf /etc/yum.repos.d/epel.repo
epel-release-5-3
Now, install the TORQUE packages using yum. Before running yum, change the file /etc/yum.conf to set keepcache=1 so the TORQUE RPM files are kept in the yum cache repository after installation. (You will need to add some of these packages to the root image later.) Then run the commands shown in Listing 14.
Listing 14. Install the TORQUE packages
xcat1 # yum -q install torque
xcat1 # yum -q install torque-server
xcat1 # yum -q install torque-scheduler
xcat1 # yum -q install torque-mom
xcat1 # yum -q install torque-pam.x86_64
xcat1 # yum -q install torque-client
xcat1 # yum -q install torque-docs
Verify that the TORQUE RPM files are under /var/cache/yum/epel/packages, then change the file /etc/yum.conf to set keepcache back to 0.
Set up TORQUE on the management node
Set the TORQUE server name, then install the TORQUE startup script included in xCAT. Next, configure and start the TORQUE services, as shown in Listing 15.
Listing 15. Configure the TORQUE services
xcat1 # more /var/torque/server_name
xcat1
xcat1 # cp -p /opt/xcat/share/xcat/netboot/add-on/torque/pbs /etc/init.d
xcat1 # chkconfig --level 345 pbs_server on
xcat1 # chkconfig --level 345 pbs_sched on
xcat1 # chkconfig --level 345 pbs_mom on
xcat1 # service pbs start
Starting TORQUE Mom:                                       [  OK  ]
Starting TORQUE Scheduler:                                 [  OK  ]
Starting TORQUE Server:                                    [  OK  ]
The TORQUE system manager, qmgr, is useful for configuring the behavior of a TORQUE cluster, including queues, job scheduling, and execution nodes. Run setup_torque_config, which configures the cluster using qmgr, as shown in Listing 16.
Listing 16. Configuring the TORQUE server and queues
xcat1 # setup_torque_config
WARNING: this program will overwrite the Torque server configuration
Continue ? [y/N] y
# Initialize Torque server configuration
Shutting down TORQUE Server:                               [  OK  ]
Starting TORQUE Server:                                    [  OK  ]
# Set up Torque server
Max open servers: 4
set server scheduling = True
set server scheduler_iteration = 60
set server query_other_jobs = True
set server default_queue = dque
set server acl_hosts = xcat1
# Set up Torque queue
Max open servers: 4
create queue dque
set queue dque queue_type = Execution
set queue dque enabled = True
set queue dque started = True
# Set up Torque nodes
Max open servers: 4
create node xcat1
set node xcat1 properties = management
Max open servers: 4
create node xcat2
set node xcat2 properties = centos5
Max open servers: 4
create node xcat3
set node xcat3 properties = centos5
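You can verify the resulting configuration at any time with qmgr; for example:

xcat1 # qmgr -c "print server"
xcat1 # qmgr -c "list node xcat2"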
Generate the root images for diskless boot
This section describes how to generate the root image that will be used to provision the servers. As shown in Figure 3, for each supported operating system, the image is built using a repository containing the base packages and another repository containing extra packages, such as those of the TORQUE distribution.
Figure 3. Creating the root image
To create the root image for diskless provisioning:
- Unpack the ISO image of the operating system distribution using the xCAT copycds command.
- Generate a stateless root image for net-booting the operating system from the management node using the xCAT genimage command.
- Optionally, modify some files in the stateless image to tune the execution environment. For example, change the /etc/fstab file to mount additional file systems.
- Pack the stateless root image into a compressed file that the net-booted nodes fetch, using the xCAT packimage command.
The next subsections describe these steps in detail for the CentOS V5 and Fedora V10 operating systems.
Unpack the ISO images
You create the repository for the operating system distribution from a DVD — or from an ISO image that contains the distribution — using the xCAT copycds tool. To create the local repositories for the CentOS V5.4 and Fedora V10 distributions, get the ISO images and run the copycds command, as shown in Listing 17.
Listing 17. Creating the CentOS and Fedora repository
xcat1 # iso=CentOS-5.4-x86_64-bin-DVD.iso
xcat1 # repos=centos/5.4/isos
xcat1 # wget http://mirrors.se.kernel.org/$repos/$iso
xcat1 # copycds $iso
Copying media to /install/centos5/x86_64/
Media copy operation successful
xcat1 # iso=Fedora-10-x86_64-DVD.iso
xcat1 # repos=pub/archive/fedora/linux/releases/10/Fedora
xcat1 # wget http://archives.fedoraproject.org/$repos/x86_64/iso/$iso
xcat1 # copycds $iso
Copying media to /install/fedora10/x86_64/
Media copy operation successful
At this point, the CentOS and Fedora repositories are in the directories /install/centos5/x86_64 and /install/fedora10/x86_64, respectively; the osimage and linuximage tables are updated with the image names for diskful provisioning. The copycds tool builds these paths from the site table's installdir key — whose value is /install — and from the operating system and architecture attributes defined by the .discinfo file included in the ISO image.
Add CentOS packages
Add the parted package and the TORQUE client and execution daemon packages as follows:
- Add the parted package to the package list file compute.pkglist.
- Make the directory in which genimage looks for additional packages, and copy the TORQUE packages to this directory.
- Create the yum repository metadata with the createrepo command.
- Create the package list file compute.otherpkgs.pkglist for additional packages to tell genimage what packages to install.
These steps are shown in Listing 18.
Listing 18. Add packages to CentOS
xcat1 # diff /opt/xcat/share/xcat/netboot/centos/compute.pkglist \
  /opt/xcat/share/xcat/netboot/centos/compute.pkglist.orig
11,13d10
< vim-minimal
< rpm
< yum
xcat1 # mkdir -p /install/post/otherpkgs/centos5/x86_64
xcat1 # cp -p /var/cache/yum/epel/packages/*torque*.rpm \
  /install/post/otherpkgs/centos5/x86_64
xcat1 # createrepo /install/post/otherpkgs/centos5/x86_64
xcat1 # more /opt/xcat/share/xcat/netboot/centos/compute.otherpkgs.pkglist
torque-mom
Add Fedora packages
The procedure for adding Fedora packages is similar to that for CentOS packages, except that you must download the RPMs to be added because they are not in the yum cache repository. Complete the steps shown in Listing 19.
Listing 19. Add packages to Fedora
xcat1 # mkdir -p /install/post/otherpkgs/fedora10/x86_64
xcat1 # cd /install/post/otherpkgs/fedora10/x86_64
xcat1 # repos=pub/archive/fedora/linux/releases/10/Everything/x86_64/os/Packages
xcat1 # wget \
  http://archives.fedoraproject.org/$repos/busybox-anaconda-1.10.3-3.fc10.x86_64.rpm \
  http://archives.fedoraproject.org/$repos/torque-2.1.10-6.fc10.x86_64.rpm \
  http://archives.fedoraproject.org/$repos/libtorque-2.1.10-6.fc10.x86_64.rpm \
  http://archives.fedoraproject.org/$repos/torque-mom-2.1.10-6.fc10.x86_64.rpm
xcat1 # createrepo /install/post/otherpkgs/fedora10/x86_64
xcat1 # diff /opt/xcat/share/xcat/netboot/fedora/compute.pkglist \
  /opt/xcat/share/xcat/netboot/fedora/compute.pkglist.orig
13,14d12
< rpm
< yum
xcat1 # more /opt/xcat/share/xcat/netboot/fedora/compute.otherpkgs.pkglist
busybox-anaconda
torque-mom
Generate the root image
genimage xCAT command generates a stateless image that
will be used to provision the compute nodes. The command uses the
repository created with
copycds to generate the root image
for a node with one of the profiles that xCAT supports. For the compute
nodes, use the
compute profile defined in the
genimage command as shown in Listing
20 to create the root images for CentOS and Fedora under the
operating system-dependent root image directories
/install/netboot/fedora10/x86_64/compute, respectively. The command takes
three options: the operating system name, the processor architecture, and
Listing 20. Generating the root images
xcat1 # genimage -i eth0 -o centos5 -p compute
os: centos5
profile: compute
interface: eth0
Which network drivers will you need? (press enter if you're not sure)
[igb,e1000e,e1000,bnx2,tg3] e1000
[ ... ]
Complete!
xcat1 # genimage -i eth0 -o fedora10 -p compute
os: fedora10
profile: compute
interface: eth0
Which network drivers will you need? (press enter if you're not sure)
[igb,e1000e,e1000,bnx2,tg3] e1000
[ ... ]
Complete!
Tune the image
Add mounting of the home directory with NFS to /etc/fstab, and run gen_config_torque to add the TORQUE configuration files to the image, as shown in Listing 21.
Listing 21. Customizing the images
xcat1 # diff /install/netboot/centos5/x86_64/compute/rootimg/etc/fstab \
  /install/netboot/centos5/x86_64/compute/rootimg/etc/fstab.orig
5d4
< xcat1:/home  /home  nfs  defaults 0 0
xcat1 # diff /install/netboot/fedora10/x86_64/compute/rootimg/etc/fstab \
  /install/netboot/fedora10/x86_64/compute/rootimg/etc/fstab.orig
5d4
< xcat1:/home  /home  nfs  defaults 0 0
xcat1 # gen_config_torque /install/netboot/centos5/x86_64/compute/rootimg xcat1 x86_64
Setting up the pbs_mom configuration
Completed setting up the pbs_mom configuration
xcat1 # gen_config_torque /install/netboot/fedora10/x86_64/compute/rootimg xcat1 x86_64
Setting up the pbs_mom configuration
Completed setting up the pbs_mom configuration
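What gen_config_torque adds boils down to the pbs_mom configuration under the TORQUE home directory in the image. Assuming the same /var/torque layout used on the management node, the key files would look something like this (the exact contents the tool writes may differ):

xcat1 # rimg=/install/netboot/centos5/x86_64/compute/rootimg
xcat1 # more $rimg/var/torque/server_name
xcat1
xcat1 # more $rimg/var/torque/mom_priv/config
$pbsserver xcat1
$logevent 255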
Pack the images and boot the nodes
Run the xCAT packimage command to convert each root image into a compressed file, as shown in Listing 22. Doing so creates the file rootimg.gz under the root image directory of each operating system and updates the xCAT tables with the names of the newly created images.
Listing 22. Packing the root images
xcat1 # packimage -o centos5 -p compute -a x86_64
Packing contents of /install/netboot/centos5/x86_64/compute/rootimg
xcat1 # ls /install/netboot/centos5/x86_64/compute/rootimg.gz
/install/netboot/centos5/x86_64/compute/rootimg.gz
xcat1 # packimage -o fedora10 -a x86_64 -p compute
Packing contents of /install/netboot/fedora10/x86_64/compute/rootimg
xcat1 # ls /install/netboot/fedora10/x86_64/compute/rootimg.gz
/install/netboot/fedora10/x86_64/compute/rootimg.gz
Boot the compute nodes
Boot the nodes in the compute group and check their status, as shown in Listing 23, where the nodestat command reports the status of the nodes.
Listing 23. Booting the nodes
xcat1 # nodeset compute netboot
xcat2: netboot centos5-x86_64-compute
xcat3: netboot centos5-x86_64-compute
xcat1 # rpower compute boot
xcat3: on reset
xcat2: on reset
xcat1 # nodestat compute
xcat2: ping netboot centos5-x86_64-compute
xcat3: ping netboot centos5-x86_64-compute
xcat1 # nodestat compute
xcat2: pbs,sshd
xcat3: pbs,sshd
Add the correct version of pbs_mom to the Fedora image
The TORQUE job execution daemon pbs_mom, running on the compute nodes, communicates with the TORQUE server daemon, pbs_server, running on the management node. Unfortunately, V2.3.10 of the server, installed from the EPEL repository, is not compatible with V2.1.10 of pbs_mom, installed from the Fedora V10 archive, and V2.3.10 of the binary RPM is not available for Fedora V10. In this subsection, you build V2.3.10 from source on a compute node booted with Fedora V10 and add it to the Fedora V10 root image on the management node.
To build pbs_mom V2.3.10 on Fedora 10 from the source code, use the xCAT postscript mechanism and the provided tool install_torque-mom, as shown in Listing 24. Once pbs_mom is built, repack the Fedora 10 image and remove the postscript.
Listing 24. Building pbs_mom on Fedora and adding it to the image
xcat1 # chtab node=xcat3 postscripts.postscripts=install_torque-mom
xcat1 # chtab node=xcat3 nodetype.os=fedora10
xcat1 # nodeset xcat3 netboot
xcat3: netboot fedora10-x86_64-compute
xcat1 # rpower xcat3 boot
xcat3: on reset
xcat1 # nodestat xcat3
xcat3: pbs,sshd
xcat1 # chtab node=xcat3 postscripts.postscripts=
xcat1 # tabdump postscripts | grep xcat3
xcat1 # packimage -o fedora10 -p compute -a x86_64
Packing contents of /install/netboot/fedora10/x86_64/compute/rootimg
xcat1 # rpower xcat3 boot
xcat3: on reset
xcat1 # ssh xcat3 '/usr/sbin/pbs_mom --version'
version: 2.3.10
xcat1 # nodestat xcat3
xcat3: pbs,sshd
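The essence of install_torque-mom is to fetch the TORQUE V2.3.10 source, build it on the Fedora node, and install the result into the image tree on the management node through the NFS-exported /install directory, so that the subsequent packimage picks it up. The following sketch shows the idea; the download URL and configure options are assumptions, and the script shipped in the download may differ:

#!/bin/sh
# install_torque-mom: xCAT postscript that builds pbs_mom V2.3.10
# from source on a Fedora 10 compute node.
cd /tmp
wget http://www.clusterresources.com/downloads/torque/torque-2.3.10.tar.gz
tar xzf torque-2.3.10.tar.gz
cd torque-2.3.10
./configure --prefix=/usr --with-server-home=/var/torque
make
# Install into the Fedora 10 root image on the management node, which
# is reachable through the NFS-exported /install directory.
mkdir -p /mnt/install
mount xcat1:/install /mnt/install
make DESTDIR=/mnt/install/netboot/fedora10/x86_64/compute/rootimg install
umount /mnt/install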
Now, all compute nodes are running TORQUE V2.3.10.
Run jobs with TORQUE
At this point, the cluster is ready to run jobs. A job that requires a certain operating system via the job's nodes attribute will run only on nodes that have that operating system.
Set the size of the compute nodes
First, set the size of the compute nodes. When the compute nodes are up, configure TORQUE with the current number of processors or cores for each compute node, as shown in Listing 25, where the pbsnodes command shows the configured execution nodes.
Listing 25. Setting the size of the compute nodes
xcat1 # nodels compute
xcat2
xcat3
xcat1 # nodes=$(nodels compute)
xcat1 # for node in $nodes; \
do \
  np=$(psh $node cat /proc/cpuinfo | grep processor | wc -l); \
  qmgr -c "set node $node np=$np"; \
done
xcat1 # pbsnodes | grep -A3 "^xcat[23]$" | grep -v "\-"
xcat2
     state = free
     np = 2
     properties = centos5
xcat3
     state = free
     np = 2
     properties = centos5
Verify that job scheduling and execution works by creating a job script uname.pbs, submitting it, and examining the job output, as shown in Listing 26. Job submission is performed as a non-privileged user.
Listing 26. Submitting jobs to TORQUE
xcat1 $ id -un
gabriel
xcat1 $ cd ~/TORQUE/jobs
xcat1 $ more uname.pbs
rel=/etc/redhat-release
if [ -f $rel ]; then
    cat $rel
fi
xcat1 $ qsub -l nodes=centos5 uname.pbs
1.xcat1.cluster
xcat1 $ more uname.pbs.o1
CentOS release 5.4 (Final)
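The uname.pbs script is deliberately minimal, and the execution environment is requested on the qsub command line. You can instead embed the resource request and other job attributes in the script itself using #PBS directives; for example:

#!/bin/sh
#PBS -N uname-test
#PBS -l nodes=centos5
#PBS -l walltime=00:05:00
#PBS -j oe
# Print the distribution release of the node the job runs on
rel=/etc/redhat-release
if [ -f $rel ]; then
    cat $rel
fi

With the nodes requirement embedded this way, the job can be submitted with a plain qsub uname.pbs.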
Dynamic provisioning
The previous section showed that TORQUE runs a job requiring the CentOS operating system when there are nodes running CentOS. However, if a job requires an operating system that none of the nodes currently runs, the job won't run. Dynamic provisioning ensures the creation of nodes with the operating system that the jobs require.
The following mechanism is used to manage the execution environments:
- The list of available execution environments includes the names of every operating system for which a network-booting image is installed on the management node under /install/netboot, which in this article are centos5 and fedora10.
- Every job indicates its required execution environment using the job's nodes attribute, as in qsub -l nodes=centos5 job.pbs.
- Every node declares the execution environment currently installed by way of the properties attribute of the node.
- TORQUE schedules a job to a node whose properties attribute matches the value of the nodes attribute of the job; the provisioning agent makes sure that such a node exists.
The properties attribute of a node is initialized when the node is added to TORQUE and is updated by the provisioning agent when it re-provisions the node.
Because TORQUE only allows you to submit a job if its nodes attribute matches the properties attribute of some execution node, create a dummy node for each execution environment that the management node supports. TORQUE also requires that all node names resolve to an IP address, so use the IPMIs of the nodes as dummy execution nodes, as shown in Listing 27.
Listing 27. Dummy nodes
xcat1 # qmgr -c "create node xcat2i" xcat1 # qmgr -c "set node xcat2i properties=centos5" xcat1 # qmgr -c "create node xcat3i" xcat1 # qmgr -c "set node xcat3i properties=fedora10" xcat1 # pbsnodes | grep -A3 "^xcati" | grep -v "\-" xcat2i state = down np = 1 properties = centos5 xcat3i state = down np = 1 properties = fedora10
The provisioning agent
The provisioning agent performs the following actions:
1. Get the list C of compute nodes from xCAT, and for each node n in C, set the time T(n) when n is eligible for provisioning to the current time. At any time t, a node n is eligible for provisioning if t > T(n).
2. Get the list of jobs from TORQUE and their required operating systems.
3. Determine the list L of compute nodes that need to be provisioned with another operating system in order to satisfy the requirements of a pending job. Only nodes that are eligible for provisioning can be present in L.
4. Configure each node n in L as follows: set the operating system OS(n) with which n is to be provisioned, and remove n from C.
5. For each node n in the list L, carry out the re-provisioning as follows (a minimal sketch of this loop appears after the list):
   - Instruct TORQUE to stop scheduling jobs to n.
   - When n becomes free of jobs, instruct xCAT to provision n with OS(n).
   - When xCAT reports that n is booted with the new operating system, instruct TORQUE to enable scheduling of jobs to n, set T(n) to the current time plus a quiescent time Q, and move n from L to C.
6. Return to step 2.
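The following bash sketch shows the shape of this loop in terms of the TORQUE and xCAT commands used throughout this article. It is a simplification, not the actual prov_agent: it ignores the eligibility times T(n) and the quiescent time Q, does not wait for running jobs to drain, and simply re-provisions the first free node:

#!/bin/sh
# Simplified provisioning-agent cycle (illustration only)
while true; do
  # Step 2: operating systems required by the queued jobs
  needed=$(qstat -f 2>/dev/null | \
           awk -F' = ' '/Resource_List.nodes/ {print $2}' | sort -u)
  for os in $needed; do
    # Nothing to do if a compute node already offers this environment
    pbsnodes $(nodels compute) | grep -q "properties = $os" && continue
    # Step 3: pick a candidate node (here: the first free one)
    node=$(pbsnodes $(nodels compute) | \
           awk '/^xcat/ {n=$1} /state = free/ {print n; exit}')
    [ -z "$node" ] && continue
    # Step 5: drain, re-provision, and re-enable the node
    pbsnodes -o $node                  # stop scheduling jobs to the node
    chtab node=$node nodetype.os=$os   # set OS(n) in the xCAT database
    nodeset $node netboot
    rpower $node boot
    until nodestat $node | grep -q sshd; do sleep 10; done
    qmgr -c "set node $node properties = $os"
    pbsnodes -c $node                  # allow scheduling jobs again
  done
  sleep 60
done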
The provisioning agent is implemented by the Perl program prov_agent.pm, which is available in the Download section. Start the provisioning agent by running this program as the root user.
Simple use case of the agent
Try out dynamic provisioning by submitting the job script shown in Listing 26 while indicating the execution environment fedora10. The change in the node configuration as a result of this job is shown in Listing 28.
Listing 28. Re-provisioning example
xcat1 $ id -un
gabriel
xcat1 $ pbsnodes | grep -A3 "^xcat[23]$" | grep -v "\-"
xcat2
     state = free
     np = 2
     properties = centos5
xcat3
     state = free
     np = 2
     properties = centos5
xcat1 $ cd ~/TORQUE/jobs
xcat1 $ qsub -l nodes=fedora10 uname.pbs
7.xcat1.cluster
xcat1 $ pbsnodes | grep -A3 "^xcat[23]$" | grep -v "\-"
xcat2
     state = free
     np = 2
     properties = fedora10
xcat3
     state = free
     np = 2
     properties = centos5
xcat1 $ more uname.pbs.o7
Fedora release 10 (Cambridge)
While the provisioning agent uses simple rules for deciding when to re-provision a node and which node to select for re-provisioning, it illustrates the powerful idea of dynamically changing the "personality" of the compute nodes.
Conclusion
Combining xCAT with TORQUE and a provisioning agent in the management of a cluster creates an adaptive infrastructure in which the compute nodes are dynamically provisioned based on the requirements of the jobs. This adaptive infrastructure enables you to create a private or public cloud service using open source software.
Even though this article has described combining xCAT with the TORQUE workload and resource manager, you can combine xCAT with other open source or commercial workload and resource managers. We chose TORQUE in this article for its wide availability in Linux distributions and its ease of configuration.
Related topics
- Explore the xCAT Project.
- Get a summary of xCAT features and supported operating systems.
- Learn about the advanced features of xCAT in the xCAT 2 Linux Advanced Cookbook.