Dynamic server provisioning with xCAT and TORQUE

This article describes a solution for building a dynamically provisioned high-performance computing (HPC) cluster system using the Extreme Cloud Administration Toolkit (xCAT) and Terascale Open-source Resource and QUEue Manager (TORQUE) open source packages. xCAT is a leading solution for dynamically provisioning compute, storage, and network resources. TORQUE is a workload and resource-management system that manages batch jobs and compute nodes and schedules the execution of those jobs.

We build a cluster where nodes are provisioned with xCAT and on which batch jobs are managed and executed by TORQUE. On top of xCAT and TORQUE, we build a provisioning agent that makes the cluster adaptive, meaning that the compute nodes of the cluster are dynamically provisioned with the execution environment that the jobs require.

Architecture of the adaptive cluster

The architecture of the dynamic cluster we are building is shown in Figure 1, where the xCAT cluster consists of a management node and several compute nodes. The compute nodes are provisioned by the xCAT server running on the management node. The management node also runs the TORQUE server and scheduler daemons, as well as several services required to manage the compute nodes with xCAT, including DNS, DHCP, TFTP, and NFS.

The compute nodes run jobs that the TORQUE server dispatches and the TORQUE job-execution daemon running on each compute node starts. The provisioning agent examines the workload and the node configuration, and decides which nodes need to be provisioned to provide the execution environment that the jobs require.

Figure 1. The adaptive cluster
Image shows architecture of the dynamic cluster

For a small cluster, a single management node can provide the bandwidth required to provision all the compute nodes. For larger clusters, a hierarchical approach is needed, with the management node being connected to two or more service nodes and the compute nodes being provisioned from the service nodes.

For the purpose of this article, consider a small cluster that has one management node —xcat1— and two compute nodes —xcat2 and xcat3— that are connected to xcat1 via the Ethernet switch xcat-switch, as shown in Figure 2. The servers used each have a dual-processor Intel® Xeon® x86_64 architecture, 2 GB of memory, 73 GB of disk capacity, and Ethernet interfaces that support Preboot eXecution Environment (PXE) booting. The management node runs CentOS release 5.4.

Figure 2. Cluster components and networking
Image shows cluster components and networking

We use the diskless method of provisioning the compute nodes, whereby the nodes boot over the network from the management node using PXE.

Configure the management node

Before installing xCAT, configure the management node so xCAT is installed correctly and fetches the correct information about the cluster. This section shows the configuration actions you perform on the management node before installing xCAT.

Set up networking and host definitions

The management node xcat1 is connected to the public network 192.168.17.0 (called extnet) and to the cluster network 192.168.112.0 (called cluster). We use static IP addresses for both interfaces: The public network interface eth0 has the IP address 192.168.17.201; the cluster network interface eth1 has the IP address 192.168.112.1. Listing 1 shows the configured network interfaces, where virbr0 is useful for virtualization, but not for the setup discussed in this article.

Listing 1. Network interfaces
xcat1 # ifconfig -a | egrep -A1 '^[a-z]' | grep -v "\--"
eth0      Link encap:Ethernet  HWaddr 00:11:43:DF:0E:A8
          inet addr:192.168.17.201  Bcast:192.168.17.255  Mask:255.255.255.0
eth1      Link encap:Ethernet  HWaddr 00:11:43:DF:0E:A9
          inet addr:192.168.112.1  Bcast:192.168.112.255  Mask:255.255.255.0
lo        Link encap:Local Loopback
          inet addr:127.0.0.1  Mask:255.0.0.0
sit0      Link encap:IPv6-in-IPv4
          NOARP  MTU:1480  Metric:1
virbr0    Link encap:Ethernet  HWaddr 00:00:00:00:00:00
          inet addr:192.168.122.1  Bcast:192.168.122.255  Mask:255.255.255.0
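
If the cluster-facing interface is not yet configured, a minimal interface definition might look like the following sketch, assuming the standard CentOS network-scripts layout; adapt the device name and addresses to your hardware:

  # /etc/sysconfig/network-scripts/ifcfg-eth1 -- cluster network interface
  DEVICE=eth1
  BOOTPROTO=static
  IPADDR=192.168.112.1
  NETMASK=255.255.255.0
  ONBOOT=yes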

The file /etc/sysconfig/network, which defines the host name, and the file /etc/hosts, which defines the local host lookup table, are shown in Listing 2, where xcat2i and xcat3i are the Intelligent Platform Management Interface (IPMI) addresses of the compute nodes xcat2 and xcat3, respectively. xCAT uses the IPMI to power-cycle and boot the compute nodes. Check the local host's short and full names using the hostname command, as shown in the listing.

Listing 2. Host definitions
xcat1 # more /etc/sysconfig/network
NETWORKING=yes
NETWORKING_IPV6=no
HOSTNAME=xcat1

xcat1 # more /etc/hosts
127.0.0.1      localhost.localdomain localhost
::1            localhost6.localdomain6 localhost6
192.168.17.201 xcat1.extnet
192.168.17.202 xcat1i.extnet
192.168.112.1  xcat1.cluster xcat1
192.168.112.100 xcat-switch
192.168.112.102 xcat2
192.168.112.103 xcat3
192.168.112.202 xcat2i
192.168.112.203 xcat3i

xcat1 # hostname -s
xcat1

xcat1 # hostname -f
xcat1.cluster

Set up DNS

The resolver configuration file /etc/resolv.conf, shown in Listing 3, defines the management node 192.168.112.1 (xcat1) as the primary name server and an external server as the secondary server. (We set up the name server 192.168.112.1 using xCAT in the "Configure DNS and DHCP" section.) The default setup of the named service on CentOS V5.4 uses bind-chroot. Because xCAT expects the named service not to use chroot, remove the package bind-chroot, as shown in the listing.

Listing 3. DNS setup
   xcat1 # more /etc/resolv.conf
   search cluster extnet
   nameserver 192.168.112.1
   nameserver 130.236.101.9

   xcat1 # rpm -q bind-chroot
   bind-chroot-9.3.6-4.P1.el5_4.2

   xcat1 # rpm -e bind-chroot

Other settings

Disable the Security-Enhanced Linux® (SELinux) functionality. Additionally, if the tftp-server package is installed, remove it, because xCAT requires the atftp package, which conflicts with tftp-server. To run TORQUE jobs, create a regular user, then NFS-export that user's home directory. Listing 4 shows how to perform these actions.

Listing 4. Additional settings
   xcat1 # rpm -q tftp-server
   tftp-server-0.49-2.el5.centos
   xcat1 # rpm -e tftp-server
   
   xcat1 # grep SELINUX= /etc/sysconfig/selinux | grep -v ^#
   SELINUX=disabled
   
   xcat1 # useradd  -m -s /bin/bash -d /home/gabriel -u 101 -g users gabriel

   xcat1 # grep /home /etc/exports
   /home *(rw,no_root_squash,sync)
   
   xcat1 # exportfs -a
   xcat1 # showmount -e | grep /home
   /home     *
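
If SELinux is currently enabled, one possible way to disable it is sketched below; the change to the configuration file takes effect at the next reboot, and setenforce 0 switches the running system to permissive mode in the meantime:

  xcat1 # sed -i 's/^SELINUX=.*/SELINUX=disabled/' /etc/sysconfig/selinux
  xcat1 # setenforce 0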

Install and configure xCAT

To make installation easy, xCAT provides RPMs for the common Linux distributions. Because xCAT is a database-driven package, you configure its actions by setting the tables in the xCAT database.

Install xCAT with yum

To install xCAT, install the yum configuration files for the xCAT repository, then run yum to install the xCAT packages and their dependencies, as shown in Listing 5. Two repositories are configured: the core repository and the platform-specific repository.

Listing 5. Install xCAT with yum
  xcat1 # cd /etc/yum.repos.d
  xcat1 # wget http://xcat.sourceforge.net/yum/xcat-core/xCAT-core.repo
  xcat1 # wget http://xcat.sourceforge.net/yum/xcat-dep/rh5/x86_64/xCAT-dep.repo
  
  xcat1 # yum clean metadata
  xcat1 # yum install xCAT.x86_64
  [...]
  Complete!
  
  xcat1 # source  /etc/profile.d/xcat.sh

When the installation is complete, the xCAT and the TFTP services should be configured and running. Verify that by running the commands shown in Listing 6.

Listing 6. The xCAT service
  xcat1 # chkconfig --list xcatd
  xcatd           0:off  1:off  2:off  3:on   4:on   5:on   6:off

  xcat1 # service xcatd status
  xCAT service is running

  xcat1 # chkconfig --list tftpd
  tftpd           0:off   1:off   2:off   3:on    4:on    5:on    6:off

  xcat1 # service tftpd status
  atftpd service is running

Install the downloaded tools

Extract the ZIP file you downloaded and run the included install.sh script, which installs the following programs referred to in the remainder of this article:

  • The image configuration tools gen_config_torque (installed in /opt/provisioner/bin) and install_torque-mom (installed in /install/postscripts), which you need to tune the image for TORQUE
  • The setup_torque_config script used to set up the TORQUE queue and execution nodes (installed in /opt/provisioner/bin)
  • The setup_users script you'll run as an xCAT postscript (installed in /install/postscripts)
  • The provisioning agent prov_agent, which performs dynamic provisioning (installed in /opt/provisioner/bin)

Set up the xCAT database tables

xCAT stores the cluster configuration in a database (the default database type is SQLite). As part of the installation process, the tables in the xCAT database are created and initialized. To change them, use the tabedit or chtab command; to view them, use the tabdump command.

This section describes the important settings in the xCAT tables. The complete contents of the tables are available in the Download section.

In the site table, set the dhcpinterfaces attribute to eth1 and make sure the master, domain, nameservers, and forwarders attributes are set as shown in Listing 7.

Listing 7. The site table
xcat1 # chtab key=dhcpinterfaces  site.value=eth1

xcat1 # tabdump site | egrep "dhcp|domain|master|server|forward" | sort
"dhcpinterfaces","eth1",,
"domain","cluster",,
"forwarders","130.236.101.9",,
"master","192.168.112.1",,
"nameservers","192.168.112.1",,

Define the compute node group in the noderes table and the compute profile (a profile defines the packages included in a root image) in the nodetype table, as shown in Listing 8.

Listing 8. The noderes and nodetype tables
  xcat1 # chtab node="compute"                     \
                  noderes.netboot=pxe              \
                  noderes.tftpserver=192.168.112.1 \
                  noderes.nfsserver=192.168.112.1  \
                  noderes.installnic=eth0          \
                  noderes.primarynic=eth0          \
                  noderes.discoverynics=eth0
                
  xcat1 # chtab node=compute               \
                  nodetype.os=centos5      \
                  nodetype.arch=x86_64     \
                  nodetype.profile=compute \
                  nodetype.nodetype=osi
                  
  xcat1 # chtab node=xcat2 nodetype.os=centos5
  xcat1 # chtab node=xcat3 nodetype.os=centos5

Define the MAC addresses of the nodes in the mac table, then define the nodes in the nodelist table. Configure the hardware management of the nodes in the nodehm table, as shown in Listing 9.

Listing 9. The mac, nodelist, and nodehm tables
  xcat1 # chtab node=xcat2  mac.interface=eth1 mac.mac="00:11:43:df:0f:09"
  xcat1 # chtab node=xcat2i mac.interface=eth1 mac.mac="00:11:43:df:0f:0b"
  xcat1 # chtab node=xcat-switch mac.interface=eth1 mac.mac="00:10:83:8d:5a:42"
  xcat1 # chtab node=xcat3  mac.interface=eth1 mac.mac="00:11:43:df:0d:1d"
  xcat1 # chtab node=xcat3i mac.interface=eth1 mac.mac="00:11:43:df:0d:1f"

  xcat1 # chtab node=xcat2       nodelist.groups="compute,all"
  xcat1 # chtab node=xcat3       nodelist.groups="compute,all"
  xcat1 # chtab node=xcat2i      nodelist.groups="bmc,all"
  xcat1 # chtab node=xcat3i      nodelist.groups="bmc,all"
  xcat1 # chtab node=xcat-switch nodelist.groups="switch,all"

  xcat1 # chtab node=xcat2 nodehm.power=ipmi nodehm.mgt=ipmi
  xcat1 # chtab node=xcat3 nodehm.power=ipmi nodehm.mgt=ipmi

In the networks table, set the netname, gateway, and dhcpserver attributes for the networks 192.168.112.0 and 192.168.17.0, as shown in Listing 10, and remove the row for the 192.168.122.0 network.

Listing 10. The networks table
  xcat1 # chtab net=192.168.112.0  networks.netname=cluster
  xcat1 # chtab net=192.168.112.0  networks.gateway=192.168.112.1
  xcat1 # chtab net=192.168.112.0  networks.dhcpserver=192.168.112.1
  
  xcat1 # chtab net=192.168.17.0   networks.netname=extnet
  xcat1 # chtab net=192.168.17.0   networks.gateway=192.168.17.1
  xcat1 # chtab net=192.168.17.0   networks.tftpserver=""

  xcat1 # tabdump networks | grep -v "192.168.122.0" > networks.csv
  xcat1 # tabrestore networks.csv

Add the setup_users script to the postscripts table, as shown in Listing 11. The script sets up the users database on the compute nodes and will be executed after the nodes are booted.

Listing 11. The postscripts table
  xcat1 # chtab node=compute postscripts.postscripts=setup_users

  xcat1 # tabdump postscripts
  #node,postscripts,postbootscripts,comments,disable
  "xcatdefaults","syslog,remoteshell,syncfiles","otherpkgs",,
  "service","servicenode",,,
  "compute","setup_users",,,

Configure DNS and DHCP

Generate the DNS and DHCP configuration using the xCAT makedns and makedhcp commands. Because the makedns command requires that the management node be part of only one domain, temporarily modify the /etc/hosts file, run makedns, and then restore the /etc/hosts file, as shown in Listing 12. In addition, make sure the /install directory is NFS-exported, as shown in the listing.

Listing 12. The DNS, DHCP and NFS configurations
  xcat1 # diff /etc/hosts /etc/hosts.orig
  7,8c7,8
  < #192.168.17.201 xcat1.extnet
  < #192.168.17.202 xcat1i.extnet
  ---
  > 192.168.17.201 xcat1.extnet
  > 192.168.17.202 xcat1i.extnet
 
  xcat1 # makedns
  
  xcat1 # service dhcpd stop
  xcat1 # rm /var/lib/dhcpd/dhcpd.leases
  xcat1 # makedhcp -n
  xcat1 # makedhcp -a
  
  xcat1 # cp -p /etc/hosts.orig /etc/hosts
 
  xcat1 # host xcat2
  xcat2.cluster has address 192.168.112.102
  xcat2.cluster mail is handled by 10 xcat2.cluster.

  xcat1 # grep /install /etc/exports
  /install *(rw,no_root_squash,sync)
  
  xcat1 # showmount -e | grep /install
  /install  *

Set up TORQUE

You can install TORQUE either from binary RPM packages or by building it from source. In this article, the management node runs CentOS V5, and the compute nodes run CentOS V5 or Fedora V10. Because the binary packages for CentOS V5 and Fedora V10 are for different, incompatible versions of TORQUE, you'll need to do a bit of extra work to build TORQUE from source on Fedora 10.

Components of TORQUE

The TORQUE system includes three daemon processes: the server, the scheduler, and the job executor. The server process, called pbs_server, creates and manages jobs and queues and dispatches jobs for execution to compute nodes, where the job executor, called pbs_mom, starts the jobs. To determine when and where to dispatch the jobs, in each scheduling cycle the server examines the pending jobs eligible for execution and communicates with the scheduler daemon pbs_sched to find out which jobs are to be dispatched to which compute nodes.

In this article, the management node xcat1 runs all the TORQUE daemons, while the compute nodes run only the job executor daemon.
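
Once the daemons are running (see the setup steps in the following sections), you can confirm that the server and the execution nodes are visible with the standard TORQUE client commands, for example:

  xcat1 # qstat -B          # summary status of the pbs_server daemon
  xcat1 # pbsnodes -a       # list the configured execution nodes and their state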

Install TORQUE on the management node

The Extra Packages for Enterprise Linux (EPEL) project provides the TORQUE packages. To install the EPEL TORQUE packages, first install the EPEL repository configuration file epel.repo, as shown in Listing 13.

Listing 13. Install the EPEL repository configuration
  xcat1 # rpm -Uvh \
    http://download.fedora.redhat.com/pub/epel/5/i386/epel-release-5-3.noarch.rpm
    
  xcat1 # rpm -qf /etc/yum.repos.d/epel.repo
  epel-release-5-3

Now, install the TORQUE packages using yum. Before running yum, change the file /etc/yum.conf to set keepcache=1 so the TORQUE RPM files are kept in the yum cache repository after installation. (You will need to add the torque-mom and libtorque packages to the root image later.) Then run the commands shown in Listing 14.

Listing 14. Install the TORQUE packages
  xcat1 # yum  -q install torque
  xcat1 # yum  -q install torque-server
  xcat1 # yum  -q install torque-scheduler
  xcat1 # yum  -q install torque-mom
  xcat1 # yum  -q install torque-pam.x86_64
  xcat1 # yum  -q install torque-client
  xcat1 # yum  -q install torque-docs

Verify that the TORQUE RPM files are under /var/cache/yum/epel/packages, then change the file /etc/yum.conf to set keepcache=0.
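
One way to toggle the keepcache setting and confirm the cached packages is sketched below; this assumes a keepcache line already exists in /etc/yum.conf (otherwise, add one):

  xcat1 # sed -i 's/^keepcache=0/keepcache=1/' /etc/yum.conf    # before installing TORQUE
  xcat1 # ls /var/cache/yum/epel/packages/*torque*.rpm          # verify the cached RPM files
  xcat1 # sed -i 's/^keepcache=1/keepcache=0/' /etc/yum.conf    # restore the default afterward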

Set up TORQUE on the management node

Set the TORQUE server name, then install the TORQUE startup script included in xCAT. Next, configure and start the TORQUE services, as shown in Listing 15.

Listing 15. Configure the TORQUE services
  xcat1 # more /var/torque/server_name
  xcat1 
  
  xcat1 # cp -p /opt/xcat/share/xcat/netboot/add-on/torque/pbs  /etc/init.d
  
  xcat1 # chkconfig --level 345 pbs_server on
  xcat1 # chkconfig --level 345 pbs_sched on
  xcat1 # chkconfig --level 345 pbs_mom on

  xcat1 # service pbs start
  Starting TORQUE Mom:                           [  OK  ]
  Starting TORQUE Scheduler:                     [  OK  ]
  Starting TORQUE Server:                        [  OK  ]

The TORQUE system manager, qmgr, is useful for configuring the behavior of a TORQUE cluster, including queues, job scheduling, and execution nodes. Run setup_torque_config, which configures the cluster using qmgr, as shown in Listing 16.

Listing 16. Configuring the TORQUE server and queues
  xcat1 # setup_torque_config
  WARNING: this program will overwrite the Torque server configuration
  Continue ? [y/N] y

  # Initialize Torque server configuration
  Shutting down TORQUE Server:                   [  OK  ]
  Starting TORQUE Server:                        [  OK  ]

  # Set up Torque server
  Max open servers: 4
  set server scheduling = True
  set server scheduler_iteration = 60
  set server query_other_jobs    = True
  set server default_queue       = dque
  set server acl_hosts           = xcat1

  # Set up Torque queue
  Max open servers: 4
  create queue dque
  set queue  dque queue_type = Execution
  set queue  dque enabled    = True
  set queue  dque started    = True

  # Set up Torque nodes
  Max open servers: 4
  create node xcat1
  set node xcat1 properties = management
  Max open servers: 4
  create node xcat2
  set node xcat2 properties = centos5
  Max open servers: 4
  create node xcat3
  set node xcat3 properties = centos5

Generate the root images for diskless boot

This section describes how to generate the root image that will be used to provision the servers. As shown in Figure 3, for each supported operating system, the image is built using a repository containing the base packages and another repository containing extra packages, such as those of the TORQUE distribution.

Figure 3. Creating the root image
Image shows creating the root image
Image shows creating the root image

To create the root image for diskless provisioning:

  1. Unpack the ISO image of the operating system distribution using the xCAT copycds tool.
  2. Generate a stateless root image for net-booting the operating system from the management node using the xCAT genimage tool.
  3. Optionally, modify some files in the stateless image to tune the execution environment. For example, change the /etc/fstab file to mount additional file systems.
  4. Pack the stateless root image into a compressed file that the net-booted nodes fetch using the xCAT packimage tool.

The next subsections describe these steps in detail for the CentOS V5 and Fedora V10 operating systems.
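
As a preview, the condensed command sequence for CentOS V5 looks like this; all of these commands are explained, with their prompts and output, in the listings that follow:

  xcat1 # copycds CentOS-5.4-x86_64-bin-DVD.iso
  xcat1 # genimage -i eth0 -o centos5 -p compute
  ( tune files under /install/netboot/centos5/x86_64/compute/rootimg )
  xcat1 # packimage -o centos5 -p compute -a x86_64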

Unpack the ISO images

You create the repository for the operating system distribution from a DVD — or from an ISO image that contains the distribution — using the copycds tool. To create the local repository for the CentOS V5.4 and Fedora V10 distributions, get the ISO images, and run the copycds command, as shown in Listing 17.

Listing 17. Creating the CentOS and Fedora repository
  xcat1 # iso=CentOS-5.4-x86_64-bin-DVD.iso
  xcat1 # repos=centos/5.4/isos
  xcat1 # wget http://mirrors.se.kernel.org/$repos/$iso

  xcat1 # copycds $iso
  Copying media to /install/centos5/x86_64/
  Media copy operation successful

  
  xcat1 # iso=Fedora-10-x86_64-DVD.iso
  xcat1 # repos=pub/archive/fedora/linux/releases/10/Fedora
  xcat1 # wget http://archives.fedoraproject.org/$repos/x86_64/iso/$iso
    
  xcat1 # copycds $iso
  Copying media to /install/fedora10/x86_64/
  Media copy operation successful

At this point, the CentOS and Fedora repositories are in the directories /install/centos5/x86_64 and /install/fedora10/x86_64, respectively; the osimage and linuximage tables are updated with the image names for diskful provisioning. The copycds tool builds these paths from the site table's installdir key (whose value is /install) and from the operating system and architecture attributes defined by the .discinfo file included in the ISO image.
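
You can confirm that copycds updated the image tables by dumping them, for example (the exact rows depend on your xCAT version):

  xcat1 # tabdump osimage    | egrep "centos5|fedora10"
  xcat1 # tabdump linuximage | egrep "centos5|fedora10"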

Add CentOS packages

Add a few extra operating system packages and the TORQUE execution daemon package as follows:

  1. Add the extra packages (vim-minimal, rpm, and yum) to the package list file compute.pkglist.
  2. Make the directory in which genimage looks for additional packages and copy the TORQUE packages to this directory.
  3. Create the yum repository metadata with the createrepo command.
  4. Create the package list file compute.otherpkgs.pkglist for additional packages to tell genimage what packages to install.

These steps are shown in Listing 18.

Listing 18. Add packages to CentOS
  xcat1 # diff /opt/xcat/share/xcat/netboot/centos/compute.pkglist      \
               /opt/xcat/share/xcat/netboot/centos/compute.pkglist.orig
  11,13d10
  < vim-minimal
  < rpm
  < yum 

  xcat1 # mkdir -p /install/post/otherpkgs/centos5/x86_64
  
  xcat1 # cp -p  /var/cache/yum/epel/packages/*torque*.rpm   \
                 /install/post/otherpkgs/centos5/x86_64
     
  xcat1 # createrepo /install/post/otherpkgs/centos5/x86_64
  
  xcat1 # more /opt/xcat/share/xcat/netboot/centos/compute.otherpkgs.pkglist
  torque-mom

Add Fedora packages

The procedure for adding Fedora packages is similar to that for CentOS packages, except that you must download the RPMs to be added because they are not in the yum cache repository. Complete the steps shown in Listing 19.

Listing 19. Add packages to Fedora
  xcat1 # mkdir -p  /install/post/otherpkgs/fedora10/x86_64
  xcat1 # cd  /install/post/otherpkgs/fedora10/x86_64
  
  xcat1 # repos=pub/archive/fedora/linux/releases/10/Everything/x86_64/os/Packages

  xcat1 # wget \
   http://archives.fedoraproject.org/$repos/busybox-anaconda-1.10.3-3.fc10.x86_64.rpm \
   http://archives.fedoraproject.org/$repos/torque-2.1.10-6.fc10.x86_64.rpm \
   http://archives.fedoraproject.org/$repos/libtorque-2.1.10-6.fc10.x86_64.rpm \
   http://archives.fedoraproject.org/$repos/torque-mom-2.1.10-6.fc10.x86_64.rpm
 
  xcat1 # createrepo  /install/post/otherpkgs/fedora10/x86_64
  
  xcat1 # diff /opt/xcat/share/xcat/netboot/fedora/compute.pkglist \
               /opt/xcat/share/xcat/netboot/fedora/compute.pkglist.orig
  13,14d12
  < rpm
  < yum 
    
  xcat1 # more /opt/xcat/share/xcat/netboot/fedora/compute.otherpkgs.pkglist
  busybox-anaconda
  torque-mom

Generate the root image

The genimage xCAT command generates a stateless image that will be used to provision the compute nodes. The command uses the repository created with copycds to generate the root image for a node with one of the profiles that xCAT supports. For the compute nodes, use the compute profile defined in the nodetype table.

Run the genimage command as shown in Listing 20 to create the root images for CentOS and Fedora under the operating system-dependent root image directories /install/netboot/centos5/x86_64/compute and /install/netboot/fedora10/x86_64/compute, respectively. The command takes three options: the network interface (-i), the operating system name (-o), and the profile (-p).

Listing 20. Generating the root images
  xcat1 # genimage  -i eth0  -o centos5  -p compute
  os: centos5
  profile: compute
  interface: eth0
  Which network drivers will you need? (press enter if you're not sure)
  [igb,e1000e,e1000,bnx2,tg3] e1000
  [ ... ]
  Complete!
  
  xcat1 #  genimage  -i eth0    -o fedora10  -p compute
  os: fedora10
  profile: compute
  interface: eth0
  Which network drivers will you need? (press enter if you're not sure)
  [igb,e1000e,e1000,bnx2,tg3] e1000
  [ ... ]
  Complete!

Tune the image

Add an NFS mount of the home directory to /etc/fstab and run gen_config_torque to add the TORQUE configuration files to the image, as shown in Listing 21.

Listing 21. Customizing the images
  xcat1 # diff /install/netboot/centos5/x86_64/compute/rootimg/etc/fstab \
               /install/netboot/centos5/x86_64/compute/rootimg/etc/fstab.orig
  5d4
  < xcat1:/home /home nfs     defaults       0 0

  xcat1 # diff /install/netboot/fedora10/x86_64/compute/rootimg/etc/fstab \
               /install/netboot/fedora10/x86_64/compute/rootimg/etc/fstab.orig
  5d4
  < xcat1:/home /home nfs     defaults       0 0


  xcat1 # gen_config_torque /install/netboot/centos5/x86_64/compute/rootimg xcat1 x86_64
  Setting up the pbs_mom configuration
  Completed setting up the pbs_mom configuration

  xcat1 # gen_config_torque /install/netboot/fedora10/x86_64/compute/rootimg xcat1 x86_64
  Setting up the pbs_mom configuration
  Completed setting up the pbs_mom configuration

Pack the images and boot the nodes

Run packimage to convert each root image into a compressed file, as shown in Listing 22. Doing so creates the file rootimg.gz under the root image directory of each operating system and updates the tables osimage and linuximage with the names of the newly created images.

Listing 22. Packing the root images
  xcat1 # packimage -o centos5 -p compute -a x86_64
  Packing contents of /install/netboot/centos5/x86_64/compute/rootimg

  xcat1 # ls /install/netboot/centos5/x86_64/compute/rootimg.gz
  /install/netboot/centos5/x86_64/compute/rootimg.gz


  xcat1 # packimage -o fedora10  -a x86_64  -p compute
  Packing contents of /install/netboot/fedora10/x86_64/compute/rootimg

  xcat1 # ls /install/netboot/fedora10/x86_64/compute/rootimg.gz
  /install/netboot/fedora10/x86_64/compute/rootimg.gz

Boot the compute nodes

Boot the nodes in the compute group and check their status, as shown in Listing 23, where the nodeset command updates the nodetype, bootparams, and chain tables.

Listing 23. Booting the nodes
  xcat1 # nodeset compute netboot
  xcat2: netboot centos5-x86_64-compute
  xcat3: netboot centos5-x86_64-compute
  
  xcat1 # rpower compute boot
  xcat3: on reset
  xcat2: on reset  

  xcat1 # nodestat compute
  xcat2: ping netboot centos5-x86_64-compute
  xcat3: ping netboot centos5-x86_64-compute

  xcat1 # nodestat compute
  xcat2: pbs,sshd
  xcat3: pbs,sshd

Add the correct version of pbs_mom to the Fedora image

The TORQUE job execution daemon pbs_mom, running on the compute nodes, communicates with the TORQUE server daemon, pbs_server, running on the management node. Unfortunately, V2.3.10 of the server, installed from the EPEL repository, is not compatible with V2.1.10 of pbs_mom, installed from the Fedora V10 archive, and V2.3.10 of the binary RPM is not available for Fedora V10. In this subsection, you build V2.3.10 from source on a compute node booted with Fedora V10 and add it to the Fedora V10 root image on the management node.
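
One way to compare the two versions is to query the server's pbs_version attribute on the management node and ask pbs_mom for its version on a booted Fedora node, for example:

  xcat1 # qstat -B -f | grep pbs_version
  xcat1 # ssh xcat3 '/usr/sbin/pbs_mom --version'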

To build pbs_mom V2.3.10 on Fedora 10 from the source code, use the xCAT postscript mechanism and the provided tool install_torque-mom, as shown in Listing 24. Once pbs_mom is built, repack the Fedora 10 image and remove the postscript.

Listing 24. Building pbs_mom on Fedora and adding it to the image
    xcat1 # chtab node=xcat3 postscripts.postscripts=install_torque-mom
    xcat1 # chtab node=xcat3 nodetype.os=fedora10

    xcat1 # nodeset xcat3 netboot
    xcat3: netboot fedora10-x86_64-compute

    xcat1 # rpower xcat3 boot
    xcat3: on reset
       
    xcat1 # nodestat xcat3
    xcat3: pbs,sshd
    
    xcat1 # tabdump postscripts | grep xcat3
    
    xcat1 # packimage -o fedora10  -p compute -a x86_64
    Packing contents of /install/netboot/fedora10/x86_64/compute/rootimg

    xcat1 # rpower xcat3 boot
    xcat3: on reset
    
    xcat1 # ssh xcat3 '/usr/sbin/pbs_mom --version'
    version: 2.3.10

    xcat1 # nodestat xcat3
    xcat3: pbs,sshd

Now, all compute nodes are running TORQUE V2.3.10.

Run jobs with TORQUE

At this point, the cluster is ready to run jobs. A job that requires a certain operating system via the job nodes attribute will run only on nodes that have that operating system.

Set the size of the compute nodes

When the compute nodes are up, configure TORQUE with the number of processors (cores) of each compute node, as shown in Listing 25, where the pbsnodes command shows the configured execution nodes.

Listing 25. Setting the size of the compute nodes
  xcat1 # nodels compute
  xcat2
  xcat3
  
  xcat1 # nodes=$(nodels compute)
  
  xcat1 # for node in $nodes;                          \
  do                                                   \
       np=$(psh $node cat /proc/cpuinfo | grep processor | wc -l); \
       qmgr -c "set node $node np=$np";                \
  done
  
  xcat1 # pbsnodes | grep -A3 "^xcat[23]$" | grep -v "\-"
  xcat2
     state = free
     np = 2
     properties = centos5
  xcat3
     state = free
     np = 2
     properties = centos5

Run jobs

Verify that job scheduling and execution work by creating a job script uname.pbs, submitting it, and examining the job output, as shown in Listing 26. Submit the job as a non-privileged user.

Listing 26. Running jobs with TORQUE
  xcat1 $ id -un
  gabriel  
 
  xcat1 $ cd ~/TORQUE/jobs
 
  xcat1 $ more uname.pbs
  rel=/etc/redhat-release
  if [ -f $rel ]; then
    cat $rel
  fi       

  xcat1 $ qsub -l nodes=centos5 uname.pbs
  1.xcat1.cluster 

  xcat1 $ more uname.pbs.o1
  CentOS release 5.4 (Final)

Dynamic provisioning

The previous section showed that TORQUE runs a job requiring the CentOS operating system when there are nodes with CentOS. However, if a job requires an operating system that none of the nodes currently runs, the job won't run. Dynamic provisioning ensures the creation of nodes with the operating system that the jobs require.

The following mechanism is used to manage the execution environments:

  • The list of available execution environments includes the names of every operating system for which a network-booting image is installed on the management node under /install/netboot, which in this article are centos5 and fedora10.
  • Every job indicates its required execution environment using the job's nodes attribute (for example, qsub -l nodes=centos5 job.pbs).
  • Every node declares the execution environment currently installed by way of the properties attribute of the node.
  • TORQUE schedules a job to a node whose properties attribute matches the value of the nodes attribute of the job; the provisioning agent makes sure that such a node exists.

The properties attribute of a node is initialized when the node is added to TORQUE and is updated by the provisioning agent when it re-provisions the node.

Because TORQUE only allows you to submit a job if its nodes attribute matches the properties attribute of some execution node, create a dummy node for each execution environment that the management node supports. TORQUE also requires that all node names resolve to an IP address, so use the IPMI host names of the nodes (xcat2i and xcat3i) as dummy execution nodes, as shown in Listing 27.

Listing 27. Dummy Nodes
  xcat1 # qmgr -c "create node xcat2i"
  xcat1 # qmgr -c "set node xcat2i properties=centos5"

  xcat1 # qmgr -c "create node xcat3i"
  xcat1 # qmgr -c "set node xcat3i properties=fedora10"

  xcat1 # pbsnodes | grep -A3 "^xcat[23]i" | grep -v "\-"
  xcat2i
     state = down 
     np = 1
     properties = centos5
  xcat3i
     state = down
     np = 1
     properties = fedora10

The provisioning agent

The provisioning agent performs the following actions:

  1. Get the list C of compute nodes from xCAT, and for each node n in C, set the time T(n) when n is eligible for provisioning to the current time. At any time t, a node n is eligible for provisioning if t > T(n).
  2. Get the list of jobs from TORQUE and their required operating systems.
  3. Determine the list of compute nodes L that need to be provisioned with another operating system in order to satisfy the requirements of a pending job. Only nodes that are eligible for provisioning can be present in L.
  4. Configure each node n in L as follows:
    1. Set the operating system OS(n) with which n is to be provisioned.
    2. Remove n from C.
  5. For each node n in the list L, carry out re-provisioning as follows:
    1. Instruct TORQUE to stop scheduling jobs to n.
    2. When n becomes free of jobs, instruct xCAT to provision n with OS(n).
    3. When xCAT reports that n is booted with the new operating system, instruct TORQUE to enable scheduling jobs to n, set T(n) to the current time plus a quiescent time Q, and move n from L to C.
  6. Return to step 2.

The provisioning agent is implemented by the Perl program prov_agent.pm, which is available in the Download section. Start the provisioning agent by running this program as the root user.
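
To make step 5 concrete, the following shell sketch shows how one re-provisioning cycle for a single node (here, moving xcat3 to fedora10) could be carried out by hand with the commands already introduced in this article; prov_agent automates this loop and adds the eligibility and quiescent-time bookkeeping described above:

  node=xcat3; os=fedora10

  # Step 5.1: stop scheduling new jobs to the node
  pbsnodes -o $node

  # ... wait until pbsnodes no longer reports running jobs on $node ...

  # Step 5.2: re-provision the node with the new operating system
  chtab node=$node nodetype.os=$os
  nodeset $node netboot
  rpower $node boot

  # ... wait until nodestat reports "pbs,sshd" for $node ...

  # Step 5.3: publish the new environment and resume scheduling
  qmgr -c "set node $node properties = $os"
  pbsnodes -c $node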

Simple use case of the agent

Try out dynamic provisioning by submitting the job script shown in Listing 26, this time requesting the fedora10 execution environment. Listing 28 shows the change in the node configuration that results from this job.

Listing 28. Re-provisioning example
  xcat1 $ id -un
  gabriel  
 
  xcat1 $ pbsnodes | grep -A3 "^xcat[23]$" | grep -v "\-"
  xcat2
     state = free
     np = 2
     properties = centos5
  xcat3
     state = free
     np = 2
     properties = centos5
     
  xcat1 $ cd ~/TORQUE/jobs
  xcat1 $ qsub -l nodes=fedora10  uname.pbs
  7.xcat1.cluster
                   
  xcat1 $ pbsnodes | grep -A3 "^xcat[23]$" | grep -v "\-"
  xcat2
     state = free
     np = 2
     properties = fedora10
  xcat3
     state = free
     np = 2
     properties = centos5
      
  xcat1 $ more uname.pbs.o7
  Fedora release 10 (Cambridge)

While the provisioning agent uses simple rules for deciding when to re-provision a node and which node to select for re-provisioning, it illustrates the powerful idea of dynamically changing the "personality" of the compute nodes.

Conclusion

Combining xCAT with TORQUE and a provisioning agent in the management of a cluster creates an adaptive infrastructure in which the compute nodes are dynamically provisioned based on the requirements of the jobs. This adaptive infrastructure enables you to create a private or public cloud service using open source software.

Even though this article has described combining xCAT with the TORQUE workload and resource manager, you can combine xCAT with other open source or commercial workload and resource managers. We chose TORQUE in this article for its wide availability in Linux distributions and its ease of configuration.


Downloadable resources

