Advantages of openMosix on IBM xSeries, Part 3

Use networked Linux systems to solve your computing challenges

Daniel Robbins (drobbins@gentoo.org), President/CEO, Gentoo Technologies, Inc. (Courtesy Intel Corporation)
Daniel Robbins is the President/CEO of Gentoo Technologies, Inc. Residing in Albuquerque, New Mexico, he is the creator of Gentoo Linux, an advanced Linux for the PC, and the Portage system, a next-generation ports system for Linux. He writes articles, tutorials, and tips for the developerWorks Linux zone and has also served as a Contributing Author for the Macmillan books Caldera OpenLinux Unleashed, SuSE Linux Unleashed, and Samba Unleashed. Daniel has been involved with computers in some fashion since the second grade, when he was first exposed to the Logo programming language as well as a potentially dangerous dose of Pac Man. This probably explains why he has since served as a Lead Graphic Artist at SONY Electronic Publishing/Psygnosis. Daniel enjoys spending time with his wife, Mary, and daughter, Hadassah. You can contact him at drobbins@gentoo.org.

Summary:  By the end of this three-part series, you'll have your own openMosix mini-cluster up and running and will be ready to use it effectively to accelerate your computing tasks. In Part 1, you got a gentle introduction to the current clustering technologies available for Linux and an introduction to openMosix. In Part 2, you got a fully functional openMosix cluster configured and running. Finally, in this installment, you'll see some ways to use openMosix to tackle computing challenges.

Date:  10 Feb 2005
Level:  Introductory


Introduction

In this three-part series, I've been introducing you to a clustering technology for Linux called openMosix. As you may know, clustering technologies allow two or more networked Linux systems (called "nodes") to combine their computing resources to solve challenges faster than would be possible if they were tackling the problem on their own. Choice of hardware is flexible of course, and openMosix is not limited to any single hardware platform, but clusters built on IBM xSeries servers running Intel® Xeon™ processors have some unique advantages. Making use of performance-enhancing technologies such as Intel's Hyper-Threading Technology, now supported under Linux, improves the performance of multi-threaded applications by allowing a single Xeon processor to appear to the operating system as two virtual processors. By taking advantage of Hyper-Threading, you benefit from having multiple physical and/or virtual processors, and also enjoy the benefits of openMosix itself. As this series progresses, I'll guide you through the process of setting up your own openMosix cluster. By the end of the series, you'll have your own openMosix mini-cluster up and running and will be ready to use it to effectively accelerate your computing tasks.

In this third and final article, I'll show you various ways of using openMosix to tackle computing challenges. As an example openMosix application, I'll use FLAC (a free lossless audio encoder) to compress several hundred megabytes of CD-quality audio data. By following along, you'll learn how to use your own openMosix cluster efficiently and effectively.


Audio encoding with openMosix

Now we're ready to put our openMosix cluster to work on a real computing task. For this article, I've decided to demonstrate openMosix's capabilities by using FLAC, a free lossless audio encoder. FLAC is a program that allows you to compress digital audio bitstreams so that they take up less space on disk.

However, unlike MPEG-1 Layer 3 (MP3) encoding, FLAC is "lossless": no data is lost in the encoding process. That means that when you uncompress your FLAC-encoded data, you'll end up with a pristine digital bitstream rather than a digitally degraded version of your original data. FLAC allows most music data to be compressed to around 65% of its original size, and certain types of audio data can be compressed even more dramatically.

There are several reasons why audio and video encoding applications are an excellent match for openMosix. For one, they are generally very CPU-intensive, and have significant but not excessive I/O demands. Because the encoding processes are CPU-intensive, we'll be able to see a big improvement as openMosix migrates these processes to various nodes in our cluster. In addition, FLAC's I/O demands will probably not overwhelm our LAN, allowing our processes to execute on remote nodes in an efficient manner. In combination, these traits will allow our audio encoding performance to scale nearly linearly with the number of nodes in our cluster.

In addition to the CPU and I/O factors, audio and video encoding jobs typically run for at least several minutes, and that gives openMosix plenty of opportunity to migrate the encoding processes to the machines where they can execute most quickly. In contrast, if we needed to run thousands of processes that only had a lifetime of about 2 seconds apiece, openMosix's process migration would never kick in. Why? Because the cost of migrating such short-lived processes to a remote system outweighs the benefits of doing so.

Fortunately for us, openMosix is smart enough to realize this and does an excellent job of only migrating processes in cases where doing so will result in a tangible performance gain. In general, a process isn't considered for migration until it is at least 5 seconds old. The exact time before a process is considered for migration depends upon the performance of your cluster.


Installing FLAC

Now it's time to get FLAC up and running. To do this, head over to http://flac.sourceforge.net and download the latest FLAC sources, which are found under the "download" link. Then, install FLAC onto a single node in your cluster as follows:

# tar xzvf flac-1.0.2-src.tar.gz
# cd flac-1.0.2
# ./configure
# make
# make install

If building fails due to a missing "docbook2man," simply install docbook2man or edit the Makefile and comment out the "man" Makefile target. After the "make install" completes, FLAC is installed and ready for use. Again, note that you only need to install FLAC on one node in your cluster—yet another way that openMosix makes our life easier.

Before we get to the audio encoding demo, let's consider how we'd approach the task of distributing a bunch of FLAC processes across our machines if we didn't have a transparent process migration technology like openMosix. For one, we'd need to install and maintain FLAC on every node in our cluster, which could be time consuming. In addition, we'd also need some way to ensure that our audio data is accessible to all our machines, probably by using and configuring something like NFS. Then, we could start a FLAC process on each machine, sit back and watch FLAC encode our audio data in parallel. On second thought, maybe we couldn't exactly "sit back"—if one machine finished its work ahead of the others, we'd need to move some of our workload to the idle machines. Or, if one of our machines started performing some other demanding task, it's possible that our FLAC encoding process on that machine would grind to a near halt, and we'd need to manually move the workload to another machine.

All that hand-holding is neither fun nor trivial, and fortunately for us, with openMosix, none of it is necessary.

Also, depending on the machines being used in our cluster, it may not be possible to install FLAC on every node. Consider a situation where your cluster is composed of a bunch of dual-use systems, such as office PCs. In that case, manually installing FLAC on each system may be tricky. You'd need to get access to each machine and log in as root to install FLAC, and that may not be something that the primary user of the machine would want you to do.

In contrast, openMosix is an ideal fit for these types of dual-use clusters because the openMosix process migration technology does all the legwork for you.

Thanks to openMosix, you can safely and securely take advantage of a bunch of office PCs—all they need to have is an openMosix-enabled kernel. Because of this, openMosix allows clusters to be set up in environments where it may not have been practical before.


Audio encoding using openMosix

Now back to our audio encoding project. I've configured my small openMosix cluster so that it's ready to start crunching away. My mini-openMosix cluster has two nodes: inventor, my desktop machine (900 MHz), and sidekick (650 MHz). To begin my tests, I created a /tmp/flac directory on inventor and placed approximately 560 MB of CD-quality audio data in this directory, in the form of .wav files:

# ls *.wav
track01.cdda.wav  track05.cdda.wav  track09.cdda.wav  track13.cdda.wav
track02.cdda.wav  track06.cdda.wav  track10.cdda.wav
track03.cdda.wav  track07.cdda.wav  track11.cdda.wav
track04.cdda.wav  track08.cdda.wav  track12.cdda.wav

In order to have a good feel for the power of your cluster, it's a good idea to perform some number-crunching tests using only a single node in your cluster. Then, you'll be able to determine how good a job openMosix is doing by comparing one-node performance to multi-node performance. In that spirit, I decided to FLAC-encode all the audio data using *only* inventor. To do this, I typed the following shell mini-script at the bash prompt:

# for x in *.wav
do
   flac -8 $x
done

This ad-hoc script tells bash to run "flac" to encode each audio track in succession. Using this script, no more than one process will run at a time. And because inventor happens to be my fastest node, the currently running encoding process runs locally: openMosix realizes that the process should be kept right where it is for maximum performance. Note that if I wanted to explicitly disable process migration for these processes, I would type "runhome flac -8 $x" on line 3 instead. After this mini-script completed, I had a bunch of ".flac" files in my current directory, alongside my ".wav" files.
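Because FLAC is lossless, one way to gain confidence in the newly created files is to have flac test-decode each of them. The following is only a sketch: it assumes that your flac build supports the "-t" (test) option, and "check_flac" is a hypothetical helper name that is not part of FLAC itself.

```shell
# check_flac FILE...: run "flac -t" on each file and report any that fail.
# Returns 0 only if every file passes its integrity test.
check_flac() {
    failed=0
    for f in "$@"; do
        # -t decodes the stream without writing any output file
        if ! flac -t "$f" >/dev/null 2>&1; then
            echo "FAILED: $f"
            failed=$((failed + 1))
        fi
    done
    [ "$failed" -eq 0 ]
}
```

With the helper defined, typing "check_flac *.flac" in the /tmp/flac directory prints one line for each file that fails the test and prints nothing if all of them pass.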

Using a single node (inventor), the encoding of 560 MB of audio data completed in 13 minutes and 13 seconds.

Now that I had a baseline single-node performance number, I removed all the new *.flac files and decided to start all my FLAC encoding processes at once. By doing so, openMosix would certainly distribute these processes throughout my cluster (of 2 nodes!) so that each flac process could take advantage of my cluster's CPU horsepower. To start all flac processes in parallel, I typed:

# for x in *.wav
do
  flac -8 $x &
done

Thanks to the addition of a "&", all the encoding tasks started immediately. Using "mmon," I watched the load on node 1 shoot up to around 14. However, after several seconds, the load on node 1 started dropping and node 2's load started rising: processes were migrating! Within 30 seconds or so, the CPU load was evenly distributed between my two nodes as FLAC chugged away. This time, my 560 MB of audio data was encoded in 8 minutes and 12 seconds, or 62% of my original time, which is quite good considering that sidekick is significantly slower than inventor. In fact, 8 minutes and 12 seconds is very close to the ideal of perfect linear scaling (which would have resulted in a completion time of probably a bit under 8 minutes). If you have more than two nodes in your cluster, or even just two equally powered nodes, then you can expect to see an even greater improvement than the reduction to 62% of baseline time that I experienced on my openMosix cluster.
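One practical wrinkle with the backgrounded loop is that the shell prompt returns right away, so the loop itself cannot be timed directly. A minimal sketch of one way around this is to wrap the loop in a function that ends with the shell's "wait" builtin. The name "run_parallel" is my own invention for illustration and is not an openMosix tool.

```shell
# run_parallel "CMD" FILE...: start CMD once per file in the background,
# then block until every background job has exited.
run_parallel() {
    cmd=$1
    shift
    for f in "$@"; do
        $cmd "$f" &    # $cmd is unquoted on purpose so "flac -8" splits into words
    done
    wait               # returns only after all background jobs have finished
}
```

In bash, typing 'time run_parallel "flac -8" *.wav' should then report the wall-clock time for the whole batch, since "wait" holds the function open until the last encoder exits, whichever node it ran on.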


Remote execution overhead

I decided to take a look at exactly how much overhead was involved in running an encoding process away from its home node. To do this, I forced all encoding processes to run on sidekick by using the following script:

# for x in *.wav
do
  mosrun -2 flac -8 $x &
done

The "mosrun -2" prefix causes the flac processes to run on node 2, which is sidekick. Since I executed this script on inventor, all the flac processes ran remotely and relied on openMosix to ferry audio data back and forth over the network on their behalf. As expected, sidekick completed the encoding jobs more slowly than inventor, taking 19 minutes and 3 seconds to complete.

Next, I physically copied my ".wav" files to sidekick, installed flac on sidekick, and ran flac on sidekick with migration disabled (by using the "runhome" command). I did this so that I could compare the difference between running all the processes on sidekick via openMosix process migration as compared to running all the processes on sidekick the old-fashioned manual way.

Run directly on sidekick, the flac processes completed in 18 minutes and 19 seconds. Based on this and the previous timing data, I calculated that running flac remotely via openMosix incurs a 4% remote execution overhead—a very reasonable number! Note that I'm using full-duplex Fast Ethernet; if you're using Gigabit Ethernet, then your remote execution overhead will probably be significantly lower than mine.
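For readers who want to check the arithmetic, here is a small sketch that recomputes the percentages from the four timings quoted in this article. The "ideal" figure rests on an assumption of mine: that the work could be split perfectly between the two nodes in proportion to their measured single-node speeds. The helper names "secs" and "report" are hypothetical.

```shell
# secs MIN SEC: convert a minutes/seconds timing to whole seconds.
secs() { echo $(( $1 * 60 + $2 )); }

inventor=$(secs 13 13)   # single-node run on inventor
cluster=$(secs 8 12)     # both nodes, processes migrated by openMosix
remote=$(secs 19 3)      # all jobs forced onto sidekick from inventor
direct=$(secs 18 19)     # all jobs started directly on sidekick

# report: print the derived figures from the timings above.
report() {
    awk -v a="$inventor" -v c="$cluster" -v r="$remote" -v d="$direct" 'BEGIN {
        printf "cluster time vs. baseline: %.0f%%\n", 100 * c / a
        printf "remote execution overhead: %.1f%%\n", 100 * (r - d) / d
        # Assumption: combined throughput is the sum of the two node speeds.
        ideal = 1 / (1 / a + 1 / d)
        printf "ideal two-node time: %d min %d s\n", int(ideal / 60), int(ideal % 60 + 0.5)
    }'
}

report
```

Run as written, this prints 62% for the cluster time relative to the baseline, 4.0% for the remote execution overhead, and an ideal two-node time of 7 min 41 s, which agrees with the "a bit under 8 minutes" estimate given earlier.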

As you can see, openMosix is a powerful solution for intelligently distributing work across a cluster of Linux machines. In best-case scenarios, openMosix scales almost linearly with the CPU horsepower of the cluster, and openMosix has a very low remote execution overhead to boot. I hope you've enjoyed this series; most of all, I hope that you enjoy the benefits of putting openMosix to work for you!


