In this three-part series, I’ve been introducing you to a clustering technology for Linux called openMosix. As you may know, clustering technologies allow two or more networked Linux systems (called "nodes") to combine their computing resources to solve challenges faster than would be possible if they were tackling the problem on their own. Choice of hardware is flexible of course, and openMosix is not limited to any single hardware platform, but clusters built on IBM xSeries servers running Intel® Xeon™ processors have some unique advantages. Making use of performance-enhancing technologies such as Intel's Hyper-Threading Technology, now supported under Linux, improves the performance of multi-threaded applications by allowing a single Xeon processor to appear to the operating system as two virtual processors. By taking advantage of Hyper-Threading, you benefit from having multiple physical and/or virtual processors, and also enjoy the benefits of openMosix itself. As this series progresses, I'll guide you through the process of setting up your own openMosix cluster. By the end of the series, you'll have your own openMosix mini-cluster up and running and will be ready to use it to effectively accelerate your computing tasks.
In this third and final article, I'll show you various ways of using openMosix to tackle computing challenges. As an example openMosix application, I'll use FLAC (a free lossless audio encoder) to compress several hundred megabytes of CD-quality audio data. By following along, you'll learn how to use your own openMosix cluster efficiently and effectively.
Now we're ready to put our openMosix cluster to work on a real computing task. For this article, I've decided to demonstrate openMosix's capabilities by using FLAC, a free lossless audio encoder. FLAC is a program that allows you to compress digital audio bitstreams so that they take up less space on disk.
However, unlike MPEG-1 Layer 3 (MP3) encoding, FLAC is "lossless": no data is lost in the encoding process. That means that when you uncompress your FLAC-encoded data, you'll end up with a pristine digital bitstream rather than a digitally degraded version of your original data. FLAC compresses most music data to around 65% of its original size, and certain types of audio data can be compressed even more dramatically.
There are several reasons why audio and video encoding applications are an excellent match for openMosix. For one, they are generally very CPU-intensive, and have significant but not excessive I/O demands. Because the encoding processes are CPU-intensive, we'll be able to see a big improvement as openMosix migrates these processes to various nodes in our cluster. In addition, FLAC's I/O demands will probably not overwhelm our LAN, allowing our processes to execute on remote nodes in an efficient manner. In combination, these traits will allow our audio encoding performance to scale nearly linearly with the number of nodes in our cluster.
In addition to the CPU and I/O factors, audio and video encoding jobs typically run for at least several minutes, and that gives openMosix plenty of opportunity to migrate the encoding processes to the machines where they can execute most quickly. In contrast, if we needed to run thousands of processes that only had a lifetime of about 2 seconds apiece, openMosix's process migration would never kick in. Why? Because the cost of migrating such short-lived processes to a remote system outweighs the benefits of doing so.
Fortunately for us, openMosix is smart enough to realize this and does an excellent job of only migrating processes in cases where doing so will result in a tangible performance gain. In general, a process isn't considered for migration until it is at least 5 seconds old. The exact time before a process is considered for migration depends upon the performance of your cluster.
Now it's time to get FLAC up and running. To do this, head over to http://flac.sourceforge.net and download the latest FLAC sources, which are found under the "download" link. Then, install FLAC onto a single node in your cluster as follows:
    # tar xzvf flac-1.0.2-src.tar.gz
    # cd flac-1.0.2
    # ./configure
    # make
    # make install
If building fails due to a missing "docbook2man," simply install docbook2man or edit the Makefile and comment out the "man" Makefile target. After the "make install" completes, FLAC is installed and ready for use. Again, note that you only need to install FLAC on one node in your cluster—yet another way that openMosix makes our life easier.
Before we get to the audio encoding demo, let's consider how we'd approach the task of distributing a bunch of FLAC processes across our machines if we didn't have a transparent process migration technology like openMosix. For one, we'd need to install and maintain FLAC on every node in our cluster, which could be time consuming. In addition, we'd also need some way to ensure that our audio data is accessible to all our machines, probably by using and configuring something like NFS. Then, we could start a FLAC process on each machine, sit back and watch FLAC encode our audio data in parallel. On second thought, maybe we couldn't exactly "sit back"—if one machine finished its work ahead of the others, we'd need to move some of our workload to the idle machines. Or, if one of our machines started performing some other demanding task, it's possible that our FLAC encoding process on that machine would grind to a near halt, and we'd need to manually move the workload to another machine.
All that hand-holding is neither fun nor trivial. Fortunately for us, openMosix makes all that hand-holding and maintenance unnecessary.
Also, depending on the machines being used in our cluster, it may not be possible to install FLAC on every node. Consider a situation where your cluster is composed of a bunch of dual-use systems, such as office PCs. In that case, manually installing FLAC on each system may be tricky. You'd need to get access to each machine and log in as root to install FLAC, and that may not be something that the primary user of the machine would want you to do.
In contrast, openMosix is an ideal fit for these types of dual-use clusters because the openMosix process migration technology does all the legwork for you.
Thanks to openMosix, you can safely and securely take advantage of a bunch of office PCs—all they need to have is an openMosix-enabled kernel. Because of this, openMosix allows clusters to be set up in environments where it may not have been practical before.
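To make the manual approach described above concrete, here is a minimal sketch of what a static, hand-rolled distribution might look like. It only prints the commands rather than running them, and it assumes things this article does not set up: ssh access to each node, FLAC installed on every node, and the audio directory shared at the same path on all machines (for example over NFS). The plan_static function name is made up for illustration.

```shell
# plan_static: print (do not run) one ssh command per file, assigning
# files to hosts round-robin.
# Assumptions: flac is installed on every host, and the working directory
# is shared at the same path on all of them (for example over NFS).
# Usage: plan_static "host1 host2 ..." file...
plan_static() {
    hosts=$1
    shift
    # Count the hosts by loading them into a subshell's positional parameters.
    n=$(set -- $hosts; echo $#)
    i=0
    for f in "$@"; do
        # Select host number (i mod n), counting from zero.
        host=$(set -- $hosts; shift $((i % n)); echo "$1")
        echo "ssh $host flac -8 $f"
        i=$((i + 1))
    done
}

plan_static "inventor sidekick" track01.cdda.wav track02.cdda.wav track03.cdda.wav
```

Notice that the assignment is fixed up front: if sidekick falls behind, nothing moves its remaining files over to inventor. That rebalancing is the part openMosix handles for us.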
Audio encoding using openMosix
Now back to our audio encoding project. I've configured my small openMosix cluster so that it's ready to start crunching away. My mini-openMosix cluster has two nodes: inventor, my desktop machine (900MHz), and sidekick (650MHz). To begin my tests, I created a /tmp/flac directory on inventor and placed approximately 560MB of CD-quality audio data in this directory, in the form of .wav files:
    # ls *.wav
    track01.cdda.wav  track05.cdda.wav  track09.cdda.wav  track13.cdda.wav
    track02.cdda.wav  track06.cdda.wav  track10.cdda.wav
    track03.cdda.wav  track07.cdda.wav  track11.cdda.wav
    track04.cdda.wav  track08.cdda.wav  track12.cdda.wav
In order to have a good feel for the power of your cluster, it's a good idea to perform some number-crunching tests using only a single node in your cluster. Then, you'll be able to determine how good a job openMosix is doing by comparing one-node performance to multi-node performance. In that spirit, I decided to FLAC-encode all the audio data using *only* inventor. To do this, I typed the following shell mini-script at the bash prompt:
    # for x in *.wav
    do
        flac -8 $x
    done
This ad-hoc script tells bash to run "flac" to encode each audio track in succession. Using this script, no more than one process will run at a time. And because inventor happens to be my fastest node, the currently running encoding process runs locally. OpenMosix realizes that the process should be kept right where it is for maximum performance. Note that if I wanted to explicitly disable process migration for these processes, I would type "runhome flac -8 $x" on line 3 instead. After this mini-script completed, I had a bunch of ".flac" files in my current directory, alongside my ".wav" files.
Using a single node (inventor), the encoding of 560MB of audio data completed in 13 minutes and 13 seconds.
Now that I had a baseline single-node performance number, I removed all the new *.flac files and decided to start all my FLAC encoding processes at once. By doing so, openMosix would certainly distribute these processes throughout my cluster (of 2 nodes!) so that each flac process could take advantage of my cluster's CPU horsepower. To start all flac processes in parallel, I typed:
    # for x in *.wav
    do
        flac -8 $x &
    done
Thanks to the addition of a "&", all the encoding tasks started immediately. Using "mmon," I watched the load on node 1 shoot up to around 14. However, after several seconds, the load on node 1 started dropping and node 2's load started rising: processes were migrating! Within 30 seconds or so, the CPU load was evenly distributed between my two nodes as FLAC chugged away. This time, my 560MB of audio data was encoded in 8 minutes and 12 seconds, or 62% of my original time, which is quite good considering that sidekick is significantly slower than inventor. In fact, 8 minutes and 12 seconds is very close to the ideal of perfect linear scaling of performance (which would probably have resulted in a completion time of a bit under 8 minutes). If you have more than two nodes in your cluster, or even just two equally powered nodes, then you can expect to see even greater performance improvements than the reduction to 62% of baseline time that I experienced on my openMosix cluster.
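One practical detail when timing a batch like this: because of the "&", the for loop returns as soon as the jobs have been launched, so the shell has to wait for the background jobs before the elapsed time means anything. This article does not say how the timings were taken; the following is only a minimal sketch of one way to do it. The encode_all function and the ENCODER variable are names invented here for illustration.

```shell
# encode_all: start one background encoder per file, then wait for all of them.
# ENCODER is a hypothetical override for dry runs; it defaults to "flac -8".
encode_all() {
    for x in "$@"; do
        # Left unquoted on purpose so "flac -8" splits into command and option.
        ${ENCODER:-flac -8} "$x" &
    done
    wait    # block until every background job has exited
}

# In bash, the whole batch can then be timed with:
#   time encode_all *.wav
```

Without the wait, timing the loop would report only the fraction of a second needed to start the jobs, not the minutes spent encoding.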
I decided to take a look at exactly how much overhead was involved in running an encoding process away from its home node. To do this, I forced all encoding processes to run on sidekick by using the following script:
    # for x in *.wav
    do
        mosrun -2 flac -8 $x &
    done
The "mosrun -2" prefix causes the flac processes to run on node 2, which is sidekick. Since I executed this script on inventor, all the flac processes will be running remotely and will be relying on openMosix to ferry audio data back and forth over the network on their behalf. As expected, sidekick completed the encoding jobs more slowly than inventor, taking 19 minutes and 3 seconds to complete.
Next, I physically copied my ".wav" files to sidekick, installed flac on sidekick, and ran flac on sidekick with migration disabled (by using the "runhome" command). I did this so that I could compare the difference between running all the processes on sidekick via openMosix process migration as compared to running all the processes on sidekick the old-fashioned manual way.
Run directly on sidekick, the flac processes completed in 18 minutes and 19 seconds. Based on this and the previous timing data, I calculated that running flac remotely via openMosix incurs a 4% remote execution overhead—a very reasonable number! Note that I'm using full-duplex Fast Ethernet; if you're using Gigabit Ethernet, then your remote execution overhead will probably be significantly lower than mine.
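The percentages quoted in this article can be re-derived from the four wall-clock times. The sketch below does the arithmetic with awk. The "ideal" figure rests on a simplification that the measurements themselves do not establish: it assumes that the two nodes' throughputs simply add, and that the single-node runs on inventor and on sidekick are directly comparable.

```shell
# Timings reported in the article, converted to seconds.
single_inventor=$((13 * 60 + 13))   # 793:  all jobs on inventor only
cluster=$((8 * 60 + 12))            # 492:  both nodes, openMosix migrating freely
remote_sidekick=$((19 * 60 + 3))    # 1143: forced onto sidekick with mosrun, started on inventor
local_sidekick=$((18 * 60 + 19))    # 1099: run directly on sidekick with local files

summary=$(awk -v a="$single_inventor" -v c="$cluster" \
              -v r="$remote_sidekick" -v l="$local_sidekick" 'BEGIN {
    printf "cluster time / single-node time: %.0f%%\n", 100 * c / a
    # Ideal two-node time if the nodes simply added their throughputs (assumption).
    printf "ideal two-node time: %.0f s\n", 1 / (1 / a + 1 / l)
    printf "remote execution overhead: %.0f%%\n", 100 * (r / l - 1)
}')
echo "$summary"
```

This prints 62%, 461 s (about 7 minutes and 41 seconds), and 4%, in line with the figures quoted above.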
As you can see, openMosix is a powerful solution for intelligently distributing work across a cluster of Linux machines. In best-case scenarios, openMosix scales almost linearly with the CPU horsepower of the cluster, and openMosix has a very low remote execution overhead to boot. I hope you've enjoyed this series; most of all, I hope that you enjoy the benefits of putting openMosix to work for you!
- Read the other installments in this series:
  - "Advantages of OpenMosix on IBM xSeries, Part 1" (developerWorks, October 2002)
  - "Advantages of OpenMosix on IBM xSeries, Part 2" (developerWorks, October 2002)
- Visit the home of the openMosix project on SourceForge.
- Check out Qlusters, Inc., the developers of next-generation openMosix functionality.
- Keep up with all the stuff the project leader of openMosix is doing at Moshe Bar's home page.
- Visit the Intel Developer Web site.
Daniel Robbins is the President/CEO of Gentoo Technologies, Inc. Residing in Albuquerque, New Mexico, he is the creator of Gentoo Linux, an advanced Linux for the PC, and the Portage system, a next-generation ports system for Linux. He writes articles, tutorials, and tips for the developerWorks Linux zone and has also served as a Contributing Author for the Macmillan books Caldera OpenLinux Unleashed, SuSE Linux Unleashed, and Samba Unleashed. Daniel has been involved with computers in some fashion since the second grade, when he was first exposed to the Logo programming language as well as a potentially dangerous dose of Pac Man. This probably explains why he has since served as a Lead Graphic Artist at SONY Electronic Publishing/Psygnosis. Daniel enjoys spending time with his wife, Mary, and daughter, Hadassah. You can contact him at drobbins@gentoo.org.




