Remote computing with a Linux application server farm

Case study: Network booting, VNC, and SSL cooperate to yield big benefits

Level: Advanced

Kyler Laird (Kyler@lairds.com), Research analyst, Student
Cameron Laird (claird@phaseit.net), Vice president, Phaseit Inc.

06 Feb 2007

You've heard of Web 2.0, right? Well, here's "utility computing 2.0," a combination of network booting, SSL, VNC, and other familiar concepts and technologies -- all on Linux® -- that can yield dramatic returns on investment. See how the University of California, Merced set up a server farm environment to provide secure remote desktop application services for students.

Save thousands of dollars. Significantly enhance usability. Strengthen security. Ease maintenance. Reduce dependence on single sources or expensive components. Simplify licensing.

We're pleased when a development project achieves even one of these advantages. When all of them come together -- well, it's what we call "utility computing 2.0."

These big payoffs come when we combine a bundle of individually mundane techniques; in isolation, no single one is particularly dramatic. Complicating the picture is that we use distinct, if somewhat overlapping, bundles for different projects. This article illustrates this bundling approach in a project using net-booting, low-power computing, VNC, and SSL.

Net-booting college computing lab

The School of Engineering of the University of California, Merced (UCM) maintains two 30-seat computing labs where students do homework. One lab had conventional Windows® XP desktop hosts, but during a recent replacement cycle, the lab switched to ITX Mini-Box M200s driving 24-inch Dell LCD displays set to 1280x1024. These seats cost only USD $1,138 each, even bought singly. Table 1 compares costs of the Mini-Box and a comparable conventional desktop system.


Table 1. Basic desktop costs
Item                 Mini-Box    Conventional desktop
CPU                  USD $275    USD $1,100
24-inch monitor      USD $747    USD $747
Mouse, keyboard      USD $36     USD $36
1GB memory           USD $80     USD $180
Nominal power drain  15 watts    200 watts

The default workstation available for purchase for labs such as this costs almost twice as much: USD $2,063. Also, the savings of about 100 watts on average represent, at prevailing California electricity prices, around USD $10 per month per seat, or well over USD $100 per year. That advantage more than triples when you consider the reduced load on the air-conditioning facility for the labs. Another way to look at it: savings on the CPU make the big 24-inch displays feasible. These thin-but-wide clients are very popular with users.
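The arithmetic behind these figures can be checked with a short sketch. The prices and the 100-watt saving come from Table 1 and the text above; the electricity rate is an assumed value chosen only for illustration, so the dollar figures for power are approximate.

```python
# Prices come from Table 1; the electricity rate is an assumed value,
# not a figure from the article.
MINI_BOX = {"cpu": 275, "monitor": 747, "mouse_keyboard": 36, "memory_1gb": 80}
CONVENTIONAL = {"cpu": 1100, "monitor": 747, "mouse_keyboard": 36, "memory_1gb": 180}

ASSUMED_RATE_USD_PER_KWH = 0.14   # assumption for illustration only
AVERAGE_WATTS_SAVED = 100         # "about 100 watts" from the text
HOURS_PER_YEAR = 24 * 365


def seat_cost(parts):
    """Total purchase price of one seat in USD."""
    return sum(parts.values())


def yearly_power_savings(watts, rate):
    """USD saved per year by drawing `watts` fewer watts around the clock."""
    kwh_per_year = watts * HOURS_PER_YEAR / 1000
    return kwh_per_year * rate


if __name__ == "__main__":
    print("Mini-Box seat:     USD", seat_cost(MINI_BOX))
    print("Conventional seat: USD", seat_cost(CONVENTIONAL))
    yearly = yearly_power_savings(AVERAGE_WATTS_SAVED, ASSUMED_RATE_USD_PER_KWH)
    print("Power savings: USD %.2f/year, USD %.2f/month" % (yearly, yearly / 12))
```

The seat totals match the USD $1,138 and USD $2,063 quoted in the text. With a different local rate, only ASSUMED_RATE_USD_PER_KWH needs to change.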

On the other hand, a standard workstation is roughly three times as fast as the miserly little Mini-Box. That's okay, though, because the lab computers are plenty fast enough for the jobs they're called on to do: basic report preparation, e-mail, Web research, classroom development assignments, and display of remote results.

Here's how these hosts operate on a system level. Each seat is set for net-booting from a single boot server. All machines are essentially identical; each boots with its built-in PXE loader in roughly 73 seconds to Ubuntu Linux with a 2.6 kernel. During boot-up, an individual workstation acquires read-only, NFS-mounted working disk space, and UnionFS combines it with tmpfs to create a read-write root filesystem. Once users log in -- another 23 seconds -- they see their home filesystem in /home/username, a link to the mount point.
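To make the read-only-plus-tmpfs arrangement concrete, here is a toy model of union-mount semantics in Python. It is only a conceptual sketch, not how UnionFS is implemented: the class name UnionRoot and its methods are hypothetical, and plain dictionaries stand in for the NFS export and the tmpfs layer.

```python
class UnionRoot:
    """Toy model of a union mount: a writable layer over a read-only one."""

    def __init__(self, readonly_layer):
        self._lower = readonly_layer   # stands in for the shared NFS export
        self._upper = {}               # stands in for tmpfs; gone after a reboot
        self._whiteouts = set()        # paths deleted on this seat only

    def read(self, path):
        """Look in the writable layer first, then fall back to the shared one."""
        if path in self._whiteouts:
            raise FileNotFoundError(path)
        if path in self._upper:
            return self._upper[path]
        if path in self._lower:
            return self._lower[path]
        raise FileNotFoundError(path)

    def write(self, path, data):
        """All writes land in the writable layer; the shared layer is untouched."""
        self._whiteouts.discard(path)
        self._upper[path] = data

    def remove(self, path):
        """Hide a path without modifying the shared layer."""
        self.read(path)                # raises FileNotFoundError if absent
        self._upper.pop(path, None)
        self._whiteouts.add(path)


if __name__ == "__main__":
    shared = {"/etc/motd": "Welcome to the lab"}
    seat = UnionRoot(shared)
    seat.write("/etc/motd", "changed by a user with root")
    print(seat.read("/etc/motd"))               # the seat sees its own change
    print(UnionRoot(shared).read("/etc/motd"))  # a "rebooted" seat does not
```

The last two lines illustrate why a reboot returns a seat to its default state: every change lives only in the per-seat writable layer.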



Log in...and out, and all around

Once a seat finishes booting, it displays a conventional-looking login prompt. The getty isn't connected to /etc/passwd or LDAP, however; it's connected to SSH! The /etc/inittab shared by all seats includes the line:


Listing 1. Boot-up launch of getty
1:2345:respawn:/sbin/rungetty -u root tty1 /usr/local/sbin/UCM_login
	

/usr/local/sbin/UCM_login itself is a three-hundred-line Python script that:

  • Launches all logins within screen, so they're remotely debuggable
  • Sets certain timeout, logging, terminal-driver, and related "housekeeping" configuration variables
  • Converses with a "status server," which monitors the labs as a whole
  • Prints a welcome and usage message
  • Prompts the user for a login

This last is unusual. As mentioned above, the login isn't a standard one authenticating with /etc/passwd or even PAM. Instead, it has the intelligence to handle such accounts as:

  • guest@localhost, an account at the seat itself. For many tasks, of course, simply launching a Web browser is enough. This account demands no password and receives no persistent storage.

  • Any account in the UCM LDAP store, like klaird

  • Any other ssh account, anywhere in the connected world. This means that a visitor to the campus can connect to his usual home directory with a login like someone@yale.edu.

The design of these labs and workstations makes it safe and even natural to give all these logins, including even guest@localhost, privileges to sudo, access audio, and otherwise use all the M200's capabilities.

Campus-wide logins, like klaird, are actually not resolved directly by LDAP, but sent by way of ssh to a dedicated machine on the LAN, which itself knows how to authenticate requests against LDAP and automatically create corresponding accounts. This has simplified the configuration of the lab seats and maintenance of the dedicated ssh-LDAP host.
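The three kinds of login described above could be told apart along the following lines. This is a minimal sketch rather than the actual UCM_login code, which is not published here; the function route_login, the Route tuple, and the hostname in CAMPUS_AUTH_HOST are all hypothetical names introduced for illustration.

```python
from collections import namedtuple

# Hypothetical name for the dedicated ssh-to-LDAP machine; the article
# does not give its real hostname.
CAMPUS_AUTH_HOST = "ldap-gateway.example.edu"

Route = namedtuple("Route", "kind user host")


def route_login(login):
    """Decide where a login typed at the seat should be authenticated."""
    login = login.strip()
    if (not login or login.startswith("@") or login.endswith("@")
            or login.count("@") > 1):
        raise ValueError("unusable login: %r" % login)
    user, _, host = login.partition("@")
    if not host:
        # Bare campus names such as "klaird" go by ssh to the LDAP-aware host.
        return Route("campus", user, CAMPUS_AUTH_HOST)
    if user == "guest" and host == "localhost":
        # Password-free account on the seat itself, with no persistent storage.
        return Route("local-guest", user, host)
    # Anything else is treated as an ssh account somewhere on the network.
    return Route("remote-ssh", user, host)


if __name__ == "__main__":
    for typed in ("guest@localhost", "klaird", "someone@yale.edu"):
        print(typed, "->", route_login(typed))
```

Keeping this decision in one small function means the seat itself never needs LDAP configuration; it only needs to know which host to ssh to.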

There's no harm in allowing -- even encouraging -- root logins, because there's no local storage, and each seat can be quickly rebooted back to its default state. Security (beyond the scope of this article) limits the potential damage done by a hooligan with root at an individual seat.

The result is that students can walk into the lab at any time and access their usual $HOME. Or, they can treat the lab seat as a stand-alone box for installation of specialized engineering software. Or, the lab seat can simply be a highly capable display for viewing remote scientific computations, a collaboration hub for group projects, or a stand-alone access point for connecting to the Web. Or, if you need an application that is available only under Windows, VNC served by a couple of high-performing Windows XP boxes gives plenty of access to that operating system for those who need it.

Notice, by the way, how SSL-for-VNC leverages these other elements. As explained in our previous article, "SSL secures VNC applications," you can view X11-based displays, among others, remotely and securely through a Web browser. This technique further reduces the learning curve or "activation energy" for a student or professor who wants to share a computation, even for a one-time-only or limited-use URL.



Advantages

In any of the roles listed above, all sessions and displays are accessible remotely with screen and VNC, so technical support can act swiftly and insightfully. In the rare event a host breaks, or the more likely one that there's a need to scale up, it takes seconds to pull out another Mini-Box, attach power, keyboard, video, mouse, and network connections, and boot it, ready for action. The benefits are so evident, and the costs so low, that other facilities on campus have begun to ask for their own inventories of M200s.

The uniformity and utility of this variation on thin-client computing has freed up attention for less common benefits. With each seat serving screen and VNC, technical support is both simplified and physically decoupled. It takes fewer assistants to maintain operations, and many times these assistants need not even be in the lab.

As it becomes comfortable to move logins, displays, processes, and desktops around, instructors and teaching assistants find it advantageous to check in on student work "live." Two workstations have been dedicated to big-screen projection and other multimedia playback, with two more scheduled for the future. It's easy to quickly set up shared displays on nearby workstations, or send them to the projector in the lab or a classroom. A student can develop a presentation in the lab, then initiate it interactively for the whole class. And, because VNC clients are themselves ubiquitous, it's a simple matter for a scholar upstairs or across the country to make a demonstration available for those working in the labs.

Most of these features take only a few lines of coding and configuration, which are perhaps unimpressive in isolation. This project, for instance, created a standard way to securely forward a display. First is a single-line shell script:


Listing 2. Send display to another seat
socat TCP4:localhost:5900 "EXEC:ssh guest@projector -i /usr/local/etc/display_id_dsa"
	

The ~guest/.ssh/authorized_keys holds:


Listing 3. Keys for reception of a VNC projection
command="exec bin/remote_display",no-port-forwarding,no-X11-forwarding,no-agent-forwarding ssh-dss AAAAB3NzaC1kc3MAAAEBAKhMDgnFAgYBh4Xega...
	

~guest/bin/remote_display is a file with its execution bit set, and containing:


Listing 4. Standard VNC viewer launching
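#!/bin/sh
# Draw on the projector's local X display.
export DISPLAY=:0
export XAUTHORITY=/tmp/Xauthority
# Start a shared viewer session fed by the incoming ssh connection.
/usr/src/vnc_unixsrc/vncviewer/vncviewer -shared STDIO:0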
	

As simple as this little mechanism is, we see a number of installations that stumble along without ever settling such matters. In these too-common places, users are largely left to their own devices to configure remote displays -- then redo the same work the next time there's a need. Notice, by the way, that the example immediately above works through NAT. This illustrates the principle that careful use of standard components, including VNC, socat, and so on, allows for standard solutions that apply in the situations that arise in the UCM labs.
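One detail of the mechanism that is easy to get wrong is the authorized_keys entry: the options, key type, and key must all sit on a single line. The sketch below assembles such an entry from its parts. The helper authorized_keys_entry is a hypothetical name, and the key text in the usage example is a placeholder, not a real key.

```python
# Restrictions used in the entry shown in Listing 3.
RESTRICTIONS = ("no-port-forwarding", "no-X11-forwarding", "no-agent-forwarding")


def authorized_keys_entry(forced_command, key_type, key_blob, comment=""):
    """Build a one-line authorized_keys entry that pins a key to one command."""
    if '"' in forced_command or "\n" in forced_command:
        # Keep the sketch simple: reject input that would need escaping.
        raise ValueError("forced command must not contain quotes or newlines")
    options = ",".join(('command="%s"' % forced_command,) + RESTRICTIONS)
    fields = [options, key_type, key_blob]
    if comment:
        fields.append(comment)
    return " ".join(fields)


if __name__ == "__main__":
    # "PLACEHOLDER_KEY" stands in for the base64 public key text.
    print(authorized_keys_entry("exec bin/remote_display", "ssh-dss",
                                "PLACEHOLDER_KEY", "display-forwarding"))
```

Because the key is tied to a forced command and stripped of forwarding rights, anyone holding the matching private key can do only one thing on the projector host: start the viewer.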

What's next? There's plenty of work left to do. The facilities for session recording remain incomplete. We're exploring ways to accelerate display of remote sites, and building tools to simplify native use of distributed resources. Security is, as always, a never-ending challenge. Quite a few small tuning opportunities also remain. Among the projects we might tackle are further reductions of boot time, use of panning or scaling on the projectors to match the potential 1920x1200 resolution afforded by the workstation monitors, acceleration or replacement of VNC with NX or other technologies, integration of VoIP functions, innovations in collaborative processes, and automation of outbound VPN to peer sites.



Larger consequences for science and engineering

Most exciting of all are two qualitative distinctions that are more academic than technical. First, usability helps bring students' focus back to content, rather than computing technique. A simple, uniform, and flexible computing environment eliminates the former premium on memorization of esoteric command-line or "wizard" sequences. Students can concentrate on their work, without such distractions as malware-infected hosts, broken parts, authentication or synchronization misconfigurations, and all the ills now evoked by the jargon word "silo": value locked up in a particular machine or process, unable to communicate across boundaries. Emphasis on simple, reliable approaches makes for a significantly different end-user experience.

Furthermore, appropriate commoditization of computing environments and thoughtful reliance on open source products make it easier for scientists and engineers to calculate the results they're after in ways that others can reproduce and monitor. At UCM and elsewhere, we're starting to see such payoffs: scientists who think it natural to share and publish not just their theories, commentaries, or data, but also the computing processes and displays that yield those results.

Many of the same principles apply, of course, to more conventional office automation environments. Think, for example, of a national-scale insurance company or bank. Of course it needs to keep most of its data strictly confidential, and has no interest in publishing them on the Internet at large. However, use of simple, standard components promotes a couple of major goals that most organizations share:

  • Ease-of-use reduces training time and eliminates critical paths. When a desktop farm like the one described here supports agents or front-line workers, it becomes easier for employees to move work among their desks. New employees become productive more quickly, and, when a power supply goes bad or an employee takes a vacation, it's much easier to swap in a new computer or co-worker.

  • Computing transparency promotes the kind of auditability and security that Sarbanes-Oxley and other business laws have mandated for companies. Simple processes are easier to understand, and harder to defraud.

Whether in commercial or academic settings, then, "utility computing 2.0" has unexpected potential to make computers more responsive and convenient servants. Developers who see the techniques we deploy consistently find them almost trivial -- yet, in the right combinations, they make for large payoffs in usability, security, and reliability.



Disclaimer

Several details of our own work are not yet available for open publication, and in a few cases above we've streamlined our explanation of tangential details. We presented, for instance, the hosts as standard Mini-Box units; in fact, each one has a single standard BIOS reconfiguration to enable netbooting. As we continue to improve, though, our techniques should become more and more transparent.

Our special thanks to German Gavilan, Assistant Dean of the School of Engineering of UCM.



Resources

Learn
  • Recent publicity on power consumption focuses on the server room, where requirements are much different from common desktop use cases.

  • Recent press on extra-large monitors like the 24-inch ones in the UCM labs extols their advantages.

  • The rdp2vnc gateway between proprietary RDP and VNC is maintained as part of the LibVNCServer project hosted at SourceForge.

  • "Remote Network Boot via PXE" is a brief introduction to PXE and related topics.

  • tmpfs is a file system based on virtual memory.

  • Unionfs is a file system that logically merges and manages physically-separate mass storage, and which provides for "union mounts." In particular, it allows for fine control of read-only and read-write segments. See also the Wikipedia article on Unionfs.

  • NX uses differential compression and proxy service to accelerate VNC service and adapt it to domains where VNC has been a poor fit in the past.

  • So-called "portable applications" constitute another approach to simplicity and standardization that you might want to evaluate for your utility computing toolbox.

  • "An Introduction to X11 User Interfaces", which Grant Edwards wrote over a decade ago, remains pertinent and almost entirely accurate. Our remoting technique works entirely at the VNC level, and doesn't depend on X in any direct way; however, X is the basis for several competing approaches to remoting, so it's helpful to understand it clearly. (The X Window System is the official name of the graphical toolkit that underlies nearly all Linux and UNIX® desktops.)

  • "Connect securely with ssh" (developerWorks, July 2003) reviews the possibilities for remote connection. Ssh is a flexible protocol often used to build "secure tunnels." VNC tunnelled through ssh is probably the single most common remoting technology for Linux.

  • In the developerWorks Linux zone, find more resources for Linux developers.

  • Stay current with developerWorks technical events and Webcasts.

Get products and technologies
  • Order the SEK for Linux, a two-DVD set containing the latest IBM trial software for Linux from DB2®, Lotus®, Rational®, Tivoli®, and WebSphere®.

  • With IBM trial software, available for download directly from developerWorks, build your next development project on Linux.



About the authors

Kyler Laird is a research analyst for the University of California at Merced, among other roles. His volunteer activities have resulted in several public resources, including ones focused on general aviation and animal rescue.


Cameron Laird is a long-time developerWorks contributor and former columnist. He often writes about the open source projects that accelerate development of his employer's applications, focused on reliability and security.





Linux is a trademark of Linus Torvalds in the United States, other countries, or both. Microsoft, Windows, Windows NT, and the Windows logo are trademarks of Microsoft Corporation in the United States, other countries, or both. UNIX is a registered trademark of The Open Group in the United States and other countries. DB2, Lotus, Rational, Tivoli, and WebSphere are trademarks of IBM Corporation in the United States, other countries, or both. Other company, product, or service names may be trademarks or service marks of others.