Migrating from x86 to PowerPC, Part 5: Create a Kuro-based Web album

... it's much smaller than a slide projector

Lewin Edwards (sysadm@zws.com), Author, Freelance
Lewin A.R.W. Edwards works for a Fortune 50 company as a wireless security/fire safety device design engineer. Prior to that, he spent five years developing x86, ARM and PA-RISC-based networked multimedia appliances at Digi-Frame Inc. He has extensive experience in encryption and security software and is the author of two books on embedded systems development. He can be reached at sysadm@zws.com.

Summary:  In the fifth article of the Migrating from x86 to PowerPC series, Lewin Edwards shows how to get a photo album running on the Kuro Box. In the process, he covers embedded systems design goals and Web server security, and shows off a few handy tricks for off-loading processing work to the client system.

Date:  17 May 2005
Level:  Introductory

The previous articles introduced the basic programming techniques and Kuro Box-specific installation details required to build an applet that communicates with a remote user through the Kuro Box's Web server, using the CGI interface.

Now you'll put all this knowledge to work by building a Web-based application to browse and manage images on the Kuro Box's hard disk. The relevance of this to autonomous underwater vehicles will become apparent in the next article, but for the time being, just think "Webcam."

Before embarking on this journey, I need to stress the following point: We're talking about embedded applications here. This fact underlies and colors every aspect of the software designs I'm discussing. If you've got an IT background -- if, for example, you're a professional Web developer -- you'll doubtless find yourself thinking that I'm using unnecessarily primitive technology to achieve my goals. Bear with me. Using well-understood, lowest-common-denominator techniques is an important part of conservative embedded engineering. Although the Kuro Box is capable of running much more advanced solutions, there's no good reason to occupy precious system resources with Web interface sugar.

I'll begin by defining the desired feature specifications of the Web interface this article teaches you to build. It's a pretty simple application:

  1. The user will be required to log in with a username and password before accessing any files or options.
  2. After login, the user should be able to view a directory of recognized image files on the Kuro Box hard disk and browse this directory in a scrollable pane. A static area in the browser window is reserved for status info and global commands and will be implemented in a future version of the application.
  3. After selecting an image, the user should be able to view a thumbnail with some general information and zoom in on the full-size picture if desired.

Transactions and states

Now, a concept that might be rather foreign to you if you haven't developed Web applications before is the inherent statelessness of HTTP transactions. Essentially, each request and response conversation between the client (Web browser) and server (Kuro Box) has to stand alone. There is no implicit continuity between one transaction and the next.

Most user interfaces you interact with are highly stateful. If you'll bear with me through an admittedly rather forced analogy, consider your car's ignition switch. Your car has only one switch, its possible states follow a known order, and the state of this switch is directly linked to the state of the car's electronics. It would be a surprising thing (though not totally unthinkable in this century of Bluetooth ignition keys) if your neighbor turning the key in his car's ignition were to start your car's engine. Furthermore, if you turn the key to "start," your car can safely assume that the switch was previously in the "run" position, because there is no way of going from "accessory" to "start" without first traversing the "run" state. (This is a fairly important fact in the case of the car, by the way, because when you go to "run," the fuel pump starts to spin, in order to pressurize the fuel line. I'm sure you can think of a number of other common user interface actions in everyday life where the previous state of a control is just as important as its new state.)

A Web-based interface doesn't work like this at all. To continue the ignition switch analogy, a Web interface presents a potentially infinite number of ignition switches to the outside world, and none of them are inherently uniquely identified. When someone operates a switch, all that the Web server is told is that some switch, somewhere, changed into a given state. The previous state of the switch is unknown, too.


Managing state

Some of the techniques for managing this problem are quite sophisticated and complex. The simplest and least resource-intensive method is to design your Web application so that it doesn't require any state information to function properly. An example of this would be a slide show program where the only controls available to the user are "next slide," "previous slide," and "start over." In a system like this, all the information the server needs in order to generate a response to a button press is implicit in the current page location and the identity of the link or button the user chose to click.
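To make the slide show example concrete, here is a minimal sketch in Python (chosen for brevity; it is not taken from the article's source tarball). The script name slides.cgi, the function names, and the URL layout are all hypothetical. The point is only that the slide to display can be computed from the request alone, with nothing remembered on the server.

```python
def next_slide(current: int, action: str, total: int) -> int:
    """Return the index of the slide to show, given only the request itself."""
    if action == "next":
        return min(current + 1, total - 1)   # stay on the last slide
    if action == "prev":
        return max(current - 1, 0)           # stay on the first slide
    if action == "start":
        return 0
    raise ValueError(f"unknown action: {action!r}")


def slide_links(current: int) -> str:
    """Build the three control links; each URL carries everything the server needs."""
    # slides.cgi is a hypothetical script name used only for illustration.
    return " ".join(
        f'<a href="slides.cgi?{action}+{current}">{label}</a>'
        for action, label in (("prev", "previous slide"),
                              ("next", "next slide"),
                              ("start", "start over"))
    )
```

Because every link names both the action and the slide it was issued from, two users clicking at the same time cannot interfere with each other, and the server has nothing to clean up afterwards.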

This method is convenient from the viewpoint of programming simplicity, and it's almost universally employed in embedded appliances (NAS appliances like Kuro Box, for instance, and many other little network-connected devices like smart Ethernet cameras, telemetry and logging appliances, WiFi access points, switches and routers). These devices are normally designed to be administered by only a single person at a time, and they generally have only one, unrestricted access level (though it is possible to bury access control data in the session by directing different logins to different sets of CGI scripts, for example). In this article, I use a method that most closely matches this category. Relevant advantages of this system include:

  • The server doesn't need to maintain any per-session information. This might not seem earth-shatteringly important on the Kuro Box, which will normally live behind a firewall and which has a nice, big hard disk, but in general you don't want to be storing volatile session data on an embedded appliance unless you have a REALLY good reason. All it takes is one attack and your disk, flash, or RAM will fill up with stale sessions.
  • As a corollary to the above, there is no need for server-side garbage collection (cleaning up stale session information) and equally no chance that a client connecting to the device will see someone else's old session data.

However, the technique above isn't always usable, so you should be aware of some other options. A more flexible way to deal with the limitations of vanilla HTTP is to use the Web browser itself to keep track of the session state. I'll classify the two broad ways you can do this as either the client-centric or the server-centric method.

Client-centric state management

In a client-centric model, the session data is stored on the client (browser) computer, and the entire state, or a relevant subset of it, is communicated to the server during each form submission or other action. This is commonly accomplished by using browser cookies to store data on the client computer, often with some JavaScript to read or update the cookie data before it goes back to the server, but as you may recall from the previous article, I don't approve of unnecessary active scripting in embedded appliances.

Another way of doing the same thing is by "reflecting" the data through the server and keeping it hidden in an invisible section of every page returned to the client. For example, if you were accessing a security appliance with four cameras on it, you might start at a page that lets you select which camera to use, and then you'd go to a response page with invisible FORM data in it that would let the server know which camera you were working with. Every time you clicked to perform some action on this response page (say, "pan left"), you'd be submitting that hidden field along with everything else on the page -- so the server would actually see a command like "pan left, camera 4."
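The camera example can be sketched as follows. This is an illustration under stated assumptions, not code from any real appliance: the script name camera.cgi, the field names camera and action, and both function names are invented for the example.

```python
from html import escape
from urllib.parse import parse_qs


def camera_page(camera: str) -> str:
    """Return a response page that carries the chosen camera in a hidden field."""
    safe = escape(camera, quote=True)
    return (
        '<form action="camera.cgi" method="get">\n'      # hypothetical script name
        f'<input type="hidden" name="camera" value="{safe}">\n'
        '<input type="submit" name="action" value="pan left">\n'
        '<input type="submit" name="action" value="pan right">\n'
        '</form>'
    )


def handle(query_string: str) -> str:
    """Recover both the action and the reflected camera number from one request."""
    fields = parse_qs(query_string)
    camera = fields["camera"][0]
    action = fields["action"][0]
    return f"{action}, camera {camera}"
```

Clicking "pan left" on the page built by camera_page("4") would submit a query string such as camera=4&action=pan+left, which handle() turns back into the full command. Note that nothing here stops a user from editing the hidden value by hand.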

Both of these can work, but they're not entirely secure. Server-centric systems give you more control over security.

The danger of client-centric state

Client-centric state management is generally fairly painless for the developer, but it's not always safe.

Both of the methods described suffer from a nasty security flaw: as described, they can't authenticate the session data the client supplies. Anyone who can log in to your device and set up the correct session data in his or her browser can fake a transaction to do practically anything on your widget. As an amusing example of this, some time ago Microsoft® had an Active Server page on its Web site that would take an appropriately formatted plain text tech note and add all the right HTML formatting to make it match the rest of microsoft.com. Someone published a fake tech note lambasting Windows and generated a URL that used the aforementioned script to make this note look like an official Microsoft statement. The cream of this particular prank was that since the script resided on microsoft.com, the URL appeared to be a totally legitimate statement on Microsoft's own Web site.

Server-centric state management

You can achieve server-centric session management in several ways, but probably the easiest is to keep track of sessions by the IP address of the client. This system works moderately well in very small network situations, particularly when you can be sure that everyone who accesses your interface can in fact be uniquely identified by this method. Many consumer devices, such as small routers, use this system as a "gate" to make sure that only one person at a time tries to alter the device's configuration. The way it is generally implemented is that once you log in successfully, the device binds itself to your IP address and starts an internal countdown timer of a few minutes' duration. The timer resets to its default value every time you interact with the Web server. If anyone else tries to access the device before the inactivity timer expires, they'll see a "this device is being administered by a.b.c.d" sort of error message, where a.b.c.d is the IP address of the successfully logged-in user.
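A minimal sketch of such a gate might look like the following. The class name, method names, and the five-minute default are assumptions made for illustration, and the current time is passed in explicitly so the behavior is easy to follow; a real device would read its own clock.

```python
class AdminGate:
    """Allow one administrator at a time, identified only by IP address."""

    def __init__(self, timeout_seconds: float = 300.0):
        self.timeout = timeout_seconds
        self.owner = None        # IP address currently bound to the device
        self.expires = 0.0       # time at which the binding lapses

    def login(self, ip: str, now: float) -> bool:
        """Call after the password check succeeds; bind the gate to this IP."""
        if self.owner is not None and now < self.expires and ip != self.owner:
            return False                       # someone else holds the gate
        self.owner = ip
        self.expires = now + self.timeout
        return True

    def check(self, ip: str, now: float) -> str:
        """Decide what to do with a request arriving from this IP."""
        if self.owner is None or now >= self.expires:
            self.owner = None                  # binding lapsed; require login
            return "please log in"
        if ip != self.owner:
            return f"this device is being administered by {self.owner}"
        self.expires = now + self.timeout      # activity resets the timer
        return "ok"
```

Only requests from the bound address reset the timer, so a second user who keeps retrying cannot hold the first user's session open.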

You shouldn't use this sort of interface on a device that expects to be connected to the real Internet, for a variety of reasons -- but the main reason is simply that an IP address alone is insufficient information to identify a computer on the Internet uniquely these days. For example, every computer behind a router that implements NAT (that is, practically every el-cheapo consumer router sold) will appear to have the IP address of the router itself. This opens a nasty security hole -- if your widget is on the Internet somewhere, and you log into it from work (behind a NAT box), anyone else on your work LAN would be able to access that widget during the inactivity timeout window without needing a password.

A more powerful, and certainly more reliable, server-centric technique is to have the session state (whatever that might comprise for your particular application) stored in RAM or on disk server-side, and reference it by means of an otherwise meaningless token (typically a large random number) that's assigned whenever a user logs on. The token can be buried in the server responses and reflected back and forth using hidden form tags, as described above. Session data would normally be cleaned up by an explicit logout request from the user, or an inactivity timeout.

The crucial difference between this method and the client-side reflection system described above is that an attacker cannot know what privileges (if any) might belong to a given token at any moment -- so he or she can't make up tokens to spoof transactions to appear as if they come from authorized users.
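The token scheme can be sketched like this. It is an illustration only: the class and function names, the 128-bit token size, and the ten-minute timeout are assumptions, and the store lives in a plain in-memory dictionary rather than on disk.

```python
import secrets


class SessionStore:
    """Server-side session state, referenced by an unguessable random token."""

    def __init__(self, timeout_seconds: float = 600.0):
        self.timeout = timeout_seconds
        self.sessions = {}       # token -> (expiry time, state dictionary)

    def login(self, username: str, now: float) -> str:
        """Create a session after a successful login and return its token."""
        token = secrets.token_hex(16)          # 128 random bits as 32 hex digits
        self.sessions[token] = (now + self.timeout, {"user": username})
        return token

    def lookup(self, token: str, now: float):
        """Return the session state, or None if the token is unknown or stale."""
        entry = self.sessions.get(token)
        if entry is None:
            return None
        expires, state = entry
        if now >= expires:
            del self.sessions[token]           # inactivity timeout cleanup
            return None
        self.sessions[token] = (now + self.timeout, state)
        return state

    def logout(self, token: str) -> None:
        """Explicit logout: discard the session if it exists."""
        self.sessions.pop(token, None)


def hidden_token_field(token: str) -> str:
    """The hidden form tag that reflects the token back on each submission."""
    return f'<input type="hidden" name="token" value="{token}">'
```

The token itself says nothing about the user or the user's privileges; all of that stays in the server's table, which is why a made-up token gets an attacker nowhere.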


Implementation

Now for the particulars of the implementation: Beginning with the first design goal, how can you perform username and password authentication and ensure that unauthorized users can't log in? The easiest method is simply to use the Web server's built-in directory protection mechanism. This is the way Buffalo chose to implement password protection on the Kuro Box Web-based administration pages. If you look in the /www and /www/cgi-bin directories (assuming you still have the standard Buffalo pages on your hard disk), you'll find a file called .htpasswd. This is an authentication username and password database in the same general format as the standard /etc/passwd file we all know and love.

You can generate a new username and password pair using the htpasswd utility. You can view quite comprehensive on-line help by simply running htpasswd with no parameters. You'll need to generate two .htpasswd files, each containing the same username and password pair. Supposing you want the username to be "kuro" with password "password," you would use the following commands:


Listing 1. Creating the .htpasswd files
                
htpasswd -cb /www/.htpasswd kuro password
htpasswd -cb /www/cgi-bin/.htpasswd kuro password

You need password control files in both locations because otherwise someone who knows (or guesses) your file system structure might bypass the protection at the top level by calling the CGI scripts directly.

Note: The default behavior of thttpd is to look for .htpasswd at the same directory level as the file being requested. This allows you to set up different username and password pairs for different directories. If you just want a single global password file for all directories of your Web server, use the -g parameter when launching thttpd.

Now you need to build a tiny bit of HTML infrastructure to format the output of your CGI script. I want to break up the browser output into a standard three-pane layout: a bar across the top for status and global commands, and in the remaining area, a vertical, scrollable directory list and a larger area on the right for information about a file selected in the left-hand pane.

You can easily accomplish this with a little frame code:


Listing 2. A little frame code
                
<html>
<frameset rows="64,99%">
  <frame noresize="noresize" scrolling="no" src="cgi-bin/album.cgi?status" name="status">
  <frameset cols="15%,85%">
    <frame noresize="noresize" scrolling="yes" src="cgi-bin/album.cgi?dir" name="dir">
    <frame src="blank.htm" name="file">
  </frameset>
</frameset>
</html>

(You'll find this frameset, in the index.htm file, and the blank.htm file it references in the article5/html directory after unpacking the source tarball from the Download section.) If you move those two files into /www, you should be able to log in and see the outline of the frames -- at the same time, you can test that the password you configured earlier is being recognized properly.

By the way, I specified the frame dimensions in the way you see above (a hybrid of fixed pixel size and window percentages) because you'd usually want to put some nice graphics in the status bar; I arbitrarily gave you 64 pixels to play with because that seems to be a nice size for icons. Because I'm no artist, I've stuck with regular text links.

Now you can build the CGI binary, album.cgi, and install it by moving it into /www/cgi-bin. I won't inline the source code here, but I'll go over the highlights of how it operates (and why).


How album.cgi works

The album.cgi program performs several different tasks according to the first argument passed to it on the command line (that is, the first word appearing after the question mark in a query-type URL such as http://192.168.0.7/cgi-bin/album.cgi?query). Those tasks are as follows:

If the query is "status," album.cgi will generate the HTML code for the status or action bar intended for display in the top pane of the frameset described above. At the moment, this bar only contains two links -- "home," which refreshes the directory pane, and "clear," which resets the main pane to a blank default page.

If the query is "dir," the CGI program will display a directory of the files within the directory /www/images whose names contain .jpg or .bmp (note that this is case-sensitive). The directory listing will be HTML-encapsulated so that clicking any file name will open an info query (see below) on that particular file in the main pane of the enclosing frameset. Please note that the enumeration code is quite primitive -- for example, it doesn't let you browse into subdirectories. This functionality could quite easily be added by allowing the "dir" query to take a second parameter, telling it which directory to read.

Finally, if the query is "info," album.cgi expects the next command-line parameter to be the file name. (For security reasons, album.cgi won't accept file names that contain a forward slash character.) It prepares a little info page describing the image's physical path on disk and size in bytes, with a thumbnail included for good measure. Clicking the thumbnail opens the full-size image in a new window.
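The three queries can be sketched roughly as follows. This is not the album.cgi source, which is a compiled binary shipped in the tarball; it is a Python illustration of the behavior described above. The function names, the exact link format (info+filename), the error messages, and the target of the "clear" link are all assumptions.

```python
from html import escape
from urllib.parse import quote

IMAGE_DIR = "/www/images"


def is_image_name(name: str) -> bool:
    """Case-sensitive match, so PHOTO.JPG is not recognized."""
    return ".jpg" in name or ".bmp" in name


def is_safe_name(name: str) -> bool:
    """Reject any name that could step outside the image directory."""
    return "/" not in name


def dir_page(names) -> str:
    """One link per recognized image, opening an info query in the main pane."""
    return "\n".join(
        f'<a href="album.cgi?info+{quote(n)}" target="file">{escape(n)}</a><br>'
        for n in sorted(names) if is_image_name(n)
    )


def dispatch(args, names) -> str:
    """args: the words after '?', as the Web server passes them on the command line."""
    if not args:
        return "<p>No query given.</p>"
    query = args[0]
    if query == "status":
        # Assumed link targets; "file" and "dir" are the frame names in Listing 2.
        return ('<a href="album.cgi?dir" target="dir">home</a> '
                '<a href="../blank.htm" target="file">clear</a>')
    if query == "dir":
        return dir_page(names)
    if query == "info":
        if len(args) < 2 or not is_safe_name(args[1]):
            return "<p>Invalid file name.</p>"
        return f"<p>{escape(IMAGE_DIR + '/' + args[1])}</p>"
    return "<p>Unknown query.</p>"
```

A real script would call dispatch() with the program's command-line arguments and the contents of the image directory, then print a Content-type header followed by the returned HTML.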

The way I present thumbnail images is a calculated hack, by the way -- I just use the browser to scale the displayed copy in a Procrustean fashion at a forced size of 320x240. In a real photo browser application, you'd probably want to reduce network traffic by storing prescaled thumbnails on the hard disk and transferring those smaller versions. However, that could require quite a lot of back-end processing power, not to mention storage space. I happen to know that this particular application is never normally going to work with images larger than 352x240 pixels in size, so this seemed like a reasonable limitation.
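The browser-side scaling trick amounts to nothing more than forcing the width and height attributes on the image tag. Here is a sketch of what the info page markup could look like; the function name, the /images URL prefix, and the page layout are assumptions, not a copy of what album.cgi emits.

```python
from html import escape
from urllib.parse import quote


def info_page(name: str, size_bytes: int,
              image_dir: str = "/www/images", url_prefix: str = "/images") -> str:
    """Describe one image and let the browser scale it to a 320x240 thumbnail."""
    url = f"{url_prefix}/{quote(name)}"        # assumed URL of the full-size file
    return (
        f"<p>{escape(image_dir)}/{escape(name)} ({size_bytes} bytes)</p>\n"
        f'<a href="{url}" target="_blank">'    # full-size image in a new window
        f'<img src="{url}" width="320" height="240" alt="{escape(name, quote=True)}">'
        "</a>"
    )
```

The thumbnail and the full-size link point at the same file, so the server does no image processing at all; the cost is that the whole image crosses the network even for the small view.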

Now would be a good time to dump a few BMP or JPEG images into the /www/images directory (remember to lowercase the extensions on those file names) and experiment with the CGI script. Congratulations -- you've now got a "real," if rather primitive, embedded application running on the Web server!


Looking ahead

The next article builds heavily on this image browser application to demonstrate some image capture and analysis techniques using an off-the-shelf USB Webcam based on the STV0680 camera chip. The code in that article comes more or less directly from the E-2 submarine project that started this series, with the only major difference being that E-2 doesn't normally use a Web server. It will examine camera-based motion detection and image edge location, the basis of machine vision applications.

For those of you who are chomping at the bit to start building hardware, the seventh article in the series will introduce the AVR-based hardware platform that will connect your Kuro Box to sensors, motors, and other goodies. As always, comments and suggestions are welcome in the forum.



Download

Description   Name                     Size    Download method
Source code   pa-migrate5code.tar.gz   2.2KB   HTTP
