The previous articles introduced the basic programming techniques and Kuro Box-specific installation details required to build an applet that communicates with a remote user through the Kuro Box's Web server, using the CGI interface.
Now you'll put all this knowledge to work by building a Web-based application to browse and manage images on the Kuro Box's hard disk. The relevance of this to autonomous underwater vehicles will become apparent in the next article, but for the time being, just think "Webcam."
Before embarking on this journey, I need to stress the following point: We're talking about embedded applications here. This fact underlies and colors every aspect of the software designs I'm discussing. If you've got an IT background -- if, for example, you're a professional Web developer -- you'll doubtless find yourself thinking that I'm using unnecessarily primitive technology to achieve my goals. Bear with me. Using well-understood, lowest-common-denominator techniques is an important part of conservative embedded engineering. Although the Kuro Box is capable of running much more advanced solutions, there's no good reason to occupy precious system resources with Web interface sugar.
I'll begin by defining the desired feature specifications of the Web interface this article teaches you to build. It's a pretty simple application:
- The user will be required to log in with a username and password before accessing any files or options.
- After login, the user should be able to view a directory of recognized image files on the Kuro Box hard disk and browse this directory in a scrollable pane. A static area in the browser window is reserved for status info and global commands and will be implemented in a future version of the application.
- After selecting an image, the user should be able to view a thumbnail with some general information and zoom in on the full-size picture if desired.
Now, a concept that might be rather foreign to you if you haven't developed Web applications before is the inherent statelessness of HTTP transactions. Essentially, each request and response conversation between the client (Web browser) and server (Kuro Box) has to stand alone. There is no implicit continuity between one transaction and the next.
Most user interfaces you interact with are highly stateful. If you'll bear with me through an admittedly rather forced analogy, consider your car's ignition switch. Your car has only one switch, its possible states follow a known order, and the state of this switch is directly linked to the state of the car's electronics. It would be a surprising thing (though not totally unthinkable in this century of Bluetooth ignition keys) if your neighbor turning the key in his car's ignition were to start your car's engine. Furthermore, if you turn the key to "start," your car can safely assume that the switch was previously in the "run" position, because there is no way of going from "accessory" to "start" without first traversing the "run" state. (This is a fairly important fact in the case of the car, by the way, because when you go to "run," the fuel pump starts to spin in order to pressurize the fuel line. I'm sure you can think of a number of other common user interface actions in everyday life where the previous state of a control is just as important as its new state.)
A Web-based interface doesn't work like this at all. To continue the ignition switch analogy, a Web interface presents a potentially infinite number of ignition switches to the outside world, and none of them are inherently uniquely identified. When someone operates a switch, all that the Web server is told is that some switch, somewhere, changed into a given state. The previous state of the switch is unknown, too.
Some of the techniques for managing this problem are quite sophisticated and complex. The simplest and least resource-intensive method is to design your Web application so that it doesn't require any state information to function properly. An example of this would be a slide show program where the only controls available to the user are "next slide," "previous slide," and "start over." In a system like this, all the information the server needs in order to generate a response to a button press is implicit in the current page location and the identity of the link or button the user chose to click.
This method is convenient from the viewpoint of programming simplicity, and it's almost universally employed in embedded appliances (NAS appliances like Kuro Box, for instance, and many other little network-connected devices like smart Ethernet cameras, telemetry and logging appliances, WiFi access points, switches and routers). These devices are normally designed to be administered by only a single person at a time, and they generally have only one, unrestricted access level (though it is possible to bury access control data in the session by directing different logins to different sets of CGI scripts, for example). In this article, I use a method that most closely matches this category. Relevant advantages of this system include:
- The server doesn't need to maintain any per-session information. This might not seem earth-shatteringly important on the Kuro Box, which will normally live behind a firewall and which has a nice, big hard disk, but in general you don't want to be storing volatile session data on an embedded appliance unless you have a REALLY good reason. All it takes is one attack and your disk, flash, or RAM will fill up with stale sessions.
- As a corollary to the above, there is no need for server-side garbage collection (cleaning up stale session information) and equally no chance that a client connecting to the device will see someone else's old session data.
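To make the slide show example concrete, here is a minimal sketch of a stateless handler, written in Python purely for brevity. The script name slides.cgi, the SLIDES list, and the /slides/ path are all hypothetical. The point is only that the slide number travels in the URL, so the server can build every response from the request alone.

```python
# Hypothetical slide list; a real appliance would enumerate files on disk.
SLIDES = ["intro.jpg", "hull.jpg", "motors.jpg"]


def render_slide(query):
    """Build the page for slide number `query` using nothing but the request."""
    try:
        index = int(query)
    except ValueError:
        index = 0  # malformed or empty query: start over
    index = max(0, min(index, len(SLIDES) - 1))  # clamp to the valid range

    prev_index = max(index - 1, 0)
    next_index = min(index + 1, len(SLIDES) - 1)
    # Every link carries the complete "state" (the target slide number).
    return (
        f'<img src="/slides/{SLIDES[index]}">\n'
        f'<a href="slides.cgi?{prev_index}">previous slide</a>\n'
        f'<a href="slides.cgi?{next_index}">next slide</a>\n'
        f'<a href="slides.cgi?0">start over</a>\n'
    )
```

Because nothing is remembered between requests, two people can click through the same slides at once without interfering with each other, and there is nothing to clean up afterwards.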
However, the technique above isn't always usable, so you should be aware of some other options. A more flexible way to deal with the limitations of vanilla HTTP is to keep track of the session state explicitly. I'll classify the two broad ways you can do this as the client-centric and the server-centric methods.
Client-centric state management
In a client-centric model, the session data is stored on the client (browser) computer, and the entire state, or a relevant subset of it, is communicated to the server during each form submission or other action. This is commonly accomplished by using browser cookies to store data on the client computer; the browser returns cookie data to the server with each request, and many sites add some JavaScript to read or update that data on the client. As you may recall from the previous article, though, I don't approve of unnecessary active scripting in embedded appliances.
Another way of doing the same thing is by "reflecting" the data through the server and keeping it hidden in an invisible section of every page returned to the client. For example, if you were accessing a security appliance with four cameras on it, you might start at a page that lets you select which camera to use, and then you'd go to a response page with invisible FORM data in it that would let the server know which camera you were working with. Every time you clicked to perform some action on this response page (say, "pan left"), you'd be submitting that hidden field along with everything else on the page -- so the server would actually see a command like "pan left, camera 4."
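Both of these can work, but they're not entirely secure: because the state lives on the client, a user can alter a cookie or hidden field before sending it back. Server-centric systems give you more control over security.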
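The hidden-field reflection in the camera example can be sketched as follows. This is an illustration in Python, not code from any real appliance: the script name camera.cgi and the field names camera and action are assumptions made up for the example.

```python
from html import escape
from urllib.parse import parse_qs


def camera_page(camera):
    """Return a control page that carries the selected camera in a hidden field."""
    return (
        '<form action="camera.cgi" method="get">\n'
        f'<input type="hidden" name="camera" value="{escape(camera, quote=True)}">\n'
        '<input type="submit" name="action" value="pan left">\n'
        '<input type="submit" name="action" value="pan right">\n'
        '</form>\n'
    )


def parse_command(query_string):
    """Recover (action, camera) from the submitted form data."""
    fields = parse_qs(query_string)
    action = fields.get("action", [""])[0]
    camera = fields.get("camera", [""])[0]
    return action, camera
```

Clicking "pan left" on the page for camera 4 would submit a query string like camera=4&action=pan+left, which parse_command turns back into ("pan left", "4"). Since the user can edit that query string by hand, the server has to validate the camera value on every request rather than trust it.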
Server-centric state management
You can achieve server-centric session management in several ways, but probably the easiest is to keep track of sessions by the IP address of the client. This system works moderately well in very small network situations, particularly when you can be sure that everyone who accesses your interface can in fact be uniquely identified by this method. Many consumer devices, such as small routers, use this system as a "gate" to make sure that only one person at a time tries to alter the device's configuration. The way it is generally implemented is that once you log in successfully, the device binds itself to your IP address and starts an internal countdown timer of a few minutes' duration. The timer resets to its default value every time you interact with the Web server. If anyone else tries to access the device before the inactivity timer expires, they'll see a "this device is being administered by a.b.c.d" sort of error message, where a.b.c.d is the IP address of the successfully logged-in user.
You shouldn't use this sort of interface on a device that expects to be connected to the real Internet, for a variety of reasons -- but the main reason is simply that an IP address alone is insufficient information to identify a computer on the Internet uniquely these days. For example, every computer behind a router that implements NAT (that is, practically every el-cheapo consumer router sold) will appear to have the IP address of the router itself. This opens a nasty security hole -- if your widget is on the Internet somewhere, and you log into it from work (behind a NAT box), anyone else on your work LAN would be able to access that widget during the inactivity timeout window without needing a password.
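For a small, trusted LAN only, the IP-address gate described above might look something like this sketch. The class name AdminGate, the five-minute default, and the message strings are my own assumptions, and the password check itself is assumed to happen elsewhere before login() is called.

```python
import time


class AdminGate:
    """Allow one client IP at a time; release it after a period of inactivity."""

    def __init__(self, timeout_seconds=300.0, clock=time.monotonic):
        self.timeout = timeout_seconds
        self.clock = clock      # injectable so the timer can be tested
        self.owner = None       # IP address currently bound, if any
        self.expires_at = 0.0

    def _free(self):
        """True if nobody holds the gate or the holder's timer has run out."""
        return self.owner is None or self.clock() >= self.expires_at

    def login(self, ip):
        """Bind the gate to `ip` (call only after the password was accepted)."""
        if self._free() or ip == self.owner:
            self.owner = ip
            self.expires_at = self.clock() + self.timeout
            return True
        return False

    def request(self, ip):
        """Call on every request: resets the owner's timer, refuses everyone else."""
        if self._free():
            return "Please log in"
        if ip == self.owner:
            self.expires_at = self.clock() + self.timeout
            return "OK"
        return f"This device is being administered by {self.owner}"
```

Note that the gate compares nothing but addresses, which is exactly the weakness described above: every machine behind the same NAT router presents the same address and would be treated as the logged-in user.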
A more powerful, and certainly more reliable, server-centric technique is to store the session state (whatever that might comprise for your particular application) in RAM or on disk server-side, and reference it by means of an otherwise meaningless token (typically a large random number) that's assigned whenever a user logs on. The token can be buried in the server responses and reflected back and forth using hidden form fields, as described above. Session data would normally be cleaned up by an explicit logout request from the user, or an inactivity timeout.
The crucial difference between this method and the client-side reflection system described above is that an attacker cannot know what privileges (if any) might belong to a given token at any moment -- so he or she can't make up tokens to spoof transactions to appear as if they come from authorized users.
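Here is a minimal sketch of the token approach, again in Python and again only an illustration. The SessionStore class, its method names, and the ten-minute timeout are assumptions; the random token comes from Python's standard secrets module.

```python
import secrets
import time


class SessionStore:
    """Server-side session state, referenced by an otherwise meaningless token."""

    def __init__(self, timeout_seconds=600.0, clock=time.monotonic):
        self.timeout = timeout_seconds
        self.clock = clock
        self._sessions = {}  # token -> (expiry time, state dict)

    def create(self, state):
        """Store `state` at login and return the token to hand to the client."""
        token = secrets.token_hex(16)  # 128 random bits as 32 hex digits
        self._sessions[token] = (self.clock() + self.timeout, state)
        return token

    def lookup(self, token):
        """Return the state for `token` and renew its timer, or None if unknown."""
        self._purge()
        entry = self._sessions.get(token)
        if entry is None:
            return None
        _, state = entry
        self._sessions[token] = (self.clock() + self.timeout, state)
        return state

    def logout(self, token):
        """Explicit logout: forget the session immediately."""
        self._sessions.pop(token, None)

    def _purge(self):
        """Inactivity timeout: drop every session whose timer has expired."""
        now = self.clock()
        stale = [t for t, (expiry, _) in self._sessions.items() if expiry <= now]
        for t in stale:
            del self._sessions[t]
```

The client only ever sees the token, for example in a hidden form field, so a guessed or invented token simply fails the lookup. On a real appliance you would probably also want to cap the number of live sessions, for the reasons given earlier about stale session data.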
Now for the particulars of the implementation: Beginning with the first design goal, how can you perform username and password authentication and ensure that unauthorized users can't log in? The easiest method is simply to use the Web server's built-in directory protection mechanism. This is the way Buffalo chose to implement password protection on the Kuro Box Web-based administration pages. If you look in the /www and /www/cgi-bin directories (assuming you still have the standard Buffalo pages on your hard disk), you'll find a file called .htpasswd. This is an authentication username and password database in the same general format as the standard /etc/passwd file we all know and love.
You can generate a new username and password pair using the htpasswd utility. You can view quite comprehensive on-line help by simply running htpasswd with no parameters. You'll need to generate two .htpasswd files, each containing the same username and password pair. Supposing you want the username to be "kuro" with password "password," you would use the following commands:
Listing 1. Creating the .htpasswd files
htpasswd -cb /www/.htpasswd kuro password
htpasswd -cb /www/cgi-bin/.htpasswd kuro password
You need password control files in both locations because otherwise someone who knows (or guesses) your file system structure might bypass the protection at the top level by calling the CGI scripts directly.
Note: The default behavior of thttpd is to look for .htpasswd at the same directory level as the file being requested. This allows you to set up different username and password pairs for different directories. If you just want a single global password file for all directories of your Web server, use the -g parameter when launching thttpd.
Now you need to build a tiny bit of HTML infrastructure to format the output of your CGI script. I want to break up the browser output into a standard three-pane layout: a bar across the top for status and global commands, and in the remaining area, a vertical, scrollable directory list and a larger area on the right for information about a file selected in the left-hand pane.
You can easily accomplish this with a little frame code:
Listing 2. A little frame code
<html>
<frameset rows="64,99%">
<frame noresize="noresize" scrolling="no" src="cgi-bin/album.cgi?status" name="status">
<frameset cols="15%,85%">
<frame noresize="noresize" scrolling="yes" src="cgi-bin/album.cgi?dir" name="dir">
<frame src="blank.htm" name="file">
</frameset>
</frameset>
</html>
(You'll find this frameset, in the file index.htm, and the blank.htm file it references in the article5/html directory after unpacking the source tarball from the Download section.) If you move those two files into /www, you should be able to log in and see the outline of the frames -- at the same time, you can test that the password you configured earlier is being recognized properly.
By the way, I specified the frame dimensions in the way you see above (a hybrid of fixed pixel size and window percentages) because you'd usually want to put some nice graphics in the status bar; I arbitrarily gave you 64 pixels to play with because that seems to be a nice size for icons. Because I'm no artist, I've stuck with regular text links.
Now you can build the CGI binary, album.cgi, and install it by moving it into /www/cgi-bin. I won't inline the source code here, but I'll go over the highlights of how it operates (and why).
The album.cgi program performs several different tasks according to the first argument passed to it on the command line (that is, the first word appearing after the question mark in a query-type URL such as http://192.168.0.7/cgi-bin/album.cgi?query). Those tasks are as follows:
- If the query is "status," album.cgi will generate the HTML code for the status or action bar intended for display in the top pane of the frameset described above. At the moment, this bar only contains two links -- "home," which refreshes the directory pane, and "clear," which resets the main pane to a blank default page.
- If the query is "dir," the CGI program will display a directory of the files within the directory /www/images whose names contain .jpg or .bmp (note that this is case-sensitive). The directory listing will be HTML-encapsulated so that clicking any file name will open an info query (see below) on that particular file in the main pane of the enclosing frameset. Please note that the enumeration code is quite primitive -- for example, it doesn't let you browse into subdirectories. This functionality could quite easily be added by allowing the "dir" query to take a second parameter, telling it which directory to read.
- Finally, if the query is "info," album.cgi expects the next command-line parameter to be the file name. (For security reasons, album.cgi won't accept file names that contain a forward slash character.) It prepares a little info page describing the image's physical path on disk and size in bytes, with a thumbnail included for good measure. Clicking the thumbnail opens the full-size image in a new window.
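The dispatch logic just described can be sketched in a few lines. To be clear, this is a Python illustration of the behavior described in the text, not the album.cgi source from the tarball; the function names and the returned strings are hypothetical stand-ins for the HTML the real program generates.

```python
import os

IMAGE_DIR = "/www/images"       # directory named in the article
EXTENSIONS = (".jpg", ".bmp")   # matched case-sensitively, as described


def is_listed(name):
    """True if the name contains one of the recognized extensions."""
    return any(ext in name for ext in EXTENSIONS)


def is_safe_name(name):
    """Reject empty names and anything containing a forward slash."""
    return bool(name) and "/" not in name


def handle(args, list_dir=os.listdir):
    """Dispatch on the first query word; return a short description of the action."""
    if not args:
        return "error: no query"
    query = args[0]
    if query == "status":
        return "status bar"
    if query == "dir":
        names = sorted(n for n in list_dir(IMAGE_DIR) if is_listed(n))
        return "directory: " + ", ".join(names)
    if query == "info":
        if len(args) < 2 or not is_safe_name(args[1]):
            return "error: bad file name"
        return f"info: {IMAGE_DIR}/{args[1]}"
    return "error: unknown query"
```

In a CGI program, the argument list would come from the command line (sys.argv[1:] in Python), so handle(["info", "a.jpg"]) corresponds to a URL ending in album.cgi?info+a.jpg. A stricter check than the one described might also refuse names that begin with a dot.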
The way I present thumbnail images is a calculated hack, by the way -- I just use the browser to scale the displayed copy in a Procrustean fashion at a forced size of 320x240. In a real photo browser application, you'd probably want to reduce network traffic by storing prescaled thumbnails on the hard disk and transferring those smaller versions. However, that could require quite a lot of back-end processing power, not to mention storage space. I happen to know that this particular application is never normally going to work with images larger than 352x240 pixels in size, so this seemed like a reasonable limitation.
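As a rough illustration of the thumbnail trick, the sketch below generates an info page in which the browser scales the full-size image to 320x240 and a click opens the original in a new window. It is a Python stand-in, not the real album.cgi output; the function name, the page layout, and the /images URL prefix (which assumes /www is the Web server's document root) are assumptions.

```python
from html import escape
from urllib.parse import quote


def info_page(name, size_bytes, image_dir="/www/images", url_prefix="/images"):
    """Return an info page whose thumbnail is the full image, scaled by the browser."""
    path = f"{image_dir}/{name}"           # physical path shown to the user
    url = f"{url_prefix}/{quote(name)}"    # URL-encode spaces and the like
    return (
        "<html><body>\n"
        f"<p>Path: {escape(path)}<br>Size: {size_bytes} bytes</p>\n"
        f'<a href="{url}" target="_blank">'
        f'<img src="{url}" width="320" height="240"></a>\n'
        "</body></html>\n"
    )
```

The caller would obtain size_bytes from the file system (os.path.getsize in Python). The same file is used for both the thumbnail and the full-size link, which is why no extra storage or server-side scaling is needed.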
Now would be a good time to dump a few BMP or JPEG images into the /www/images directory (remember to lowercase the extensions on those file names) and experiment with the CGI script. Congratulations -- you've now got a "real," if rather primitive, embedded application running on the Web server!
The next article builds heavily on this image browser application to demonstrate some image capture and analysis techniques using an off-the-shelf USB Webcam based on the STV0680 camera chip. The code in that article comes more or less directly from the E-2 submarine project that started this series, with the only major difference being that E-2 doesn't normally use a Web server. The article will examine camera-based motion detection and image edge location, the basis of machine vision applications.
For those of you who are chomping at the bit to start building hardware, the seventh article in the series will introduce the AVR-based hardware platform that will connect your Kuro Box to sensors, motors, and other goodies. As always, comments and suggestions are welcome in the forum.
| Description | Name | Size | Download method |
|---|---|---|---|
| Source code | pa-migrate5code.tar.gz | 2.2KB | HTTP |
- Participate in the discussion forum.
- Familiar with embedded development, but not with PowerPC? Familiar with PPC, but not embedded? Or familiar with both of these (or neither), but not with Linux? Migrating from x86 to PowerPC offers clear, valuable, hands-on advice to the beginner and professional programmer alike -- and is the only series on the Internet that will help you build your own remote-controlled robot submarine. Check out the other articles in this series.
- Need to hack a serial port onto your Kuro Box? Lewin has posted all of the details to his site.
- This is a good tutorial on frame code, along with links to useful interactive demo pages.
- Frames are not right for every application, but good use of frames can help a lot.
- Find more information on HTTP authentication in RFC2617.
- Read this page on CGI script security before working on CGI scripts of the type implemented here (particularly look at Q6.2 and pay attention to the comments in my source code!).
- For information on what NAT is and how it works, read this page explaining why IP-address-level user identification isn't unique enough to be useful on the real Internet.
- The Cranky User doesn't like JavaScript dependencies much either (developerWorks, March 2001).
- The Power Architecture Community Newsletter includes full-length articles as well as recent news about members of the Power Architecture community and upcoming events of interest. Learn more about the Power Architecture Community Newsletter and how to contribute to it. Subscription is free.
Lewin A.R.W. Edwards works for a Fortune 50 company as a wireless security/fire safety device design engineer. Prior to that, he spent five years developing x86, ARM and PA-RISC-based networked multimedia appliances at Digi-Frame Inc. He has extensive experience in encryption and security software and is the author of two books on embedded systems development. He can be reached at sysadm@zws.com.





