Migrating from x86 to PowerPC, Part 4: Build a minimal embedded Web interface

Robots and networked appliances on a shoestring


Level: Introductory

Lewin Edwards (sysadm@zws.com), Author, Freelance

05 Apr 2005

This installment shows you how to use small-footprint, highly portable, Free Software tools to Web-enable your unmanned submarine, in anticipation of browsing its onboard photo library from an underground lair in the next episode.

In this episode of "Migrating from x86 to PowerPC," you develop a very simple embedded Web interface, which you'll build on in the next couple of articles once you start communicating real-world data from the Kuro Box. If you've been following along with this series, by this stage you already have a Kuro Box with a completely functioning GCC build environment. However, if you're just browsing these articles rather than carrying out all the steps, please note that you don't actually need any special hardware components to test out most of the code discussed in this article. All you really need is a functional C compiler and linker, and some kind of machine running a CGI-compliant Web server.

Why Web?

Before diving into the details, you might ask: Why would I want to implement a Web interface on my appliance? Here are a few possible answers to that question:

  1. You can present an attractive graphical configuration system for your appliance without having to write a single line of GUI code. This means a faster time to market and less maintenance work, since you don't have a proprietary setup app that needs to be fixed every time Windows® or MacOS changes.
  2. The device can be managed remotely as easily as it can be configured locally.
  3. Modern consumers of high-tech products are comfortable with the Web browser interface and do not require special training; your documentation needs, and more importantly, your tech support load, are dramatically reduced.
  4. If your device has a GUI, or is at least capable of supporting a Web browser (practically any reasonably-sized graphical display with touchscreen or a few buttons meets these criteria), you can simply put a Web browser such as elinks (see Resources) on the appliance, and present a functionally identical user interface locally on the device as well as remotely. This technique is terribly under-utilized; many devices have redundant, totally proprietary user interfaces on their local console.

The three distinct components to an HTTP-based embedded user interface are:

  1. The Web server itself. In the Internet world, and particularly on Linux™ machines, one of the most popular servers is Apache (see Resources). The Kuro Box is certainly powerful enough to run Apache, but it's a rather heavy solution for embedded applications. There are other, leaner Web servers, such as thttpd and GoAhead, targeted specifically at embedded applications.
  2. A set of Web pages containing forms for remote users to submit data, and templates for displaying data generated by the appliance being administered.
  3. A back-end program (or, more often, a set of programs) to receive submitted form data from the Web server, process it, and use the aforementioned templates to format the output returned to the remote user.

Note that it's possible to build all of the functionality into the Web server itself, thereby creating what amounts to a monolithic, specialized, proprietary network server application that just happens to look like a standard Web server to the outside world. This technique is sometimes used on very deeply embedded devices (particularly when the device has no internal file system and the developers don't want the hassle of creating a pseudo file system). The monolithic approach isn't very flexible, however, and it's not usually employed on powerful platforms like Kuro Box.





A look at the status quo

Buffalo Technology provides part one of the puzzle in the form of a working, preloaded version of thttpd. It also provides a set of Web pages, forms, templates, and scripts covering parts two and three. You'll find all these materials in the /www directory on the hard disk. The default set of page data is all in Japanese; you can upgrade to the English versions available from Revolution, but you won't be using them at all in these articles so don't bother to do the upgrade unless you're curious. To seasoned, devil-may-care hackers, the Kuro Box Web interface is a grossly sugar-coated route into the stuff you'd rather access through telnet.

Restarting thttpd

While it's perfectly reasonable, in development, to simply reboot your system to apply configuration changes, it's really not the best way most of the time. This isn't Windows.

Most UNIX® daemons can be persuaded to restart and reload their configuration files on a SIGHUP, but that was one of the features thttpd jettisoned in favor of being tiny. As a result, restarting the Web server requires you to kill the existing running copies and then start the server again by hand.

In many cases, a daemon will store its process ID (pid) in a file so you can kill it easily, but would-be evil overlords shouldn't shy away from overkill; killall thttpd will kill every last thttpd process. You may cackle, or mutter "they thought I was bluffing," before entering /usr/sbin/thttpd -C /etc/thttpd.conf.

Implementation note: The /www directory lives by default on hda1, which is a relatively small partition -- 2GB on a 100GB disk -- that also contains all of the system files. If you get into complicated Web serving experiments, you may need to move the Web directory onto hda3 (the main user partition, spanning most of the disk) in order to have more space for your files. hda3 is mounted at /mnt. Simply telnet into the Kuro Box, cp -R /www /mnt, and then edit /etc/thttpd.conf -- change the line that reads dir=/www to read dir=/mnt/www instead. You'll need to restart thttpd for this change to take effect, either with the killall-and-relaunch sequence described above or, if you prefer a clean slate, by rebooting the Kuro Box with the shutdown -r now command. The remainder of this article assumes your box is still in its default configuration.

There are several protocols you can use to communicate between a Web server and an external program. The method used in these articles is the Common Gateway Interface (CGI). This is the oldest and most widely-supported (read: best-documented and most portable) interface method. You can read more about the details of CGI in the Resources section, but in a nutshell, it specifies that options embedded in the URL are passed to the program as command line arguments, certain global information is passed to the program in environment variables, data sent by the remote browser is passed on the standard input channel, and the Web server expects to see the external program deliver its output on the standard output channel.

Now, CGI only specifies the methods for passing data between a Web server and external programs. It does not specify anything about how those programs should be structured or what language should be used to write them. A large percentage of Web servers on the real Internet, and even in some embedded appliances like Kuro Box, use a mixture of Perl scripts and HTML template snippets to perform this functionality. While Perl is not quite an arcane black magic skill, it's definitely not in the skill set of most embedded developers. It's also not the most efficient technology from the standpoint of disk (or flash) footprint, RAM requirements, and runtime speed. Therefore, in this article, you'll be developing your CGI backends in C; this makes the code more generally applicable to embedded products and more immediately comprehensible to people from an embedded background.

The first program you'll build -- it's a technology demonstration of sorts -- is a "hello world" application that verifies the end-to-end connection between your Web browser, the Web server on the Kuro Box, and your custom back-end program. This is a trivial piece of C code:


Listing 1. A trivial CGI program

#include <stdio.h>

int
main(int argc, char *argv[])
{
	printf("Content-type: text/html\n\n");
	printf("<HTML><BODY>Hello, HTTP world!</BODY></HTML>\n");

	return 0;
}

You'll find this program, with its makefile, in the hellocgi directory when you unpack the sourcecode tarball from the Resources. Once you've built the hello.cgi executable, you need to put it inside the /www/cgi-bin directory so that it will be accessible. Once you do this, open a Web browser on another machine and go to the URL http://a.b.c.d/cgi-bin/hello.cgi (where a.b.c.d is the IP address of your Kuro Box). Your browser will display the "Hello, HTTP world!" text. Congratulations -- you just built your first CGI application!

Take careful note of the location where I told you to copy the executable. For security reasons, Web servers usually have a lot of restrictions around CGI executables. If you look at /etc/thttpd.conf you'll see there is a line reading cgipat=/cgi-bin*/*. This restricts the server to run only CGI executables located in a subdirectory, the name of which begins with "cgi-bin" -- for example, /cgi-binary, /cgi-bin, /cgi-bin2, and so on. Observe that the leading slash references the root directory of the Web server as defined in thttpd.conf, not the actual root directory of the operating system environment as a whole.

Now you'll build a more useful example that actually processes an input form and shows you how to extract data from the results. You can find the code for this example in the cgidump directory after unpacking the example code tarball. I won't reproduce the code here for space reasons, but I'll go through the steps it takes (omitting some error checking and output prettification):

  1. Spit out a header so the browser knows what you're sending it. This just sends the Content-Type: text/html line, followed by a blank line to end the headers, and then the <HTML><BODY> opening stanza.
  2. Go through the command-line arguments argv[0] through argv[argc-1] and print them to standard output.
  3. Go through all the environment variables in __environ[] (glibc's name for the array more portably declared as environ) and print them to standard output. Some of these environment variables were set by thttpd (refer to the CGI spec), and others (such as PATH) were inherited directly from the operating system.
  4. Get the (string) contents of the CONTENT_LENGTH environment variable and convert it to an integer. This variable will be defined if the program was invoked by a form submitted with the POST method. It tells you how many bytes of data to expect on stdin. (Note that Web servers are not required to send an EOF on standard input. The only safe way to handle the standard input stream to a CGI program is to check CONTENT_LENGTH and read exactly that many bytes.)
  5. Read CONTENT_LENGTH bytes from standard input and echo them out to standard output.
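Steps 2 through 5 above might be sketched roughly as follows. This is not the cgidump source from the tarball; the helper names print_list and read_body, the HTML line break in the output, and the 64KB size cap are all assumptions made for illustration.

```c
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper for steps 2 and 3: print a NULL-terminated array
 * of strings (argv or the environment), one entry per line. */
void
print_list(FILE *out, const char *label, char *const *items)
{
	int i;

	for (i = 0; items != NULL && items[i] != NULL; i++)
		fprintf(out, "%s %d: '%s'<BR>\n", label, i, items[i]);
}

/* Hypothetical helper for steps 4 and 5: convert the CONTENT_LENGTH
 * string to an integer and read at most that many bytes from 'in'.
 * Returns a malloc()ed, NUL-terminated buffer (the caller frees it),
 * or NULL on error. *len receives the number of bytes actually read. */
char *
read_body(FILE *in, const char *content_length, size_t *len)
{
	char *end, *buf;
	long want;

	*len = 0;
	if (content_length == NULL)
		return NULL;		/* no request body was sent */

	errno = 0;
	want = strtol(content_length, &end, 10);
	if (errno != 0 || end == content_length || *end != '\0' ||
	    want < 0 || want > 65536)	/* arbitrary cap for this sketch */
		return NULL;

	buf = malloc((size_t)want + 1);
	if (buf == NULL)
		return NULL;

	/* Never wait for EOF: ask for 'want' bytes and no more. */
	*len = fread(buf, 1, (size_t)want, in);
	buf[*len] = '\0';
	return buf;
}
```

A main() built on these helpers would print the header from step 1, call print_list(stdout, "Argument", argv) and print_list(stdout, "Environment", environ) (after declaring extern char **environ), then call read_body(stdin, getenv("CONTENT_LENGTH"), &len) and write the result with fwrite(). A fielded version should also HTML-escape anything it echoes back to the browser.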

This CGI program is designed to be "fed" with data by the POST method. A simple example form, example.html, is included with the source code for the program. Build the cgidump.cgi executable, copy it to /www/cgi-bin, then copy example.html to /www. Now invoke the URL http://a.b.c.d/example.html (a.b.c.d being your Kuro's IP address), and you'll see a text-entry area and a Submit button. If you erase what's in the text-entry area and type <I typed 100% of this text>, then click Submit, you'll get the following output on your browser:


Listing 2. Output from CGI script

   Command-line arguments:
   Argument 0: 'cgidump.cgi'
     _________________________________________________________________

   Environment variables:
   Environment 0: 'PATH=/usr/local/bin:/usr/ucb:/bin:/usr/bin'
   Environment 1: 'SERVER_SOFTWARE=thttpd/2.23beta1 26may2002'
   Environment 2: 'SERVER_NAME=KURO-BOX'
   Environment 3: 'GATEWAY_INTERFACE=CGI/1.1'
   Environment 4: 'SERVER_PROTOCOL=HTTP/1.1'
   Environment 5: 'SERVER_PORT=80'
   Environment 6: 'REQUEST_METHOD=POST'
   Environment 7: 'SCRIPT_NAME=/cgi-bin/cgidump.cgi'
   Environment 8: 'REMOTE_ADDR=192.168.0.8'
   Environment 9: 'HTTP_REFERER=http://192.168.0.7/example.html'
   Environment 10: 'HTTP_USER_AGENT=Mozilla/5.0 (X11; U; Linux i686;
   rv:1.7.3) Gecko/20041020 Firefox/0.10.1'
   Environment 11:
   'HTTP_ACCEPT=text/xml,application/xml,application/xhtml+xml,text/html;
   q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5'
   Environment 12: 'HTTP_ACCEPT_ENCODING=gzip,deflate'
   Environment 13: 'HTTP_ACCEPT_LANGUAGE=en-us,en;q=0.5'
   Environment 14: 'CONTENT_TYPE=application/x-www-form-urlencoded'
   Environment 15: 'HTTP_HOST=192.168.0.7'
   Environment 16: 'CONTENT_LENGTH=61'
   Environment 17: 'REMOTE_USER=root'
   Environment 18: 'AUTH_TYPE=Basic'
   Environment 19: 'TZ=JST-9'
   Environment 20: 'CGI_PATTERN=cgi-bin*/*'
     _________________________________________________________________

   Standard input (stdin):
   61 bytes of content.
   TextBox=%3CI+typed+100%25+of+this+text%3E&SubmitButton=Submit
     _________________________________________________________________

Observe how the text you typed into the text-entry area was processed into a "URLized" format (as denoted by the environment variable CONTENT_TYPE=application/x-www-form-urlencoded, by the way). The rules for decoding this format are extremely simple:

  1. Replace all + signs with spaces. For example, "Fred+sits+down" becomes "Fred sits down".
  2. If you encounter a % sign, this signals a character escape sequence. The next two characters in the input stream are the hexadecimal ASCII code of the character that should be inserted in the post-processed output stream. For example, "%3CHTML%3E" becomes "<HTML>".
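The two rules above fit in a few lines of C. The following is a minimal sketch rather than code from the article's tarball; the function names hex_value and url_decode are hypothetical, and the choice to pass a malformed escape (a % not followed by two hex digits) through unchanged is an assumption.

```c
#include <stddef.h>

/* Value of one hexadecimal digit, or -1 if c is not a hex digit. */
static int
hex_value(char c)
{
	if (c >= '0' && c <= '9')
		return c - '0';
	if (c >= 'a' && c <= 'f')
		return c - 'a' + 10;
	if (c >= 'A' && c <= 'F')
		return c - 'A' + 10;
	return -1;
}

/* Decode a URL-encoded string in place and return the decoded length.
 * Decoding in place is safe because the output is never longer than
 * the input. */
size_t
url_decode(char *s)
{
	char *src = s, *dst = s;

	while (*src != '\0') {
		if (*src == '+') {
			/* Rule 1: a plus sign stands for a space. */
			*dst++ = ' ';
			src++;
		} else if (*src == '%' && hex_value(src[1]) >= 0 &&
		    hex_value(src[2]) >= 0) {
			/* Rule 2: %XX is a character code in hex. */
			*dst++ = (char)(hex_value(src[1]) * 16 +
			    hex_value(src[2]));
			src += 3;
		} else {
			/* Ordinary character, or a malformed escape. */
			*dst++ = *src++;
		}
	}
	*dst = '\0';
	return (size_t)(dst - s);
}
```

When handling a whole form submission, split the input on & and = first and decode each name and value separately; decoding first would make an escaped & or = in the user's text indistinguishable from a real field separator.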

You can also feed the program with query inputs in the standard syntax of url?query1+query2+.... Query items are given to the CGI program on its command line. For example, if you invoke the URL http://a.b.c.d/cgi-bin/cgidump.cgi?option1+option2 (again, a.b.c.d is your Kuro's IP address), you'll see that the output shows two new command-line arguments: argument 1 is 'option1' and argument 2 is 'option2'. It is very important to remember that these options must be encoded in the same URLized format as described above. For instance, "option1=yes+option2=no" won't work, because an unencoded = sign marks the query as form-style data, which the CGI convention does not pass on the command line; you need to use "option1%3Dyes+option2%3Dno".
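If you generate such URLs from your own code, the encoding step can be automated. The sketch below is an illustration only: url_encode is a hypothetical helper, and the decision to escape everything except letters, digits, and the characters - _ . is a deliberately conservative assumption. It encodes a single argument; you join the encoded arguments with + yourself.

```c
#include <stddef.h>

/* Hypothetical helper: URL-encode one query argument from src into dst,
 * which has room for dstlen bytes including the terminating NUL.
 * Returns 0 on success, or -1 if dst is too small. */
int
url_encode(const char *src, char *dst, size_t dstlen)
{
	static const char hex[] = "0123456789ABCDEF";
	size_t used = 0;

	for (; *src != '\0'; src++) {
		unsigned char c = (unsigned char)*src;
		int plain = (c >= 'A' && c <= 'Z') ||
		    (c >= 'a' && c <= 'z') ||
		    (c >= '0' && c <= '9') ||
		    c == '-' || c == '_' || c == '.';

		/* Always leave one byte free for the NUL. */
		if (used + (plain ? 1 : 3) >= dstlen)
			return -1;

		if (plain) {
			dst[used++] = (char)c;
		} else {
			dst[used++] = '%';
			dst[used++] = hex[c >> 4];
			dst[used++] = hex[c & 0x0F];
		}
	}
	if (used >= dstlen)
		return -1;
	dst[used] = '\0';
	return 0;
}
```

Spaces inside an argument come out as %20 rather than +, since a literal + in this query syntax separates one argument from the next.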

The above text isn't the whole story about Web interfaces, by a long shot. With the advent of modern browser technologies such as Java™, JavaScript, and others, you now have the possibility of pushing some of your administration application's workload into the browser. Depending on how you look at it, this is either a fantastic way to offload work onto the user's PC, or an astoundingly ill-advised way to introduce subtle and potentially unfixable bugs into your system.

As a general rule, I try to eschew active content in HTML code that is designed to be served up by an embedded appliance, unless at least one of the following conditions is true:

  • Said appliance is severely resource-constrained and you must run some of the processing on the client machine for memory or performance reasons (this applies most often on 8-bit platforms, which frequently use unequivocally brilliant software hacks to serve up data much bigger than their available RAM)
  • The end-user needs to see some animated real-time information that would require undue bandwidth between embedded appliance and browser (for example, a navigable 3D rendering of the thermal map of a machine part), or
  • You have special security needs. For example, some applications might have a Java applet running on the end-user's machine to encrypt and decrypt data that is being sent to and from the embedded appliance. This approach is typically used when secure sockets (SSL) are considered insufficiently secure, where the Web server doesn't support them, or where you need to retrofit guaranteed high security to a browser that is untrustable or crypto-crippled due to government restrictions.

These exceptions are relatively rare, and the first one will practically never apply to a PowerPC® appliance. In fact, at the risk of making an unwelcome religious pronouncement, I'd advise you to avoid relying on any sort of active code in embedded Web content. Feel free to use, say, JavaScript to validate input fields before you submit them, but make it possible for people to use your user interface even if they have all the active content features disabled in their browser. The gold standard I employ is that a Web-based user interface is only fieldable if it can be operated from a vanilla version of the text-based lynx browser (see Resources).

In much the same vein, you should also avoid functional reliance on icons and graphics. If you must use graphics as action buttons or links, make sure that you set the ALT="" text appropriately so that users without graphics capability can still work out what each user interface element does.

This goes without saying, but using an OS- or browser-specific, or proprietary technology such as VBScript, ActiveX, or Macromedia Flash is utterly taboo. Don't even think about doing it. These sorts of technologies automatically exclude various classes of users, and are undesirable for that reason alone. However, what's much worse is that you're introducing a functional dependency on a product you don't control. Bugs or behavioral changes can crop up in those proprietary runtimes. It's frighteningly easy to get into situations where you have to field dozens of tech support calls and maybe even revise your product's firmware merely because of changes in proprietary browser plugins. You might then have to jump through bizarre flaming hoops to make sure users are using the right browser or plugin version for the particular firmware they have loaded.

Part of the reason for using a Web interface is to make talking to your embedded appliance OS-agnostic. Designing in a reliance on proprietary third-party software runs directly counter to that goal.

Next month's article puts all the above magic to work in building a sort of Web-based photo-album application. The relevance of this application to unmanned submarines will become obvious in the subsequent installment, so stay tuned!



Resources

  • Participate in the discussion forum.

  • You can download the sourcecode archive that accompanies this article. (Note to Internet Explorer users: In a burst of proprietary madness, IE will attempt to mangle the filename to ibm-article4.tar.tar. You should manually ensure that the downloaded copy has a .tar.gz extension).

  • The sample sourcecode presented in this article series is licensed under the GNU GPL; please feel free to use it in your own projects, subject to the terms shown.

  • Familiar with embedded development, but not with PowerPC? Familiar with PPC, but not embedded? Or familiar with both of these (or neither), but not with Linux? Migrating from x86 to PowerPC offers clear, valuable, hands-on advice to the beginner and professional programmer alike -- and is the only series on the Internet that will help you build your own remote-controlled robot submarine. Check out the other articles in this series.

  • Need to hack a serial port onto your Kuro box? Lewin has posted all of the details to his site.

  • For most embedded projects, Apache is too large. But thttpd (tiny/turbo/throttling HTTP server) and the GoAhead Web Server both offer small footprints and other attractive features.

  • The text-only lynx browser is compatible with practically any operating system that can compile the binary. Here is the lynx sourcecode distribution page.

  • Lynx inspired the minimalist ELinks Web browser. Sleek and functional, this is my favorite browser for embedded applications.

  • Visit the "official" specification jump-off point for the Common Gateway Interface.

  • CGI scripts can make it easy to shoot yourself in the foot: learn how to put the safety catch on your CGI scripts with the World Wide Web Security FAQ Section 6: CGI (Server) Scripts. The page deals with CGI script security, and includes tips on unsafe practices you should avoid as well as a wealth of additional related resources.

  • HTML FORM Syntax is a concise introduction to the syntax of form tags in HTML code. If you're coding the HTML for your application manually, you should read this page and probably print it out for easy reference.

  • Another good reference is the JimPrice.com ASCII Chart, suitable for framing and hanging on your wall while you're debugging Web forms. Well, or at least hanging it without a frame, for reference.

  • Have experience you'd be willing to share with Power Architecture zone readers? Article submissions on all aspects of Power Architecture technology from authors inside and outside IBM are welcomed. Check out the Power Architecture author FAQ to learn more.

  • Have a question or comment on this story, or on Power Architecture technology in general? Post it in the Power Architecture technical forum or send in a letter to the editors.

  • The Power Architecture Community Newsletter includes full-length articles as well as recent news about members of the Power Architecture community and upcoming events of interest. Learn more about the Power Architecture Community Newsletter and how to contribute to it. Subscription is free.

  • All things Power are chronicled in the developerWorks Power Architecture editors' blog, which is just one of many developerWorks blogs.

  • Find more articles and resources on Power Architecture technology and all things related in the developerWorks Power Architecture technology content area.

  • Download an IBM PowerPC 405 Evaluation Kit to demo a SoC in a simulated environment, or just to explore the fully licensed version of Power Architecture technology. This and other fine Power Architecture-related downloads are listed in the developerWorks Power Architecture technology content area's downloads section.


About the author

Lewin A.R.W. Edwards works for a Fortune 50 company as a wireless security/fire safety device design engineer. Prior to that, he spent five years developing x86, ARM and PA-RISC-based networked multimedia appliances at Digi-Frame Inc. He has extensive experience in encryption and security software and is the author of two books on embedded systems development. He can be reached at sysadm@zws.com.



