Middleware: Loose coupling for application components
In my previous article (see Resources), I introduced XML-RPC as an easy way to execute functions on remote machines. However, remote function calls alone aren't useful enough to make the protocol worth learning. Instead, we will look at one interesting application of Web services in conjunction with XML-RPC: middleware.
At its most basic level, middleware is nothing but a way of abstracting access to a resource through the use of an Application Programming Interface (API). Both functional and object-oriented programming use APIs to simplify the details of manipulating a software resource, like a user authentication database or a filesystem, for the convenience of the programmer. Clearly, users calling the API benefit from this abstraction because they are shielded from the details of the underlying implementation. What may not be as obvious is that the API implementers also benefit from this abstraction because they can radically alter the underlying library code without breaking programs that used the old implementation.
Software libraries that enforce this independence between implementing and calling code are considered loosely-coupled. If calling code ignored the API and manipulated data structures in the library code, this independence would be lost since a change in the library's implementation might break the misbehaving client. Such a system is said to be tightly-coupled. Good software engineering practice calls for systems to be loosely-coupled wherever possible.
The idea of APIs can be extended to larger concepts beyond programming libraries. For example, an application that allows users to graphically modify the contents of a database must itself be able to manipulate the raw data. Databases like MySQL and PostgreSQL provide a C interface for doing just that. If this hypothetical application directly uses the native interface of a particular relational database management (RDBM) system, switching to a new RDBM system involves reworking all the code that talked to the old system. By introducing a middleware layer that brokers all communication between the display logic and the routines needed to handle the specific underlying database (as shown in Figure 1), programmers can make radical changes on either side of the API fence without necessitating reciprocal changes on the other side.
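The API fence described above can be sketched in a few lines. This is only an illustration in Python, not code from the article's listings; the names AccountStore, DictStore, TupleListStore, and render_greeting are hypothetical. The display function talks to the API alone, so either backend can be swapped in without touching it.

```python
from typing import Dict, List, Optional, Protocol, Tuple


class AccountStore(Protocol):
    """The middleware API: the only thing display code is allowed to call."""

    def get_fullname(self, username: str) -> Optional[str]:
        ...


class DictStore:
    """Hypothetical backend 1: accounts held in a plain dict."""

    def __init__(self, rows: Dict[str, str]) -> None:
        self._rows = rows

    def get_fullname(self, username: str) -> Optional[str]:
        return self._rows.get(username)


class TupleListStore:
    """Hypothetical backend 2: same API, different internal layout."""

    def __init__(self, rows: List[Tuple[str, str]]) -> None:
        self._rows = rows

    def get_fullname(self, username: str) -> Optional[str]:
        for name, fullname in self._rows:
            if name == username:
                return fullname
        return None


def render_greeting(store: AccountStore, username: str) -> str:
    """Display logic: depends on the API only, never on the storage layout."""
    fullname = store.get_fullname(username)
    return f"Hello, {fullname}!" if fullname else "Unknown user"
```

Calling render_greeting() with a DictStore or a TupleListStore holding the same account yields the same greeting, which is the loose coupling the middleware layer is meant to provide.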
Figure 1: Typical Middleware setup

Centralizing Web accounts: A real example
Given this brief theoretical background, let's solve a real-world problem with XML-RPC middleware.
Imagine a Web site with user accounts that requires logins. Certainly, Apache's built-in HTTP authentication could be used to store the user names and passwords. Frequently, though, other account-specific data that has to be available once the user has been authenticated must also be stored. This argues for storing account information in an RDBM system like MySQL (see Resources). If the front end is written in Perl or PHP, it is tempting to hard-code access to MySQL in the CGI scripts. However, doing this tightly couples the display logic with a particular RDBM system. This is where middleware comes in. Middleware creates a flexible system where the underlying RDBM may be seamlessly changed, and it also allows a single database server to be replaced with a cluster of machines. Thus, the loose coupling created by using middleware is the key to creating a scalable Web application architecture.
In order to build a middleware bridge between the front-end CGI display code and the back-end database and business logic code, an API that allows all necessary data access must be defined. Table 1 shows the XML-RPC middleware API for this application's user accounts data store. The front-end code can only get to the underlying data by making these XML-RPC calls.
Table 1: XML-RPC API for account info
| Host: | http://marian.daisypark.net/RPC2 | Port: | 1080 |
| Procedure Definitions | | | |
| Procedure Name | Input | Output | Description |
| authenticate | <string>, <string> | <int> | Given a username and password, will return a new session ID if credentials match |
| get_account_info | <int> | <struct> | Given a valid session ID, will return the account information for the user associated with the ID; the structure will have the following fields: username, fullname, points |
| set_account_info | <struct> | <int> | Given a structure containing the following fields: username, fullname, password, points; will create a new account if the username is new, or update an existing account if not; returns 1 for success, 0 for failure |
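To make the shape of this API concrete, here is a minimal sketch of a listener that publishes the three procedures from Table 1. It is written in Python with the standard xmlrpc.server module rather than the Perl used for the article's listener, and it makes several assumptions that are not in the article: accounts live in in-memory dicts (USERS and SESSIONS are hypothetical), passwords are compared in plain text for brevity, authenticate() returns 0 on failure, and get_account_info() returns an empty struct for an unknown session ID.

```python
from xmlrpc.server import SimpleXMLRPCServer

# Hypothetical in-memory stand-ins for the users and sessions tables.
USERS = {"jjohn": {"fullname": "Joe Johnston", "password": "secret", "points": 10}}
SESSIONS = {}


def authenticate(username, password):
    """Return a new session ID if the credentials match, else 0 (assumed)."""
    user = USERS.get(username)
    if user is None or user["password"] != password:
        return 0
    session_id = len(SESSIONS) + 1
    SESSIONS[session_id] = username
    return session_id


def get_account_info(session_id):
    """Return username, fullname, and points for the session's user."""
    username = SESSIONS.get(session_id)
    if username is None:
        return {}  # assumption: the article does not say what is returned here
    user = USERS[username]
    return {"username": username, "fullname": user["fullname"], "points": user["points"]}


def set_account_info(info):
    """Create or update an account; return 1 for success, 0 for failure."""
    username = info.get("username")
    if not username:
        return 0
    if username in USERS:
        # Existing account: only fullname and points are updated.
        USERS[username]["fullname"] = info.get("fullname", USERS[username]["fullname"])
        USERS[username]["points"] = info.get("points", USERS[username]["points"])
        return 1
    if "password" not in info:
        return 0
    USERS[username] = {
        "fullname": info.get("fullname", ""),
        "password": info["password"],
        "points": info.get("points", 0),
    }
    return 1


def build_server(host="127.0.0.1", port=1080):
    """Create a listener that answers on /RPC2 and publishes the three procedures."""
    server = SimpleXMLRPCServer((host, port), logRequests=False)
    for func in (authenticate, get_account_info, set_account_info):
        server.register_function(func)  # published under the function's own name
    return server
```

Calling build_server().serve_forever() would start answering requests. The front end never sees the dicts, only the three procedure names, so the storage could later be replaced by a real database without changing any client.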
When users first try to log in on this fictional site, they are challenged with an HTML form asking for their user name and password (see Figure 2). When the users hit the submit button, the CGI script processing the form makes an authenticate() XML-RPC call to the Perl listener. If the users' credentials are valid, the front end receives a session ID that can be used to track the users through the site or retrieve other account information. In this example, after the users log in, they are taken to a page that allows them to update their account information.
Figure 2: HTML form for ID and password

Few languages are better for developing front-end Web code than PHP. Listing 1 shows the PHP code that produces the login screen shown in Figure 3.
Figure 3: Logging in with PHP

Those new to PHP should have a look at the PHP homepage (see Resources). Like other server-side include technologies, such as Active Server Pages and ColdFusion, PHP blurs the line between static pages and dynamic content. The login page code (see Listing 1) begins by pulling in the XML-RPC PHP library. If the user is submitting a user name and password, a new xmlrpc_client object is initialized with the URL and TCP port number of the listener. In the case of PHP, the arguments to this object initializer are path, host, and port.
Also like the Perl library, PHP allows the programmer to look at the underlying XML-RPC conversation for debugging. Lines 6 and 7 create the objects for the arguments to the RPC. Each argument needs to be wrapped in a special object that allows the PHP library to correctly encode the value in XML.
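For readers curious about what that underlying conversation looks like, the following sketch encodes an authenticate() request by hand. It uses Python's standard xmlrpc.client module purely for illustration, not the PHP library discussed here, and the credentials and the helper name encode_authenticate_call are made up.

```python
from xmlrpc.client import dumps, loads


def encode_authenticate_call(username, password):
    """Build the XML body of an authenticate() request."""
    # dumps() wraps each argument in a typed <value> element,
    # which is the job the wrapper objects do in the PHP library.
    return dumps((username, password), methodname="authenticate")


request_xml = encode_authenticate_call("jjohn", "secret")
print(request_xml)

# Decoding the request recovers the arguments and the procedure name.
params, method = loads(request_xml)
```

The printed document is a methodCall element holding the procedure name and one string-typed value per argument, which is the kind of payload a debugging dump of the conversation would show.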
Line 8 creates a new xmlrpcmsg object that encodes the contents of the RPC call. This object should be given the name of the remote procedure followed by the list of arguments, all wrapped up into one PHP array object. Finally, on line 9, the connection is made with the XML-RPC listener and a response object is returned. You can retrieve the value that the listener returned by first getting the xmlrpcval object stored in the response object and then asking for the xmlrpcval object's scalar value. Why a scalar value? Recall the API. It states that authenticate() returns some kind of integer. If this integer is non-zero, it is a session ID and the user is redirected to the account maintenance screen with the session ID appended to the URL. If authenticate() fails, a failure message is issued and users are given another chance to log in.
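The same client-side flow can be sketched outside PHP. The example below is an illustrative Python version, not a translation of Listing 1: it starts a throwaway local listener that accepts one made-up account and hands out the fixed session ID 42, then calls authenticate() through xmlrpc.client.ServerProxy. The redirect target account.php and the helper names are assumptions for the sake of the example.

```python
import threading
from xmlrpc.client import ServerProxy
from xmlrpc.server import SimpleXMLRPCServer


def start_demo_listener():
    """Start a throwaway local listener (hypothetical stand-in for the real one)."""
    server = SimpleXMLRPCServer(("127.0.0.1", 0), logRequests=False)

    def authenticate(username, password):
        # Accept exactly one hard-coded account; return 0 on failure.
        return 42 if (username, password) == ("jjohn", "secret") else 0

    server.register_function(authenticate)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def login(url, username, password):
    """Call authenticate() and decide where to send the user next."""
    proxy = ServerProxy(url)  # verbose=True would dump the XML-RPC conversation
    session_id = proxy.authenticate(username, password)
    if session_id != 0:
        # Non-zero means a session ID: redirect with the ID on the URL.
        return f"account.php?session={session_id}"
    return None  # caller shows a failure message and the login form again


def demo():
    """Run one successful and one failed login against the demo listener."""
    server = start_demo_listener()
    host, port = server.server_address
    url = f"http://{host}:{port}/RPC2"
    try:
        return login(url, "jjohn", "secret"), login(url, "jjohn", "wrong")
    finally:
        server.shutdown()
        server.server_close()
```

Note that the proxy hides the request and response objects that the PHP library exposes; the integer comes back as an ordinary return value, but the decision made on it is the same.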
The difficulty in Listing 2, the Perl XML-RPC listener, comes not from XML-RPC but from SQL. The reader should already have some familiarity with Perl's DBI module. If not, do read the MySQL homepage (see Resources) for an introduction to using this module.
After lines 4-7 pull in all the libraries, a global DBI handle, $Dbh, is created that all of the subroutines use to access the database. The END() subroutine executes when the script terminates and keeps DBI from complaining that the database handle was never explicitly disconnected. In lines 15-22, the published API procedure names are associated with the Perl functions implementing them.
Line 36 shows how the authentication is done. The function expects two
scalars; they are checked to make sure they are at least the size and
shape of what's expected. The SQL table users is queried to see if the
credentials match. Listing 3 shows the SQL code that
created this table.
Listing 3: SQL that created the users table
create table users (
username char(12) not null default '',
password char(13) not null default '',
fullname char(50) not null default '',
points int default 0,
Primary Key (username),
index (fullname)
);
An SQL statement is prepared that looks for a row with a matching username. Since the username is a primary key, there will be at most one matching row. The one column returned is called "password" and it holds a DES-encrypted string. Perl's crypt() function is used to DES-encrypt the $password that was passed in, and this scalar is compared to what was found in the table. If these strings match, the event is logged to another MySQL table via the logger() function and a new session ID is generated. The new_session_id() function on line 210 simply inserts the username into the sessions table. Listing 4 shows the SQL code that created the sessions table.
Listing 4: SQL that created sessions table
create table sessions (
id int auto_increment not null primary key,
username char(13),
issued timestamp
);
The session ID is simply an auto_increment field. Every time an insert is performed, a new, unique ID and a timestamp of when the insert occurred are created. What's great about MySQL is that the DBI handle's mysql_insertid attribute makes it easy to identify the last insert ID used. This integer is then returned to the XML-RPC client.
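The authenticate-then-issue-a-session sequence can be sketched as follows. This is a Python illustration against an in-memory SQLite database, not the article's Perl and MySQL code, and it swaps in different techniques that should be named plainly: a salted PBKDF2 hash from hashlib stands in for DES crypt(), SQLite's AUTOINCREMENT and cursor.lastrowid stand in for auto_increment and mysql_insertid, and the logger() step is left out. The helper names and the sample account are hypothetical.

```python
import hashlib
import hmac
import sqlite3


def hash_password(password, salt):
    """Stand-in for Perl's crypt(): a salted one-way hash (PBKDF2, not DES)."""
    digest = hashlib.pbkdf2_hmac("sha256", password.encode(), salt.encode(), 100_000)
    return salt + "$" + digest.hex()


def authenticate(db, username, password):
    """Return a new session ID if the credentials match, else 0."""
    row = db.execute(
        "SELECT password FROM users WHERE username = ?", (username,)
    ).fetchone()
    if row is None:
        return 0
    stored = row[0]
    salt = stored.split("$", 1)[0]
    # Hash the supplied password with the stored salt and compare the results.
    candidate = hash_password(password, salt)
    if not hmac.compare_digest(candidate.encode(), stored.encode()):
        return 0
    # Inserting the username creates the session; the new row's ID is the session ID.
    cur = db.execute("INSERT INTO sessions (username) VALUES (?)", (username,))
    db.commit()
    return cur.lastrowid


def make_demo_db():
    """Build an in-memory database shaped like Listings 3 and 4 (SQLite dialect)."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE users (username TEXT PRIMARY KEY, password TEXT NOT NULL,"
        " fullname TEXT NOT NULL DEFAULT '', points INTEGER DEFAULT 0)"
    )
    db.execute(
        "CREATE TABLE sessions (id INTEGER PRIMARY KEY AUTOINCREMENT,"
        " username TEXT, issued TEXT DEFAULT CURRENT_TIMESTAMP)"
    )
    db.execute(
        "INSERT INTO users VALUES (?, ?, ?, ?)",
        ("jjohn", hash_password("secret", "ab"), "Joe Johnston", 10),
    )
    db.commit()
    return db
```

As in the article's version, the stored password is never decrypted; the supplied password is hashed the same way and the two strings are compared.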
Now that the mystery of how the session ID is generated has been uncloaked, let's
look at a function that uses it. On line 66, get_account_info() begins. It expects
to be passed a valid session ID. Recall that the session ID belongs to table sessions, but
the desired information is stored in the users table. The information needed can
be obtained by using the SQL JOIN operator. Both sessions and users have a
field called "username." By finding the row in sessions that has the session
ID, the username of that session is also found. This username is then used to
find the row in users
that contains that username. If this succeeds, that row is retrieved as a hash
reference that can be returned to the caller. If this SQL is a bit too fancy,
simply use two small SELECT
calls and join the rows manually with a Perl hash.
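The join described above might look like the following sketch. It is written in Python against an in-memory SQLite database rather than Perl and MySQL, the sample rows are invented, and returning an empty dict for an unknown session ID is an assumption.

```python
import sqlite3


def get_account_info(db, session_id):
    """Look up the session's username, then that user's row, in one query."""
    row = db.execute(
        "SELECT u.username, u.fullname, u.points"
        " FROM sessions AS s JOIN users AS u ON u.username = s.username"
        " WHERE s.id = ?",
        (session_id,),
    ).fetchone()
    if row is None:
        return {}  # assumption: unknown session IDs yield an empty structure
    return {"username": row[0], "fullname": row[1], "points": row[2]}


def make_demo_db():
    """Build a tiny in-memory database with one user and one session."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE users (username TEXT PRIMARY KEY, password TEXT NOT NULL,"
        " fullname TEXT NOT NULL DEFAULT '', points INTEGER DEFAULT 0)"
    )
    db.execute(
        "CREATE TABLE sessions (id INTEGER PRIMARY KEY AUTOINCREMENT,"
        " username TEXT, issued TEXT DEFAULT CURRENT_TIMESTAMP)"
    )
    db.execute("INSERT INTO users VALUES ('jjohn', 'x', 'Joe Johnston', 10)")
    db.execute("INSERT INTO sessions (username) VALUES ('jjohn')")
    db.commit()
    return db
```

The password column is deliberately left out of the SELECT list, so the structure handed back to the front end carries only the three fields named in Table 1.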
The last function is set_account_info() and it expects a structure. This XML-RPC structure is represented in Perl as a hash reference. Because this function is going to INSERT or UPDATE, the users table is locked to prevent a race condition from occurring between these two SQL operations. Ideally, we could have used the latest version of MySQL since it supports transactions, but there are still many older installations of MySQL around, and it is useful to understand how transactions were historically handled in this RDBM.
DBI's do() method returns the number of rows affected by whatever SQL statements were executed. This return value can be used to figure out if an account update or creation is
needed. Remember that only existing users can change their full name or number
of points. The "username" field is a primary key and changing it should be done
in a more controlled manner. The password arguably could be changed here, but
it seems like it would be better to isolate this in another function. Once the
update is completed, the table is unlocked and this transaction is logged with logger(). If a new account needs to be created, the tables are unlocked and this information
is INSERTed after encrypting the password. Again, the success or failure of this operation is returned to the XML-RPC client.
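The update-or-create decision can be sketched like this. The example is a Python illustration using an in-memory SQLite database and a single transaction in place of the table lock described above, so it is a different mechanism from the article's listener. The hash_password helper (PBKDF2 rather than DES crypt()) and the sample data are hypothetical, and logging is omitted.

```python
import hashlib
import os
import sqlite3


def hash_password(password):
    """Stand-in for crypt(): salted PBKDF2 hash stored as 'salt$digest'."""
    salt = os.urandom(8).hex()
    digest = hashlib.pbkdf2_hmac("sha256", password.encode(), salt.encode(), 100_000)
    return salt + "$" + digest.hex()


def set_account_info(db, info):
    """Update an existing account or create a new one; return 1 or 0."""
    try:
        with db:  # one transaction: commits on success, rolls back on error
            cur = db.execute(
                "UPDATE users SET fullname = ?, points = ? WHERE username = ?",
                (info["fullname"], info["points"], info["username"]),
            )
            if cur.rowcount == 0:
                # No row was touched, so the username is new: create the account.
                db.execute(
                    "INSERT INTO users (username, password, fullname, points)"
                    " VALUES (?, ?, ?, ?)",
                    (
                        info["username"],
                        hash_password(info["password"]),
                        info["fullname"],
                        info["points"],
                    ),
                )
        return 1
    except (sqlite3.Error, KeyError):
        return 0  # missing field or database error


def make_demo_db():
    """Build an empty in-memory users table (SQLite dialect of Listing 3)."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE users (username TEXT PRIMARY KEY, password TEXT NOT NULL,"
        " fullname TEXT NOT NULL DEFAULT '', points INTEGER DEFAULT 0)"
    )
    db.commit()
    return db
```

As in the article's version, the affected-row count from the UPDATE decides which path is taken, and an update never touches the username or password columns.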
If the lines of code for the SQL, the XML-RPC listener, and the client are counted, this middleware system weighs in at under 600 lines of code. XML-RPC is a powerful and simple message passing system that excels in the sort of application described in this article. The successor to XML-RPC is the Simple Object Access Protocol (SOAP) (see Resources), which extends some of the functionality seen here. SOAP will be my focus for the next series of articles. For more information about XML-RPC, have a look at the homepage (see Resources).
Resources
- Check out Joe's first article dealing with Web services, XML-RPC, and Perl: Getting Started with XML-RPC in Perl.
- Get a complete overview of XML-RPC by visiting the XML-RPC homepage.
- Review the documentation for PHP.
- Visit the MySQL homepage for more complete information.
- Learn more about SOAP by reading the SOAP spec.
By day, Joe Johnston (jjohn@cs.umb.edu) is a programmer for O'Reilly Labs, a new department at O'Reilly and Associates. Whenever his cat isn't sitting on his keyboard, he writes articles for The Perl Journal, use.perl.org, www.perl.com, and the O'Reilly Network. Along with Michael Lord, he created the humorous UFO folklore site, Aliens, Aliens, Aliens.





