© Copyright International Business Machines Corporation 2002. All rights reserved.
Web services is the latest darling of the computing world, promising to solve all of our computing problems, including platform differences and interoperability with legacy systems. Almost every article mentions that Web services can work across a variety of protocols, but they discuss only SOAP/HTTP in any detail. HTTP is easy to understand and therefore has been most often defined and implemented, but it's clearly not the only choice for Web services.
This series of articles will discuss making Web services available via an implementation of SOAP over Simple Mail Transfer Protocol (SMTP), with the Web service hosted within WebSphere® Application Server. Different approaches are shown, to support different Qualities of Service and exploit existing infrastructures. The articles will not discuss the creation of Web services, but rather their use of SOAP over SMTP bindings for their interfaces instead of the more common SOAP over HTTP/HTTPS.
This series assumes that you're already familiar with SOAP and Web services technologies in general, so we'll spare you the familiar artwork of the triangle showing the Service-Oriented Architecture. Some familiarity with HTTP and SMTP will also help, but is not required.
The examples in these articles use IBM WebSphere Studio Application Developer, using the built-in WebSphere Application Server, but any J2EE-compliant application server (including Tomcat) should work just fine.
There are several reasons why SOAP over HTTP is so common:
- HTTP is ubiquitous -- it's everywhere.
- HTTP is firewall-friendly, using only a couple of well-known ports, and firewalls are almost always configured to pass this protocol.
- HTTP is easily secured, with encryption via Secure Sockets Layer (SSL) for HTTPS, and various certificate types for authentication.
Some of these reasons also apply to SMTP. E-mail is as common as Web browsing -- many of us have multiple e-mail accounts that we check every day. SMTP uses a well-known port, so it's easy to set up a firewall to pass it, and almost every firewall is configured to do so. Encryption is not as common, but messages can still easily be encrypted and digitally signed using PGP or other methods.
What's more, SMTP is asynchronous. The caller can send the request via e-mail, and if the destination server is down, any intermediate servers will retry several times in order to ensure delivery of the e-mail. HTTP, on the other hand, will fail if the target server is unavailable at the time of the request.
SMTP offers two ways to get the e-mails:
- Implement an SMTP server, implementing support for the base protocol yourself.
- Use one of the existing "post-office" protocols, where the e-mail is stored on a server and retrieved by some process at its convenience.
You're no doubt familiar with both of these methods, because your e-mail program uses both. Outgoing mail is configured to use an SMTP server, directly sending the message to a server that understands the protocol. Incoming e-mail is typically handled by one of the post-office protocols, either Post Office Protocol (POP), or Internet Message Access Protocol (IMAP), where the e-mail stays on your ISP's server until you retrieve it while connected to the Internet.
Either SMTP or a post-office protocol can be used for a Web service dispatcher interface, receiving requests and passing them on to the target Web service. An implementation of both will be shown, but first let's discuss the strengths and weaknesses of each one. For simplicity, we'll select POP as our intermediate-server protocol, since it's more common and simpler than IMAP.
There are other SOAP over SMTP implementations, such as the one that's in Apache SOAP. However, this approach takes an incoming SMTP request, calls a servlet to receive the request over HTTP, and passes it on to the service's implementation. This article series will implement straight SMTP, without doing a protocol switch that requires another network hop (even if on the same machine) in order to pass the request to the target service.
A direct implementation of the SMTP protocol requires us to set up a listener on the standard SMTP port (25), and then execute the protocol-specified interactions when a caller connects to that port. If the caller connects directly to our server instead of passing through intermediate machines, the Web service client can determine that the message has been received -- otherwise it gets a communications error and can take immediate action on the failure. However, if the server is down, the power of a purely asynchronous request is lost.
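The protocol-specified interactions mentioned above can be sketched roughly as follows. This is a minimal illustration in current Java, not the implementation this series builds: the class names `ReceivedMail`, `SmtpSession`, and `SmtpListener` are hypothetical, only a handful of SMTP commands are recognized, and port 2525 is an arbitrary stand-in for port 25, which usually requires elevated privileges to bind.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.Reader;
import java.io.Writer;
import java.net.ServerSocket;
import java.net.Socket;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

/** Hypothetical holder for one received e-mail: envelope recipients plus raw message text. */
class ReceivedMail {
    final List<String> recipients = new ArrayList<>();
    final StringBuilder data = new StringBuilder();
}

/** Hypothetical, deliberately minimal server side of one SMTP conversation. */
class SmtpSession {
    /** Runs the dialogue until QUIT or end of stream; returns the mail, or null if none completed. */
    static ReceivedMail run(Reader in, Writer out) throws IOException {
        BufferedReader reader = new BufferedReader(in);
        ReceivedMail mail = new ReceivedMail();
        boolean complete = false;
        reply(out, "220 localhost SOAP/SMTP sketch ready");
        String line;
        while ((line = reader.readLine()) != null) {
            String cmd = line.toUpperCase(Locale.ROOT);
            if (cmd.startsWith("HELO") || cmd.startsWith("EHLO")) {
                reply(out, "250 localhost");
            } else if (cmd.startsWith("MAIL FROM:")) {
                reply(out, "250 OK");
            } else if (cmd.startsWith("RCPT TO:")) {
                mail.recipients.add(line.substring("RCPT TO:".length()).trim());
                reply(out, "250 OK");
            } else if (cmd.equals("DATA")) {
                reply(out, "354 End data with <CR><LF>.<CR><LF>");
                // Read message lines until the terminating line holding a single dot.
                while ((line = reader.readLine()) != null && !line.equals(".")) {
                    // Undo dot-stuffing: the sender prefixes lines that start with "." with an extra "."
                    mail.data.append(line.startsWith(".") ? line.substring(1) : line).append("\r\n");
                }
                complete = (line != null);
                reply(out, complete ? "250 OK" : "451 Message incomplete");
            } else if (cmd.equals("QUIT")) {
                reply(out, "221 Bye");
                break;
            } else {
                reply(out, "502 Command not implemented");
            }
        }
        return complete ? mail : null;
    }

    private static void reply(Writer out, String text) throws IOException {
        out.write(text + "\r\n");
        out.flush();
    }
}

/** Hypothetical listener: accepts one connection at a time and runs a session for it. */
class SmtpListener {
    public static void main(String[] args) throws IOException {
        try (ServerSocket server = new ServerSocket(2525)) {
            while (true) {
                try (Socket socket = server.accept()) {
                    ReceivedMail mail = SmtpSession.run(
                        new InputStreamReader(socket.getInputStream(), StandardCharsets.US_ASCII),
                        new OutputStreamWriter(socket.getOutputStream(), StandardCharsets.US_ASCII));
                    System.out.println(mail == null ? "no message" : "received for " + mail.recipients);
                }
            }
        }
    }
}
```

Because the caller sees each reply code as the conversation proceeds, a directly connected client knows whether the message was accepted, which is the immediate feedback described above.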
Scalability is easy, as the mechanisms for spreading e-mail loads across multiple servers are well understood. As long as multiple servers can handle the requests, any decent routing that will spread the connections across multiple instances of the server will be successful, providing both workload management and failover.
Since the SMTP implementation is listening for any client connecting on its port, any e-mail sent to the server will be handled, regardless of the target user (think of the To: field in an e-mail). Therefore the dispatcher routine can easily route requests for various users.
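Routing by target user might look something like the sketch below. The `RecipientRouter` class and its methods are hypothetical names chosen for illustration, the targets are plain callbacks rather than real Web services, and lowercasing the whole address is a simplification (strictly, only the domain part of an address is case-insensitive).

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.function.Consumer;

/** Hypothetical dispatcher that picks a target based on the recipient address. */
class RecipientRouter {
    private final Map<String, Consumer<String>> routes = new HashMap<>();
    private final Consumer<String> fallback;

    /** The fallback receives requests addressed to users with no registered target. */
    RecipientRouter(Consumer<String> fallback) {
        this.fallback = fallback;
    }

    void register(String address, Consumer<String> target) {
        routes.put(normalize(address), target);
    }

    /** Hands the SOAP payload to the target registered for the recipient, else to the fallback. */
    void dispatch(String recipient, String soapPayload) {
        routes.getOrDefault(normalize(recipient), fallback).accept(soapPayload);
    }

    /** Strips optional angle brackets and lowercases the address (a simplification). */
    private static String normalize(String address) {
        String a = address.trim();
        if (a.startsWith("<") && a.endsWith(">")) {
            a = a.substring(1, a.length() - 1);
        }
        return a.toLowerCase(Locale.ROOT);
    }
}
```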
POP holds the e-mails on the server until they're requested. The client program must explicitly delete them. Your e-mail program is probably configured to delete the e-mails after they've been retrieved, but this is a separate request to the server according to the protocol. The dispatcher will periodically poll the POP server to retrieve any e-mails, pass the requests to the target Web service, and then delete them when they've been processed. This process is purely asynchronous, with no direct mechanism to tell the caller that the e-mail has been received or a communications failure has occurred.
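The poll, process, and delete cycle can be sketched as below. To keep the example free of external libraries, the hypothetical `Mailbox` interface stands in for the connection to the POP server; `PopPoller` and its method names are likewise assumptions made for illustration, and the target Web service is reduced to a callback.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;

/** Hypothetical stand-in for a POP mailbox: list what is waiting, delete by id. */
interface Mailbox {
    Map<String, String> waiting();   // message id -> raw message text
    void delete(String messageId);
}

/** Hypothetical dispatcher that polls a mailbox and forwards each message to a service. */
class PopPoller {
    private final Mailbox mailbox;
    private final Consumer<String> service;

    PopPoller(Mailbox mailbox, Consumer<String> service) {
        this.mailbox = mailbox;
        this.service = service;
    }

    /** One polling pass; returns the number of messages processed and deleted. */
    int pollOnce() {
        int processed = 0;
        // Work on a copy so that deleting does not disturb the iteration.
        Map<String, String> snapshot = new LinkedHashMap<>(mailbox.waiting());
        for (Map.Entry<String, String> entry : snapshot.entrySet()) {
            try {
                service.accept(entry.getValue());
                mailbox.delete(entry.getKey());   // delete only after successful processing
                processed++;
            } catch (RuntimeException e) {
                // Leave the message on the server so that a later pass can retry it.
            }
        }
        return processed;
    }

    /** Polls repeatedly on a daemon thread, waiting periodSeconds between passes. */
    ScheduledExecutorService start(long periodSeconds) {
        ScheduledExecutorService timer = Executors.newSingleThreadScheduledExecutor(r -> {
            Thread t = new Thread(r, "pop-poller");
            t.setDaemon(true);
            return t;
        });
        timer.scheduleWithFixedDelay(this::pollOnce, 0, periodSeconds, TimeUnit.SECONDS);
        return timer;
    }
}
```

Deleting only after the service call succeeds means a failed request stays on the server and is retried on the next pass, at the cost of possibly processing a message twice if the delete itself fails.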
Scalability is more complicated than with a pure SMTP implementation, because there must be some coordination between the dispatcher servers in order to avoid multiple servicing of requests.
The POP protocol requires a specific user ID and password in order to connect and retrieve e-mails. If you want to handle requests for a number of different user IDs, you need to poll the POP server for multiple user IDs. This is especially important if the server handles not only Web service requests, but also the response messages on behalf of clients, or makes different routing decisions based on the target user ID.
As always, there are a number of considerations when implementing such a framework. First, it's important to maximize the commonality between our first two implementations (POP and raw SMTP). Doing similar things in more than one place is a bad practice. Second, you should be as flexible as possible, allowing deployment of multiple objects that can act on a request, and perhaps doing authentication or some other checking before passing the request on to the actual Web service implementation. Another aspect of flexibility is to allow easy extensibility, such as adding support for another protocol such as IMAP.
You also want to leverage other implementations as much as possible. For this reason, this article series will implement the code to run within a Java™ application server's Web container. For one thing, you may well be supporting SMTP in addition to HTTPS, and therefore putting both implementations into the same run time makes sense. Using an application server also makes scaling, configuration, deployment, and run-time support easier, since you can leverage the existing infrastructure instead of having to invent your own.
Part of this leveraging is the use of JavaMail in order to streamline interactions with SMTP. JavaMail is part of J2EE, so it will be automatically included in any J2EE-compliant application server.
As with any code, ensuring that it's bug-free is critical, so it's important to design in testability, including implementing several JUnit classes to support the automated testing of classes.
You'll also want the dispatcher to process requests as fast as possible, not waiting for a request to be completed before handling another, so a new thread will be launched for each request. Again for simplicity, you can just create a new thread, instead of using a ThreadPool to minimize overhead as you would in an industrial-strength implementation.
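The thread-per-request approach described above can be illustrated with a few lines of current Java. `ThreadPerRequestDispatcher` is a hypothetical name, and the target service is again reduced to a callback.

```java
import java.util.function.Consumer;

/** Hypothetical dispatcher that starts one new thread for every incoming request. */
class ThreadPerRequestDispatcher {
    private final Consumer<String> service;

    ThreadPerRequestDispatcher(Consumer<String> service) {
        this.service = service;
    }

    /** Starts a worker thread for the request and returns at once, without waiting for it. */
    Thread dispatch(String request) {
        Thread worker = new Thread(() -> service.accept(request), "soap-request");
        worker.start();
        return worker;
    }
}
```

The returned `Thread` lets a caller or a test wait for completion with `join()`. A more robust version would typically hand requests to a bounded pool such as a `java.util.concurrent.ExecutorService` instead of creating threads without limit.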
Interestingly, there are two choices for a SOAP provider, and they're both part of the Apache project!
The most common is Apache SOAP, which supplies a servlet (RPCRouter) to receive incoming SOAP/HTTP requests, decode them as necessary, and pass them to the target Java objects. This is the original implementation, but unfortunately it suffers from one major problem: it uses DOM to perform the XML processing, and so it's slow. Processing the XML used in SOAP can be the slowest and most processing-intensive part of a Web services request. Because of this and several other reasons, another initiative was started and has become part of the Apache project -- Axis. It uses SAX parsing for the SOAP message, which gives a simpler and faster interface to the XML. Work on the Axis project is ongoing, and it's shipping beta levels of code.
This project uses Axis because it's faster. Theoretically, performance is not as crucial here since we're doing asynchronous processing of the requests. However, that doesn't mean that we can afford to be slow or absorb the extra processing demands of the DOM parser -- we could still drive the server to its knees just parsing the XML, not to mention the processing demands of the Web services themselves.
The high-level design uses load-at-startup servlets. These servlets launch daemon threads that either bind to port 25 and wait for incoming e-mails using raw SMTP, or else periodically poll a POP server for waiting e-mails. Figure 1 shows an object interaction diagram of the POP servlet, polling daemon, processor, and handlers:
Figure 1. Object interaction diagram

The daemon threads will do different things depending on which protocol they're handling. For raw SMTP, the thread will get a connection request, then hand the socket to a Processor class to handle the SMTP protocol and retrieve the e-mail. For the POP implementation, the thread will periodically poll the POP server, and hand the individual e-mails to the POP-specific Processor class.
Our Processor classes implement a MailProcessor interface, and can extend an AbstractProcessor class, which contains the functions common to whichever e-mail protocol we use. These classes get the e-mails and pass them to a chain of Handler objects.
Chains of Handler objects can be configured to perform separate tasks like logging, passing the service requests on to the SOAP engine, or handling responses. Our Handler classes can extend AbstractHandler, again containing common functions.
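The relationship between processors and handler chains might be sketched as follows. The type names `MailProcessor`, `AbstractProcessor`, `Handler`, and `AbstractHandler` come from the design above, but every method signature here is an assumption, as are the concrete `LoggingHandler`, `SoapHandler`, and `PopProcessor` classes; the actual interfaces in the implementation may well differ.

```java
import java.util.ArrayList;
import java.util.List;

/** Assumed shape of a handler: returns false to stop the rest of the chain. */
interface Handler {
    boolean handle(String mail);
}

/** Common handler functions; the helper shown is invented for illustration. */
abstract class AbstractHandler implements Handler {
    protected boolean looksLikeSoap(String mail) {
        return mail.contains("Envelope");
    }
}

/** Hypothetical handler that records every message and lets the chain continue. */
class LoggingHandler extends AbstractHandler {
    final List<String> log = new ArrayList<>();

    public boolean handle(String mail) {
        log.add("received " + mail.length() + " characters");
        return true;
    }
}

/** Hypothetical handler that forwards SOAP messages and stops the chain for anything else. */
class SoapHandler extends AbstractHandler {
    final List<String> forwarded = new ArrayList<>();

    public boolean handle(String mail) {
        if (!looksLikeSoap(mail)) {
            return false;
        }
        forwarded.add(mail);   // a real handler would invoke the SOAP engine here
        return true;
    }
}

/** Assumed shape of the processor interface. */
interface MailProcessor {
    void process(String mail);
}

/** Protocol-independent part: run each e-mail through the configured chain in order. */
abstract class AbstractProcessor implements MailProcessor {
    private final List<Handler> chain;

    protected AbstractProcessor(List<Handler> chain) {
        this.chain = chain;
    }

    public void process(String mail) {
        for (Handler handler : chain) {
            if (!handler.handle(mail)) {
                break;
            }
        }
    }
}

/** Hypothetical POP-specific processor; retrieval from the POP server is omitted. */
class PopProcessor extends AbstractProcessor {
    PopProcessor(List<Handler> chain) {
        super(chain);
    }
}
```

Keeping the chain traversal in `AbstractProcessor` is one way to get the commonality between the POP and raw SMTP implementations that the design calls for: a protocol-specific subclass only has to obtain the e-mail.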
This article has covered a lot of the background of our SOAP/SMTP implementation, including the benefits of SOAP/SMTP, the pros and cons of various approaches, and the high-level design. The next article will discuss the actual implementation, including getting it up and running in the WebSphere Studio IDE.

Ken Hygh is a Senior Consultant with IBM Services for WebSphere. He has worked with many large customers helping them design, develop, and deploy J2EE applications on WebSphere. He recently co-authored Performance Analysis for Java Web Sites. You can reach Ken at khygh@us.ibm.com.