The Python Web services developer: Messaging technologies compared

Choose the best tool for the task at hand

Mike Olson, Principal Consultant, Fourthought, Inc.
Mike Olson is a consultant and co-founder of Fourthought Inc., a software vendor and consultancy specializing in XML solutions for enterprise knowledge management applications. Fourthought develops 4Suite, and 4Suite Server, open source platforms for XML middleware. You can contact Mr. Olson at mike.olson@fourthought.com.
Uche Ogbuji (uche@ogbuji.net), Principal Consultant, Fourthought, Inc.
Uche Ogbuji is a consultant and co-founder of Fourthought Inc., a software vendor and consultancy specializing in XML solutions for enterprise knowledge management applications. Fourthought develops 4Suite, and 4Suite Server, open source platforms for XML middleware. Mr. Ogbuji is a Computer Engineer and writer born in Nigeria, living and working in Boulder, Colorado, USA. You can contact Mr. Ogbuji at uche@ogbuji.net.

Summary:  Choosing between technologies always involves trade-offs; often you sacrifice performance to gain ease of programming. Perhaps the realm of most interest to Web services developers is messaging technology. How can you balance speedy performance with human readability? Mike Olson and Uche Ogbuji don't claim to have the answer to this question, but they do offer some hard data to help you make the decision that best fits your needs. In this article, they help you compare some of the different messaging protocols available. You will write a simple application for each protocol and compare various measurements of speed, message overhead, and relative development time.

Date:  03 Jul 2002
Level:  Introductory

There are a number of messaging technologies available to Web services developers; some offer high performance, others ease of use and human readability. In this installment of The Python Web Services Developer, you will look at four different messaging technologies and write a simple benchmark application for each. When you are done, you will gather some basic statistics from each application for a simple comparison between the technologies. The four technologies that you will examine in this column are SOAP, CORBA, XML-RPC, and good, old-fashioned low-level sockets.

In this article, we will demonstrate a simple client-server application that sends three different messages. The first message is in the form of a string sent from client to server; from this, you can gather information on timing and message overhead. The second message that you create is the opposite of the first: you receive a string from the server. Lastly, you will send an integer to the server. This will allow us to gather timing information as well as information on message overhead for binary-based messages.

Be careful not to draw too broad a conclusion from the analysis presented in this article. The four technologies that we'll examine, while all suited for a message-based client-server application, differ greatly in their strengths and weaknesses. To choose a messaging technology that's right for you, you need to do more than just compare messaging overhead directly. As we look at each technology in more detail, we will attempt to uncover some of their important differences and allow developers to make a more complete choice as to the technology to use.

Comparison methodology

To gather the timing information, you will simply use the time.time() function call available with Python. As you review the statistical data generated, keep in mind that time.time()'s accuracy is platform dependent and may in some cases have no better than one second of precision.
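A minimal sketch of this kind of measurement follows. The helper name time_call and the stand-in workload are illustrative assumptions, not the code from the example archive.

```python
import time


def time_call(func, *args, repeat=1):
    """Return the elapsed wall-clock seconds for `repeat` calls of func(*args)."""
    start = time.time()
    for _ in range(repeat):
        func(*args)
    return time.time() - start


if __name__ == "__main__":
    # Stand-in workload; in a benchmark this would be a send or receive call.
    elapsed = time_call(sum, range(100000), repeat=10)
    print("elapsed: %.6f seconds" % elapsed)
```

On current Python versions, time.perf_counter() is intended for interval timing and could be swapped in for time.time() if the platform's clock resolution is a concern.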

As you execute an application with each different technology, you will want a standard way to see how much information is actually sent over your TCP connection. To do this, you will use a tool called tcpdump (see the Resources section below for a link). The process for downloading and installing tcpdump will vary based on your OS, so we will not cover that here. The documentation on the site is decent, and at any rate, you won't need tcpdump to actually run the example code, just to test for message overhead by gathering the messages sent to a specific TCP connection.

When you are testing the applications, you will use the following options on tcpdump to siphon the messages off into a temporary file:

tcpdump -i lo -ae -s 0 -w /tmp/packets.txt

There are many other options that you can use (you could filter by port or hostname, for example), but for this simple case these will suffice. The -i lo option tells tcpdump which interface to listen on (the loopback device, in this case). The -a option attempts to convert network and broadcast addresses to names. The -e option includes link-level header information in the output. The -s 0 option tells tcpdump that you want all the data in every packet. (If you want tcpdump to keep only a set number of bytes per packet, enter that number in place of zero.) The last option, -w, tells tcpdump to write the raw packets it captures to a file called /tmp/packets.txt. Because the packets are written raw, the file is binary despite its .txt extension; you can read it back with tcpdump -r. Note that the command line above is for a Linux machine, and that you may need to run tcpdump with root or administrator permissions.
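One possible way to turn such a capture into a byte count is sketched below. It assumes the file is in the classic pcap format (a 24-byte file header followed by 16-byte record headers), and the function name capture_totals is made up for illustration. The article does not say how the sizes in its results were derived, so treat this as one approach rather than the authors' method. The total counts whole frames, including link-level, IP, and TCP headers.

```python
import struct

PCAP_FILE_HEADER = 24    # bytes in the classic pcap file header
PCAP_RECORD_HEADER = 16  # bytes in each per-packet record header


def capture_totals(stream):
    """Return (packet_count, total_bytes_on_wire) for a classic pcap stream."""
    header = stream.read(PCAP_FILE_HEADER)
    if len(header) < PCAP_FILE_HEADER:
        raise ValueError("not a pcap file: header too short")
    magic = header[:4]
    if magic in (b"\xd4\xc3\xb2\xa1", b"\x4d\x3c\xb2\xa1"):
        order = "<"  # file was written by a little-endian machine
    elif magic in (b"\xa1\xb2\xc3\xd4", b"\xa1\xb2\x3c\x4d"):
        order = ">"  # file was written by a big-endian machine
    else:
        raise ValueError("unrecognized pcap magic number")
    count = total = 0
    while True:
        record = stream.read(PCAP_RECORD_HEADER)
        if len(record) < PCAP_RECORD_HEADER:
            break
        # Fields: timestamp seconds, timestamp fraction, captured length,
        # original length of the packet on the wire.
        _sec, _frac, captured, original = struct.unpack(order + "IIII", record)
        stream.read(captured)  # skip over the packet data itself
        count += 1
        total += original
    return count, total
```

With a capture in place, usage might look like this: open("/tmp/packets.txt", "rb") as a file object, pass it to capture_totals, and print the resulting pair.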


Preparing your system

There are a few third-party packages you need to have running on your system before you can run the test suite. We'll cover each briefly; if you already have these packages installed, you can skip the following sections.

Installing omniORB

The CORBA ORB that you are going to use is called omniORB (see Resources for a link where you can download it). For this example you'll use omniORB version 3.04 and omniORBPy version 1.4. For non-Unix platforms and some versions of Unix there are binaries available. If you are building from source, then the appropriate README file will help; the basic steps follow. Note that if you are installing from binaries then you will still need to update your PYTHONPATH and PATH environment variables as shown in steps 7 and 8.

  1. Download the source code and untar it
  2. Uncomment your platform in omni/config/config.mk
  3. cd omni/src && make export
  4. Download omniORBPy (from the same site; see Resources)
  5. Untar it in the omni/src/lib directory
  6. cd omni/src/lib/omniORBPy && make export
  7. Update your PYTHONPATH environment variable to include $OMNI_HOME/lib/python and $OMNI_HOME/lib/i586_linux_2.0_glibc2.1
  8. Update your PATH to include $OMNI_HOME/bin/i586_linux_2.0_glibc2.1

With omniORB installed, you need to generate the stub and skeleton files for the example. Download and unarchive the example from here. Then change into the src/omni directory and execute:

omniidl -bpython server.idl

Installing XML-RPC

If you are using Python version 2.2 or later, which includes xmlrpclib in the standard library, then you can skip this section. However, if you are using an earlier version of Python, or you just want to get the latest xmlrpclib, then you should download that package; see Resources for a link. Installation is simple: just create a directory in a location on your Python path and unzip the archive into it. Don't forget to add an __init__.py file to the new directory so that it can be imported as a Python package.

[molson@penny python]$ mkdir XmlRpc
[molson@penny python]$ cd XmlRpc/
[molson@penny XmlRpc]$ unzip ~/downloads/xmlrpc-0.9.8-990621.zip
Archive:  /home/molson/downloads/xmlrpc-0.9.8-990621.zip
  inflating: README
  inflating: xmlrpclib.py
  inflating: xmlrpcserver.py
  inflating: xmlrpc_handler.py
[molson@penny XmlRpc]$ touch __init__.py
[molson@penny XmlRpc]$

Installing ZSI

The last third-party library that you'll need is a SOAP implementation. For your benchmarking purposes you will use ZSI; see Resources for a link to the latest version. Once you've downloaded it, unarchive it and execute python setup.py install, and you'll be ready to use it.


The benchmark applications

Now that your system is prepared, you can use your benchmark applications to test each technology in turn.

Python sockets

Let's look at raw Python sockets first. You can consider this your control, as there is no protocol overhead (except what you add yourself). That's the good part about programming raw sockets; the downside is that you'll have many other nontrivial issues that you need to worry about, such as:

  • Much more code to develop and maintain.
  • Lack of data integrity over the wire (though only CORBA provides more than TCP itself).
  • Low-level error conditions (for example, a wrong message protocol or a dropped connection).

So, let's get into the application. All of the example code for this application is available in the Resources section. After you have downloaded the archive and expanded it, you will have four subdirectories. The first one to look at is called python. One of the benefits of writing your own protocol for TCP communications is that there are no additional third-party libraries that you need to install. This can minimize distribution headaches.

In the example distribution, the file python/server.py is the server you will use to receive raw TCP requests. It is based on the standard SocketServer module that comes with Python. There are three things to note when you look at the code:

  • There is absolutely no error checking. A robust server would need to validate every byte it receives, for every message type (and for nonexistent message types as well), and the client would need to do the same with the server's replies.
  • The handler is designed to handle a single request per connection. This puts your little socket server at a disadvantage when you start sending a lot of integers really quickly, because the client has to create a new connection for every integer that it sends. You could have written this server to handle multiple requests in a single connection, but then you would have had to add start and stop message types so that the server would know when to stop listening on a connection.
  • There is no form of connection timeout. With the way that this server is configured, a malicious client could connect to the server and send one extremely large string, effectively bringing the entire server to its knees.
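To make the single-request design concrete, here is a minimal sketch of such a server written for Python 3, where the SocketServer module is named socketserver. It is not the server.py from the example archive: the one-byte message types (S, R, I), the length prefix, the reply byte, and the port number are all assumptions invented for illustration.

```python
import socket
import socketserver
import struct

STORED_STRING = b"x" * 1000  # hypothetical payload returned for "R" requests


def recv_exact(sock, count):
    """Read exactly `count` bytes, or raise if the peer closes early."""
    chunks = []
    while count > 0:
        chunk = sock.recv(count)
        if not chunk:
            raise ConnectionError("connection closed before message was complete")
        chunks.append(chunk)
        count -= len(chunk)
    return b"".join(chunks)


class OneShotHandler(socketserver.BaseRequestHandler):
    """Handle exactly one message, then let the connection close."""

    def handle(self):
        kind = recv_exact(self.request, 1)
        if kind == b"S":    # client sends a length-prefixed string
            (length,) = struct.unpack("!I", recv_exact(self.request, 4))
            recv_exact(self.request, length)
            self.request.sendall(b"K")
        elif kind == b"R":  # client asks the server for a string
            reply = struct.pack("!I", len(STORED_STRING)) + STORED_STRING
            self.request.sendall(reply)
        elif kind == b"I":  # client sends one 32-bit signed integer
            struct.unpack("!i", recv_exact(self.request, 4))
            self.request.sendall(b"K")
        # Unknown message types are ignored: this sketch does no real
        # error checking, enforces no size limit, and has no timeout.


def send_integer(address, value):
    """Open a new connection, send one integer, and wait for the reply."""
    with socket.create_connection(address) as sock:
        sock.sendall(b"I" + struct.pack("!i", value))
        return recv_exact(sock, 1)


def serve(host="127.0.0.1", port=9999):
    """Run the server until interrupted (host and port are placeholders)."""
    with socketserver.TCPServer((host, port), OneShotHandler) as server:
        server.serve_forever()
```

Because send_integer opens a fresh connection on every call, sending 5,000 integers with it means 5,000 connection setups and teardowns. The unbounded length prefix in the S branch is exactly the kind of opening a malicious client could exploit.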

To start the server, simply execute the script server.py in one window and the test client, called time-client.py, in a second. This will print out timing information to the terminal that executed the client script. When the client is done, you can press Ctrl-C to stop the server. To test the message overhead, you will use tcpdump; run it in a third window, then run the script size-client.py (with the server still running). The results of these two scripts are shown in Table 1 below.

SOAP

The second technology we'll look at, SOAP, should be the most familiar to readers of this column. SOAP is the newcomer among these technologies, however, having arrived on the scene only within the last couple of years. It (generally) uses HTTP for its transport, creating XML-based messages that are sent as the HTTP body. You can see right away that there will be some overhead in the SOAP messages, from both the HTTP and the XML. However, this should not always be considered a bad thing, as it tends to make the messages more readable to humans.
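To get a feel for where that overhead comes from, the sketch below hand-builds a SOAP 1.1 request that carries a single integer and compares its size with the same integer packed into four bytes. This is not what ZSI puts on the wire; the host, path, method name, namespace, and argument name are hypothetical, and a real toolkit will typically add type and encoding attributes that make the message larger still.

```python
import struct
from xml.sax.saxutils import escape


def soap_request(host, path, method, namespace, arg_name, value):
    """Build the raw bytes of a minimal, hand-written SOAP 1.1 HTTP request."""
    body = (
        '<?xml version="1.0" encoding="UTF-8"?>'
        '<SOAP-ENV:Envelope'
        ' xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/">'
        "<SOAP-ENV:Body>"
        '<m:%s xmlns:m="%s"><%s>%s</%s></m:%s>'
        "</SOAP-ENV:Body>"
        "</SOAP-ENV:Envelope>"
    ) % (method, namespace, arg_name, escape(str(value)), arg_name, method)
    body_bytes = body.encode("utf-8")
    headers = (
        "POST %s HTTP/1.1\r\n"
        "Host: %s\r\n"
        "Content-Type: text/xml; charset=utf-8\r\n"
        "Content-Length: %d\r\n"
        'SOAPAction: ""\r\n'
        "\r\n"
    ) % (path, host, len(body_bytes))
    return headers.encode("ascii") + body_bytes


if __name__ == "__main__":
    message = soap_request("localhost", "/bench", "sendInteger",
                           "urn:example-benchmark", "value", 42)
    packed = struct.pack("!i", 42)
    print(len(packed), "bytes as a packed 32-bit integer")
    print(len(message), "bytes as a SOAP request over HTTP")
    print(message.decode("utf-8"))
```

Printing the message also shows the other side of the trade-off: unlike the four packed bytes, the request can be read and debugged by eye.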

In the example distribution, there is a soap subdirectory. It contains the files server.py, time-client.py, and size-client.py. These are executed exactly as their counterparts in the raw socket application. Again, you'll look at the results in Table 1.

Common Object Request Broker Architecture (CORBA)

Of SOAP, CORBA, and XML-RPC, CORBA has been around the longest and has tackled some of the more difficult issues of distributed programming: concurrency, transactions, security, and authentication. SOAP has made a start on these, while XML-RPC requires that you, the programmer, handle them yourself. (Strictly speaking, CORBA itself does not provide these features; however, the protocol is designed to support them, and most ORBs come with some or all of them implemented through the CORBA Services specification.) The second benefit of CORBA (over SOAP or XML-RPC) is that its message format is binary, though this is in some ways a hindrance as well. It is a benefit because CORBA messages are much smaller than SOAP or XML-RPC messages and carry much less overhead. The downside is that if you ever need to debug a CORBA message, you're in trouble: it is not human-readable, and only software that speaks GIOP can understand it.

In the example distribution there is a CORBA subdirectory. Execute the scripts here just as you executed their counterparts in the previous two examples. The results have been tabulated in Table 1 below.

XML Remote Procedure Call (XML-RPC)

Lastly, let's look at the XML-RPC protocol. Many people have commented to us that XML-RPC can fill most of the roles that SOAP does. However, we think it falls short in several aspects, including Unicode support and flexibility of data typing. XML-RPC is a much simpler protocol than SOAP, which makes it easier to use; however, SOAP is working towards solving many of the issues that CORBA has already solved (concurrency, transactions, etc.) while XML-RPC seems to be content where it is at the moment, as most XML-RPC developers are now working on SOAP.

In the example distribution there is an xml-rpc subdirectory; the examples are run as before, and the results are tabulated in the next section.
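For a quick look at what an XML-RPC call looks like on the wire, the sketch below serializes a request with the standard library's marshaller and compares its size with a packed binary integer. It uses the Python 3 module name, xmlrpc.client (the library was called xmlrpclib in Python 2). The method name send_integer is a hypothetical stand-in, not a name taken from the example archive.

```python
import struct
import xmlrpc.client


def xmlrpc_payload(method, *params):
    """Return the XML-RPC request body for method(*params) as UTF-8 bytes."""
    return xmlrpc.client.dumps(params, methodname=method).encode("utf-8")


binary = struct.pack("!i", 42)
encoded = xmlrpc_payload("send_integer", 42)

print(len(binary), "bytes as a packed 32-bit integer")
print(len(encoded), "bytes as an XML-RPC request body")
print(encoded.decode("utf-8"))
```

The figure printed here covers only the XML body; the HTTP request line and headers that carry it add further bytes on top.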


The results

The data from your tests of all four messaging technologies are summarized in Table 1.

Table 1. Results of the benchmarking applications

Technology | Connect time (s) | Send string, 21,000 characters (s) | Receive string, 22,000 characters (s) | Send 5,000 integers (s) | Client LOC | Server LOC | Actual message size sending 1,000 characters | Actual message size sending 100 integers
Raw sockets | 0.002242 | 0.001377 | 0.001359 | 6.740674 | 57 | 25 | 2,279 | 85,863
CORBA | 0.000734 | 0.004601 | 0.002188 | 1.523799 | 37 | 18 | 2,090 | 27,181
XML-RPC | 0.007040 | 0.082755 | 0.050199 | 100.337219 | 29 | 17 | 4,026 | 324,989
SOAP | 0.000610 | 0.294198 | 0.279341 | 1,324.296742 | 32 | 10 | 4,705 | 380,288

Wow! Human readability sure does come at a steep price. In the integer test, the SOAP and XML-RPC messages are roughly 14 and 12 times as large, respectively, as the binary CORBA messages. This is not to say that large messages are horrible in today's world, where T1s connect most businesses and 256 Kbps DSL connections connect many households; however, as you can see from the test where you sent 5,000 integers to the server, SOAP and XML-RPC took roughly 869 and 66 times longer than CORBA on the same machine, respectively.
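The ratios can be recomputed directly from Table 1. The short script below does that arithmetic, using the values as read from the table; the variable and function names are our own.

```python
# Values as read from Table 1.
send_5000_integers = {  # elapsed time for the 5,000-integer test
    "Raw sockets": 6.740674,
    "CORBA": 1.523799,
    "XML-RPC": 100.337219,
    "SOAP": 1324.296742,
}
size_100_integers = {  # actual message size when sending 100 integers
    "Raw sockets": 85863,
    "CORBA": 27181,
    "XML-RPC": 324989,
    "SOAP": 380288,
}


def ratios_to(baseline, values):
    """Express every entry as a multiple of the baseline entry."""
    base = values[baseline]
    return {name: value / base for name, value in values.items()}


time_ratios = ratios_to("CORBA", send_5000_integers)
size_ratios = ratios_to("CORBA", size_100_integers)

for name in send_5000_integers:
    print("%-12s time x%8.1f   size x%5.1f"
          % (name, time_ratios[name], size_ratios[name]))
```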

One statistic that initially came as a surprise is that the CORBA server had smaller messages and was faster than the raw sockets implementation. The reason for this (as mentioned earlier) is that with raw sockets you need to establish a new connection for every request that you make to the server. Opening a TCP connection requires a three-way handshake (SYN, SYN-ACK, ACK) before any data is sent, and closing it requires a further exchange of segments. When you do this 5,000 times, these little messages add up. CORBA (and most current HTTP servers) get around this by reusing connections (connection pooling).

More than just numbers

There are other considerations to take into account that are not easily quantified in a statistical analysis. Each messaging technology has different attributes associated with it that make it more appealing in some situations and less in others. Take raw sockets, for instance. In general, raw sockets require the programmer to do the most work in creation, debugging, and maintenance of the communication code. However, if the application that you are creating is very sensitive to overhead, or if you are reluctant to leave the implementation of the communications layer in the hands of a third party (by using a messaging technology library), then the amount of work required to create and maintain a custom communications layer becomes less and less of an obstacle.

Although CORBA is the fastest of the protocols that you looked at in this column, it also has the steepest learning curve, and will most likely have the largest memory footprint (depending on the CORBA implementation that you use, of course). Also, even though there are many open source implementations of the CORBA specification, to get many of CORBA's must-have features, like CosTransactions, CosSecurity, and CosConcurrency, you will either have to implement them yourself, or purchase a commercial CORBA implementation; the latter will likely cost tens or hundreds of thousands of dollars.

XML-RPC and SOAP fall roughly into the same category. Both feature a great amount of message overhead, and their performance is quite a bit slower than CORBA's; however, for many applications you really don't care about these factors. With the cost of bandwidth, processors, memory, and disk space dropping daily, it is pretty easy to justify spending an additional $5,000 to $10,000 on additional equipment to compensate for any drop in performance. In the process, you would spend less than you would on a commercial ORB, but still have the benefit of human-readable messages. One last thing to note about SOAP and XML-RPC is that they are just based on plain XML. This means that if you support an in/outbound XML-based message structure in your application, no matter where the technology goes in the future, you will have a much easier time of moving your communications infrastructure to another format, as XML can be transformed into just about anything with a bit of XSLT.


Coming soon

In the next installment of this column we will look in more detail at the XML-RPC library that comes standard in Python 2.2. We will write a simple client and server application to show you how to use the library, and compare some of the features of XML-RPC with similar features in SOAP.


Resources
