Level: Intermediate
Martin Brown (questions@mcslp.com), Freelance writer and consultant
01 Sep 2003
Two of the hottest technologies at the moment are Grid computing and Web services, but are the two compatible? In this article, Martin C. Brown looks at how the two systems are actually very compatible and describes the benefits of using Web services in grid applications.
To determine whether Grid computing and Web services are compatible with each other, we need to look at how Grid computing works and whether we can really resolve a typical grid system down into a number of relatively discrete units. The Grid computing architecture relies on fairly basic principles, sending simple requests between clients and servers. Web services rely on processing simple requests from a client to a server.
Just in case you don't see how this can fit into your existing grid structure, this article looks at the two most common grid systems: the request and dispatch architectures. Request systems rely on clients to ask for work, while dispatch systems rely on the broker to supply the clients directly with work. The two systems have different issues when used with Web services, which will be examined as well.
Grid communications
Grid computing has two main component types -- the server and the client. The server distributes work requests and holds information about the individual work units that make up the work as a whole. The client (or, more typically, the clients) processes the individual work units.
The two communicate with each other in a number of different ways, but at the heart of the system is the distribution of work. Again, the system works in one of two ways -- either the clients manage their own work flow and request new work units from the server, or the server distributes work units to the client.
Communication doesn't stop there either; often there will be additional servers and services used to support the grid server infrastructure, and these need to talk to each other and exchange information.
The key point is that often with a grid solution we are exchanging fairly discrete pieces of information. Between the client and the server, the information will be the original work unit and the processed response. Even in a situation where the data load is particularly high, as in data processing or video rendering, we're still exchanging packets of information rather than requiring full-blown, two-way, permanent communication between the client and server elements.
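To make the idea of discrete packets concrete, here is a minimal sketch of the two pieces of information that travel between server and client. The names `WorkUnit`, `WorkResult`, and `process`, and the summing "computation", are assumptions made for illustration only; a real grid would carry whatever payload its application needs.

```python
from dataclasses import dataclass, asdict


@dataclass
class WorkUnit:
    """One self-contained packet of work sent from the server to a client."""
    unit_id: int
    payload: list


@dataclass
class WorkResult:
    """The processed response sent back from the client to the server."""
    unit_id: int
    result: float


def process(unit: WorkUnit) -> WorkResult:
    # Stand-in computation (hypothetical): sum the numbers in the payload.
    return WorkResult(unit_id=unit.unit_id, result=float(sum(unit.payload)))


unit = WorkUnit(unit_id=1, payload=[1, 2, 3])
reply = process(unit)
print(asdict(unit))   # the packet going out
print(asdict(reply))  # the packet coming back
```

Neither packet needs a permanent connection: each can be sent, processed, and answered as an independent exchange.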
The more radical grid ideas, like the new WebSphere extensions, which allow Web requests to a WebSphere application to be distributed across a grid of WebSphere servers, are another example where both the grid management and the actual work distribution can be handled through fairly simple data exchange.
There are exceptions to the rule, of course. Not all grid systems rely on such a straightforward exchange of simple packets. Resource grids, for example, often rely on fairly heavy intercommunication between grid providers (the clients) to enable requests for storage to be made in real time across the grid. In these situations though, even when communicating directly between clients, it's still a basic exchange of information.
So if we're just exchanging information, surely there must be a standard method to communicate between the servers and the clients. This is where Web services come into the picture.
Web services overview
Before we can understand how Web services can be used to provide a backbone for our grid solutions, we need to understand how Web services work. The simplest way of thinking about them is as a remote procedure call (RPC) -- a way of calling a function from one computer (the client) where the code and the actual function are executed on another (the server).
RPCs of one type or another are nothing new. There have been different implementations available on a wide variety of platforms for some time. Perhaps the best known is the RPC implementation that was available on UNIX machines. This implementation used a complicated suite of functions that enabled information to be exchanged between the client and server, converting a basic C structure into a standardized format that could be transmitted over a network -- the External Data Representation (XDR) format. This handled the serialization of data and the normalization of the data into a format that could later be decoded by another client or server in the RPC architecture.
More recently, the explosion in the Web has meant that we are naturally, every time we visit a Web site, performing a remote procedure call. Our client, a browser, requests a file from a server -- Apache, IIS, and the like -- and then processes and displays the information. This is a simple data exchange. With dynamic elements like Common Gateway Interface (CGI), Java Server Pages (JSP), or Active Server Pages (ASP), we really are calling a remote procedure. The exchange format takes the form of an HTTP request and an HTML response, but the effect is the same: we call a procedure on a remote machine and get a response.
By standardizing that exchange of information in some way we arrive at Web services. The request and the response are now encoded in XML. There are two derivatives of basically the same technology: XML-RPC is designed to do exactly what the acronym suggests -- send and receive an XML formatted remote procedure call. Simple Object Access Protocol (SOAP) is more advanced. At its heart, SOAP is still an RPC technology, but it has been enhanced to allow the remote manipulation of an object. That makes it more than just a one-hit wonder in the RPC stakes as we can create an object, manipulate it, and use it to exchange more specific and formatted information between the server and the client.
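As a rough illustration of what "encoded in XML" means in practice, the sketch below uses Python's standard `xmlrpc.client` module to build an XML-RPC request and response and then decode them again, without touching the network. The method name `getnewworkunit` is borrowed from later in this article; the client ID argument and the shape of the returned work unit are assumptions.

```python
import xmlrpc.client

# Encode a call to getnewworkunit(42) as an XML-RPC request document.
# The argument 42 is a hypothetical client ID.
request_xml = xmlrpc.client.dumps((42,), methodname="getnewworkunit")
print(request_xml)

# The receiving side decodes the same document back into native values.
params, method = xmlrpc.client.loads(request_xml)

# A response is encoded the same way; it carries exactly one return value.
work_unit = {"unit_id": 7, "payload": [1, 2, 3]}  # hypothetical structure
response_xml = xmlrpc.client.dumps((work_unit,), methodresponse=True)
(decoded_unit,), _ = xmlrpc.client.loads(response_xml)
print(decoded_unit)
```

The request and the response are both plain text, so any platform with an XML parser can take part in the exchange.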
Web services can be hosted by any Web server and can be written with pretty much any language or supporting platform, including Perl, Python, C/C++, Java language, and Visual Basic. The core of a Web service is basically a dynamic component on the Web server that is capable of processing the Web service request and responding appropriately.
This means that in many cases you can build a Web service interface to your existing system very easily. It's just a case of writing the wrapper that goes around the outside of the normal call to the system that you would normally make.
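A minimal sketch of such a wrapper, using the XML-RPC server in Python's standard library, might look like the following. The function `count_words` stands in for a call into your existing system and is purely hypothetical, as are the wrapper name, the exposed method name, and the host and port.

```python
from xmlrpc.server import SimpleXMLRPCServer


def count_words(text):
    """Stand-in for a call into an existing system (hypothetical)."""
    return len(text.split())


def count_words_service(text):
    """Wrapper: check the input, make the normal call, return a simple value."""
    if not isinstance(text, str):
        raise ValueError("text must be a string")
    return {"words": count_words(text)}


def build_server(host="127.0.0.1", port=8000):
    """Create a server that exposes the wrapper under the name 'count_words'."""
    server = SimpleXMLRPCServer((host, port), logRequests=False)
    server.register_function(count_words_service, "count_words")
    return server


# To serve requests you would call: build_server().serve_forever()
# Here the wrapper is called directly, so the example runs without a network.
print(count_words_service("grid computing meets web services"))
```

The existing function is left untouched; only the thin layer around it knows anything about the Web service.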
Blurring the lines between grid and Web services
So far we have examined a grid technology that works by exchanging information, either between servers and clients or directly between clients, to process and distribute information. But that exchange system is going to need some way of actually communicating the information. Over the years a number of systems have been used, including the FTP protocol and custom built protocol systems.
Meanwhile, in the Web services camp, we have a generic tool that can be used to exchange information between two machines, be that a request to execute a particular function (such as getnewworkunit()) or simply to exchange information between the two.
Because Web services are based on other standards like XML, they are very easy to develop and extend into a wide variety of environments, and they are also easy to deploy. We get rid of all of the problems of exchanging data between differing systems, and we don't need to worry about the endianness (the order of bytes within a multi-byte value) of the processor, or how to convert the information we are sending into a neutral format, because the Web service standards take care of that.
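The byte-order problem is easy to demonstrate. The sketch below packs the same integer in big-endian and little-endian form with Python's `struct` module, shows what happens when the receiver assumes the wrong order, and then shows the same value encoded as XML-RPC text, where the question never arises. The value 1025 is an arbitrary choice for illustration.

```python
import struct
import xmlrpc.client

value = 1025  # 0x00000401, an arbitrary example value

# The same 32-bit integer laid out in the two common byte orders.
big = struct.pack(">i", value)
little = struct.pack("<i", value)
print(big.hex(), little.hex())

# A receiver that assumes the wrong byte order decodes a different number.
misread = struct.unpack("<i", big)[0]
print(misread)

# Encoded as XML-RPC, the value travels as text, independent of the processor.
xml_text = xmlrpc.client.dumps((value,), methodresponse=True)
decoded = xmlrpc.client.loads(xml_text)[0][0]
print(decoded)
```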
Because you're going to need some kind of listener/dispatch service to process requests, distribute work, and collect it back, a Web service is ideal to use.
The major benefit of the Web services system is that, because it relies on the HTTP protocol, it is very easy to integrate Web services into your existing HTTP platforms, routing, firewalls, and other systems. Most organizations are already running an HTTP service, so you can support your grid system using existing technology and security systems without compromising your network or limiting the facilities of your grid system.
Developing a grid system that uses Web services therefore has a number of distinct benefits, including:
- Increased compatibility
- Increased flexibility
- Cross-platform-enabled development by eliminating the complexities of exchanging data
- Easy deployment using an existing Web server
- Easily secured by using existing HTTP security and firewall support
- Easy communication and accessibility by making it simple to contact the grid components over an Intranet or Internet
For all these reasons, and many more, Web services have become part of the new grid services standard, the Open Grid Services Architecture (OGSA), and its companion, the Open Grid Services Infrastructure (OGSI).
The Globus Toolkit 3.0, the first grid platform to fully support the OGSA/OGSI standard, supports Web services as a data exchange platform. IBM, a key proponent of the OGSA standard and the Globus system, is a strong supporter of Web services and is now recommending them for use across the business development platform. Globus supports the SOAP Web services protocol.
There are additional benefits to the Web services approach. Web services can describe and publish themselves through standard mechanisms: Web Services Description Language (WSDL) describes the interface a service offers, and directories such as Universal Description, Discovery and Integration (UDDI) let clients find it. For Grid computing to be easy to deploy, we need publication through such directories and systems.
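WSDL and UDDI are too large to show in a few lines, but the underlying idea of a service that describes itself can be sketched with the introspection support in Python's standard XML-RPC library. This is only a loose analogue and not WSDL or UDDI itself; the `getnewworkunit` function body and its return value are assumptions.

```python
from xmlrpc.server import SimpleXMLRPCDispatcher


def getnewworkunit(client_id):
    """Return the next work unit for the given client."""
    return {"unit_id": 1, "payload": [1, 2, 3]}  # hypothetical work unit


dispatcher = SimpleXMLRPCDispatcher()
dispatcher.register_function(getnewworkunit)
# Adds system.listMethods, system.methodHelp, and system.methodSignature.
dispatcher.register_introspection_functions()

# A client could call these remotely to discover what the service offers.
methods = dispatcher.system_listMethods()
help_text = dispatcher.system_methodHelp("getnewworkunit")
print(methods)
print(help_text)
```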
Whether you choose to use the Globus Toolkit or not, you will need to think about how you are going to use Web services within your grid environment. Two Web services architectures can be used, which in turn fit into typical grid services structures: the request architecture, where clients contact one or more central servers, and the dispatch architecture, where servers contact clients directly. Each has a few issues that you need to consider before using Web services in your grid application, as discussed in the next sections.
Supporting a request architecture
The main location for using Web services is at the point of dispatch and brokering -- that is, the point at which work units are distributed to clients (providers) within your grid. This is an example of a request architecture, where clients request work from the grid broker.
The request architecture is actually the easiest system to support through Web services because the system continues to work pretty much as it does already: the client sends completed work units and requests new ones from one of the available servers. All you need to do is install the Web service on a Web server -- either separately or directly on your broker server -- and then add the code to interface your Web service to your broker. The whole system will work pretty much like the diagram in Figure 1:
Figure 1. The request architecture in action
Globus is an ideal solution for this architecture as the Web services component provides a convenient way of supporting both the client and the server parts of the system.
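Outside of Globus, the request architecture can be sketched in a few lines with Python's standard XML-RPC library. Everything here is an assumption made for illustration: the `Broker` class, the method names `get_new_work_unit` and `submit_result`, the empty dictionary used to signal "no work left", and the summing computation. The client loop is written against any object with those two methods, so it runs here against the broker directly and would run unchanged against an `xmlrpc.client.ServerProxy` pointing at a served broker.

```python
from collections import deque
from xmlrpc.server import SimpleXMLRPCServer


class Broker:
    """Holds pending work units and collects completed results."""

    def __init__(self, units):
        self._pending = deque(units)
        self.results = {}

    def get_new_work_unit(self):
        # An empty dict means "no work left" (XML-RPC cannot send None by default).
        if not self._pending:
            return {}
        unit_id, payload = self._pending.popleft()
        return {"unit_id": unit_id, "payload": payload}

    def submit_result(self, unit_id, result):
        self.results[unit_id] = result
        return True


def run_client(broker):
    """Request units until none remain; return how many were processed."""
    done = 0
    while True:
        unit = broker.get_new_work_unit()
        if not unit:
            return done
        broker.submit_result(unit["unit_id"], sum(unit["payload"]))
        done += 1


def serve(broker, host="127.0.0.1", port=8000):
    """Expose the broker's public methods as a Web service (blocks forever)."""
    server = SimpleXMLRPCServer((host, port), logRequests=False)
    server.register_instance(broker)
    server.serve_forever()


broker = Broker([(1, [1, 2]), (2, [3, 4, 5])])
completed = run_client(broker)
print(completed, broker.results)
```

The client drives the exchange: it asks for work, processes it, and hands back the result, while the broker only answers requests.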
Supporting a dispatch architecture
The dispatch architecture reverses the traditional grid services model and distributes work directly from the server to the client. Although less common, it can be a practical way of distributing work in an environment where the work is controlled and carefully allotted and monitored into specific execution units. The server is then responsible for managing and distributing units on an individual basis.
The dispatch model is a good way of distributing work that is time-critical because units are dispatched to individual machines according to their load and the server queue on the broker.
It is particularly useful in Intranets and closed networks, where easy access and communications make the system relatively efficient. It also operates well in situations where the work providers (clients) are dedicated to processing work for the grid.
The only problem with the dispatch system is that it is slightly more complicated to implement. Instead of the server hosting the Web services system and the clients acting as Web services clients, the model is turned on its head. The grid providers (traditionally the "clients") instead need to be able to support a Web services server interface. Meanwhile, the broker (traditionally the "server") becomes a Web services client of the grid providers.
You can see the model in action in Figure 2:
Figure 2. Web services in dispatch mode
A number of issues exist beyond the basics of the Web services mechanics here:
- The broker needs to know exactly which machines are part of the grid because it needs to be able to contact them individually.
- Each client must support the Web services model, which in turn relies on it supporting an HTTP service.
- Each client must be able to determine its own load and performance so that it can provide the information to the broker when it asks.
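The three points above can be sketched as follows. The `Provider` class, its `get_load` and `process_unit` methods, the node names, and the use of a simple queued-unit count as the load figure are all assumptions for illustration. In a real deployment each entry in `providers` would be a Web services client object (for example, an `xmlrpc.client.ServerProxy`) pointing at the HTTP service running on that machine, rather than a local object.

```python
class Provider:
    """A grid provider that reports its load and accepts work units."""

    def __init__(self, load):
        self._load = load  # hypothetical load figure: number of queued units

    def get_load(self):
        return self._load

    def process_unit(self, unit):
        self._load += 1  # simulate the extra work now queued on this machine
        return {"unit_id": unit["unit_id"], "result": sum(unit["payload"])}


def dispatch(unit, providers):
    """Ask every known provider for its load and send the unit to the least busy."""
    name = min(providers, key=lambda n: providers[n].get_load())
    return name, providers[name].process_unit(unit)


# The broker must know every machine in the grid in advance.
providers = {"node-a": Provider(3), "node-b": Provider(1)}
chosen, reply = dispatch({"unit_id": 1, "payload": [1, 2, 3]}, providers)
print(chosen, reply)
```

Here the broker drives the exchange, which is why it needs the full list of providers and why each provider must be able to answer a load query.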
Even with each of these issues to contend with, the dispatch architecture is still relatively simple to deal with. However, the Globus Toolkit doesn't currently support this model. This doesn't mean that we can't use the Globus Toolkit for other areas, but it does mean that you will need to reconsider how work is exchanged between the clients and the broker.
Summary
Integrating grid applications and Web services is not actually as complicated as it first seems. The basics of most grid applications lend themselves quite easily to the Web services architecture, but you need to consider the impacts on your grid application design to ensure that your backend systems (the broker, work unit manager, and other components) are compatible with the way you expect your clients to work.
About the author
Martin C. Brown is a former IT Director with experience in cross-platform integration. A keen developer, he has produced dynamic sites for blue-chip customers, and is the Technical Director of Foodware.net. Now a freelance writer and consultant, MC, as he is better known, works closely with Microsoft as an SME, is the LAMP Technologies Editor for LinuxWorld magazine, a core member of the AnswerSquad.com team, and has written books such as XML Processing with Perl, Python and PHP and the Microsoft IIS 6 Delta Guide. MC can be contacted at questions@mcslp.com.