Solving the problems associated with service discovery and service description retrieval is key to the success of Web services. While not absolutely necessary for small-scale or tightly integrated environments, solutions to these problems become critically important if more complex integration models, such as the Service-Oriented Architecture [1], are to be successfully deployed.
The concept of discovery can be interpreted in many ways. Patterns of discovery fall into two general categories: focused and unfocused. "Focused" discovery occurs as a result of an active search either on a specific target, for a specific piece of information, or as a combination of both. This may mean, for example, that a known party is to be queried as to what data it has to offer or that a search is to be made for a provider which is able to produce a set of data about which characteristics are known. "Unfocused" discovery occurs either as the result of receiving unsolicited information, or through traversing a potentially open-ended space in order to retrieve any available data which may be present.
In addition to the patterns which they support, discovery mechanisms may be characterized according to two other attributes: the choice of the point of dissemination of information, and the costs associated with the discovery process. During discovery, information may be extracted either directly from the source/originator of the information or from a third-party. Retrieving information directly from the source increases the likelihood of the information being accurate, while fetching it indirectly gives the opportunity to utilize additional services provided by the third-parties and does not require that the original source always be available or easily locatable. The costs associated with the discovery techniques also vary. Certain environments have a much higher cost associated with storing and presenting the information than do others.
From the attributes listed above, a taxonomy can be constructed through which discovery techniques and their supporting mechanisms can be compared and contrasted. To summarize, discovery mechanisms can be categorized according to the following criteria:
- Which discovery patterns are supported? Does the mechanism support only focused discovery patterns, only unfocused discovery patterns, or a combination of both? If it is a combination, to what extent is each category of patterns supported?
- Where is the point of dissemination? Is the information generally discovered directly from the source/originator or through a third-party?
- What is the overhead associated with the discovery mechanism? Is there a basic cost associated with storing and presenting the information for discovery, and if so, how significant is it? Note: This is the cost to the information/infrastructure provider for supporting the discovery mechanism; the differences in the cost of consuming the information can be assumed to be negligible.
In Section 2 of this paper we describe several mechanisms used for personal information discovery, and classify them according to the above taxonomy. Section 3 applies the same taxonomy to the Web services space, and describes where the UDDI and WS-Inspection mechanisms fit.
The location of personal information, in the form of phone numbers and addresses, provides an interesting example of a set of related discovery processes. The very simplest system is verbal communication; someone whom we meet may simply tell us their phone number and address directly. The discovered data generally comes directly from the owner/originator of the information, and there is almost no overhead associated with this mechanism. This type of discovery enables portions of both the focused and unfocused usage patterns. In the focused case, for example, we may ask someone for their telephone number; however, we much less often ask someone to identify the person to whom a phone number belongs. Similarly, in the unfocused case, someone may offer their telephone number to us without having been asked to provide the information, but we would not generally walk around asking a random set of people for their telephone numbers.
The next higher level of personal information discovery involves the exchange of a structured piece of information, like a business card or information pamphlet. As is the case with the verbal exchange, portions of both the focused and unfocused usage patterns are facilitated, and information generally passes directly from the owner/originator to the requester. The card or pamphlet provides an aggregation of related information, but does not specify how the information itself should be formatted; the formatting is dependent upon the particular piece of information which is to be conveyed. The main difference between this form of discovery and the previous one lies with this aggregation; it provides increased functionality by allowing more information to be passed within a single exchange than would otherwise be possible. The grouping may also convey additional semantic information that is not captured within any individual piece of data. The benefits provided by the aggregation do come at a price; there is additional overhead associated with creating and maintaining the aggregating documents.
At the highest level of personal information discovery sits the directory assistance system or searchable on-line "Yellow Pages." Directory assistance provides high-level query and information retrieval functions, and supports a much broader set of focused discovery patterns than the other systems do. In order to provide the increased focused discovery support, these systems aggregate information across multiple sources instead of just for a single source. This makes operator and directory assistance systems more costly than the previous two mechanisms, as does the fact that the directory services are generally provided by a third-party to the information. Despite the increased overhead, directory assistance systems provide a set of high-value functionality which is not easily accomplished within the confines of verbal or simple aggregate environments.
To summarize, the personal information discovery mechanisms that have been described possess the following characteristics:
- Direct communication (voice)
  - Supports some focused and unfocused discovery patterns.
  - Dissemination is direct from the source/originator.
  - No overhead.
- Simple aggregate token (business card)
  - Supports some focused and unfocused discovery patterns.
  - Dissemination is direct from the source/originator.
  - Moderate overhead.
- Directory assistance (operator)
  - Supports a significant number of focused discovery patterns and some unfocused ones.
  - Dissemination is via a third-party.
  - High overhead.
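The summary above can be read as a small table keyed on the three criteria of the taxonomy. The following is a minimal sketch of that reading, not part of any specification: the type names (`Support`, `Source`, `Overhead`, `Mechanism`) and the helper `cheapest_with` are hypothetical names introduced here for illustration, and the coarse three-level scales are our own simplification of the qualitative terms used in the text.

```python
from dataclasses import dataclass
from enum import Enum


class Support(Enum):
    """How broadly a family of discovery patterns is supported (assumed scale)."""
    NONE = 0
    SOME = 1
    SIGNIFICANT = 2


class Source(Enum):
    """Point of dissemination of the information."""
    ORIGINATOR = "originator"
    THIRD_PARTY = "third_party"


class Overhead(Enum):
    """Cost to the provider of supporting the mechanism (assumed scale)."""
    NONE = 0
    MODERATE = 1
    HIGH = 2


@dataclass(frozen=True)
class Mechanism:
    name: str
    focused: Support
    unfocused: Support
    source: Source
    overhead: Overhead


# The three personal information mechanisms, transcribed from the summary.
PERSONAL = [
    Mechanism("voice", Support.SOME, Support.SOME,
              Source.ORIGINATOR, Overhead.NONE),
    Mechanism("business card", Support.SOME, Support.SOME,
              Source.ORIGINATOR, Overhead.MODERATE),
    Mechanism("directory assistance", Support.SIGNIFICANT, Support.SOME,
              Source.THIRD_PARTY, Overhead.HIGH),
]


def cheapest_with(mechanisms, min_focused):
    """Return the lowest-overhead mechanism whose focused-pattern support
    is at least min_focused, or None if no mechanism qualifies."""
    candidates = [m for m in mechanisms
                  if m.focused.value >= min_focused.value]
    return min(candidates, key=lambda m: m.overhead.value, default=None)
```

Asking for significant focused support selects directory assistance despite its cost, while a requirement of only some focused support is met most cheaply by verbal exchange. This is the trade-off that the remainder of the paper carries over to Web services.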
The classification presented above is by no means exhaustive, but it does serve to provide a useful example against which to relate the mechanisms which are involved with Web service discovery.
Web service information discovery is very similar to personal information discovery, with one or more WSDL or other description documents being used to provide "contact" (that is, endpoint) information in lieu of telephone numbers or physical addresses. The following two sections provide a brief description of how the UDDI and WS-Inspection mechanisms fit into the taxonomy described in the introduction in terms of the characteristics which they exhibit. The final section describes how the two may be applied, depending upon the desired functionality and operating limitations.
The Universal Description Discovery and Integration (UDDI) specification [2] addresses the problems associated with Web service discovery through the use of a centralized model; one or more repositories are created to house information about businesses and the services which they offer. Requests and updates pertaining to the service and business related information are issued directly against the repositories. In addition, UDDI prescribes a specific format for a portion of the stored description information and, to facilitate advanced searching, assumes that other description information will be stored/registered within the system as well.
In terms of the personal information discovery context which was described above, the UDDI environment most closely resembles that of a directory assistance provider or searchable on-line "Yellow Pages" system. Like directory assistance and other third-party providers, UDDI systems are based upon organized repositories which provide many high-level functions, including advanced searching capabilities. This makes them very adept at facilitating focused discovery patterns, including helping requesters quickly locate potential communication partners. To a lesser extent, UDDI is also able to facilitate some patterns of unfocused discovery through browsing of the repository. In order to provide advanced functionality, however, UDDI requires that a certain amount of infrastructure be deployed and maintained, thus increasing the cost of operation. In addition, unless the service descriptions are stored only within UDDI, there is a cost associated with keeping the different versions synchronized. Depending upon the business model adopted by the directory provider, these additional costs may or may not directly impact the owner of the information. For instance, the model adopted by the Universal Business Registry shields the information owner from these costs.
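To make the centralized model concrete, the following is a minimal in-memory sketch of a third-party repository that supports publishing, focused search, and browsing. It is deliberately not the UDDI API or data model: the class `ToyRegistry`, its method names, and the flat category tags are all hypothetical, chosen only to show why a central index makes focused discovery cheap for the requester while placing the maintenance burden on the repository.

```python
from collections import defaultdict


class ToyRegistry:
    """Hypothetical stand-in for a centralized repository (not UDDI itself)."""

    def __init__(self):
        self._entries = {}                    # service name -> record
        self._by_category = defaultdict(set)  # category -> service names

    def publish(self, name, provider, description_url, categories):
        """Register or update a service. Re-publishing replaces the old
        record, which is the synchronization cost mentioned in the text."""
        old = self._entries.get(name)
        if old is not None:
            for category in old["categories"]:
                self._by_category[category].discard(name)
        record = {
            "provider": provider,
            "description_url": description_url,
            "categories": frozenset(categories),
        }
        self._entries[name] = record
        for category in record["categories"]:
            self._by_category[category].add(name)

    def find(self, *categories):
        """Focused discovery: names of services tagged with every category."""
        if not categories:
            return []
        matches = set.intersection(
            *(self._by_category.get(c, set()) for c in categories))
        return sorted(matches)

    def browse(self):
        """Unfocused discovery: enumerate whatever the repository holds."""
        return sorted(self._entries)

    def description_url(self, name):
        """Where the service description for a known entry can be fetched."""
        return self._entries[name]["description_url"]
```

A requester who knows only the characteristics of the service it wants calls `find`; a requester with no particular target calls `browse`. In both cases the answer comes from the repository rather than from the service provider, so its accuracy depends on how recently the provider re-published.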
3.2 WS-Inspection characteristics
The Web Services Inspection Language (WS-Inspection) [3] relies upon a completely distributed model for providing service-related information; the service descriptions may be stored at any location, and requests to retrieve the information are generally made directly to the entities which are offering the services. The WS-Inspection specification does not stipulate any particular format for the service information; it relies upon other standards, including UDDI, to define the description formats. The WS-Inspection specification also relies upon existing Web technologies and infrastructure to provide mechanisms for publishing and retrieving its documents.
In terms of the personal information discovery context described in Section 2, the WS-Inspection mechanism most closely resembles business cards and other simple information aggregation documents. As is the case with those other mechanisms, WS-Inspection documents are very light-weight, easy to construct, and easy to maintain. By providing the ability to disseminate service related information through existing protocols directly from the point at which the service is being offered, the WS-Inspection mechanism enables focused discovery to be performed on a single target. Due to its decentralized nature, however, the WS-Inspection specification does not provide a good mechanism for executing focused discovery if the communication partner is unknown. Unlike business cards and pamphlets, the WS-Inspection specification supports a significant number of unfocused discovery patterns by providing a set of conventions to make its documents easily locatable and to allow them to be presented in a proactive fashion by the service provider. As is the case with other simple aggregation supporting discovery mechanisms, there is a small cost associated with creating and maintaining the aggregations.
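To illustrate how light-weight such an aggregation document is, the sketch below parses a small inspection document and extracts the service descriptions and links that it references. The sample document, the `example.com` URLs, and the function name `parse_inspection` are illustrative assumptions. The namespace URI and the `inspection`, `service`, `description`, and `link` element names reflect our reading of the WS-Inspection 1.0 specification [3] and should be verified against it before being relied upon.

```python
import xml.etree.ElementTree as ET

# Namespace URIs as we understand them from the specifications (assumption).
WSIL_NS = "http://schemas.xmlsoap.org/ws/2001/10/inspection/"
WSDL_NS = "http://schemas.xmlsoap.org/wsdl/"

# Illustrative document: one service described by a WSDL file, plus a
# link to a further inspection document hosted elsewhere.
SAMPLE = f"""
<inspection xmlns="{WSIL_NS}">
  <service>
    <description referencedNamespace="{WSDL_NS}"
                 location="http://example.com/stockquote.wsdl"/>
  </service>
  <link referencedNamespace="{WSIL_NS}"
        location="http://example.com/partners/inspection.wsil"/>
</inspection>
"""


def parse_inspection(xml_text):
    """Return (descriptions, links) found in an inspection document.

    descriptions is a list of (referencedNamespace, location) pairs, one
    per service description; links is a list of link locations.
    """
    ns = {"wsil": WSIL_NS}
    root = ET.fromstring(xml_text)
    descriptions = [
        (d.get("referencedNamespace"), d.get("location"))
        for d in root.findall("wsil:service/wsil:description", ns)
    ]
    links = [link.get("location") for link in root.findall("wsil:link", ns)]
    return descriptions, links
```

The document carries only pointers: the description format is identified by `referencedNamespace` and the content itself lives at `location`. This is the business card analogy in practice, since the aggregation says where to look without dictating how the referenced information is formatted.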
Before the picture can be completed, a third mechanism which is analogous to verbal communication from the personal information discovery space must be mentioned. This mechanism involves the direct retrieval of description documents (WSDL and other related files) from their source. As is the case with the verbal scenario, this mechanism does not really have any overhead associated with it, but it supports only portions of the focused and unfocused discovery patterns.
In summary, the Web service discovery mechanisms that have been described possess the following characteristics:
- Direct description retrieval (voice, FTP, HTTP GET)
  - Supports some focused and unfocused discovery patterns.
  - Dissemination is direct from the source/originator.
  - No overhead.
- Simple aggregate publishing (WS-Inspection)
  - Supports some focused discovery patterns and a significant number of unfocused ones.
  - Dissemination is direct from the source/originator.
  - Moderate overhead.
- Advanced directory (UDDI)
  - Supports a significant number of focused discovery patterns and some unfocused ones.
  - Dissemination is via a third-party.
  - Moderate overhead for the information owner; high overhead for the directory provider.
Like the business cards and the directory assistance systems, the UDDI and WS-Inspection specifications address different sets of issues in the discovery problem space, and are characterized by different sets of trade-offs. UDDI provides a high degree of functionality, but it comes at a cost of increased complexity. The WS-Inspection specification adopts a lower level of functionality in order to maintain a low overhead. In this light, the two specifications should be viewed as complementary technologies, to be used either together or separately depending upon the situation. For example, a UDDI repository could be populated based upon the results found when performing a "Web crawl" for WS-Inspection documents. Likewise, a UDDI repository may itself be discovered when a requester retrieves a WS-Inspection document which references an entry in the repository. In environments where the advanced functionality afforded by UDDI is not required and where constraints do not allow for its deployment, the WS-Inspection mechanism may provide all of the capabilities which are needed. In situations where data needs to be centrally managed, a UDDI solution alone may provide the best fit. The UDDI and WS-Inspection specifications should not be viewed as providing competing mechanisms, any more than business cards and directory assistance services are viewed as competing to disseminate personal information.
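The "Web crawl" scenario mentioned above can be sketched as a breadth-first traversal of linked inspection documents that collects every referenced service description, producing the raw material from which a repository could be populated. This is a sketch under several assumptions: the function name `crawl`, the injected `fetch` callable (which stands in for an HTTP GET and returns the document text or `None`), the document limit, and the `example` URLs are all hypothetical; link locations are assumed to be absolute URLs; and the element names and namespace follow our reading of WS-Inspection 1.0 [3].

```python
import xml.etree.ElementTree as ET
from collections import deque

WSIL_NS = "http://schemas.xmlsoap.org/ws/2001/10/inspection/"  # assumption
NS = {"wsil": WSIL_NS}


def crawl(start_url, fetch, max_documents=50):
    """Walk linked inspection documents breadth-first.

    Returns a dict mapping each description location to the URL of the
    inspection document that first listed it.
    """
    found = {}
    seen = {start_url}
    queue = deque([start_url])
    visited = 0
    while queue and visited < max_documents:
        url = queue.popleft()
        visited += 1
        text = fetch(url)
        if text is None:           # unreachable document: skip it
            continue
        root = ET.fromstring(text)
        for desc in root.findall("wsil:service/wsil:description", NS):
            location = desc.get("location")
            if location:
                found.setdefault(location, url)
        for link in root.findall("wsil:link", NS):
            location = link.get("location")
            # Follow only links that point at other inspection documents.
            if (link.get("referencedNamespace") == WSIL_NS
                    and location and location not in seen):
                seen.add(location)
                queue.append(location)
    return found


def _doc(description, links):
    """Build a tiny inspection document for the example below."""
    link_xml = "".join(
        f'<link referencedNamespace="{ns}" location="{loc}"/>'
        for ns, loc in links)
    return (f'<inspection xmlns="{WSIL_NS}"><service>'
            f'<description location="{description}"/></service>'
            f'{link_xml}</inspection>')


# Two sites that reference each other, plus one link of another kind.
PAGES = {
    "http://a.example/inspection.wsil": _doc(
        "http://a.example/a.wsdl",
        [(WSIL_NS, "http://b.example/inspection.wsil")]),
    "http://b.example/inspection.wsil": _doc(
        "http://b.example/b.wsdl",
        [(WSIL_NS, "http://a.example/inspection.wsil"),
         ("urn:example:other-registry", "http://c.example/registry")]),
}

services = crawl("http://a.example/inspection.wsil", PAGES.get)
```

Here `services` holds one entry per discovered description, each recording which provider's document listed it. Feeding those entries into a central repository is what would turn the unfocused traversal into a basis for later focused searches.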
- [1] IBM Web Services Architecture Team, "The next stage of evolution for e-business; Web Services architecture overview," September 2000.
- [2] UDDI Project, "UDDI Technical White Paper," September 2000. Available at http://www.uddi.org
- [3] K. Ballinger, P. Brittenham, A. Malhotra, W. Nagy, S. Pharies, "Web Services Inspection Language (WS-Inspection) 1.0," October 2001.
- The Web Services ToolKit implements WS-Inspection from version 2.4.1.