Introduction to categorization
Businesses registered into the IBM version of the Universal Description, Discovery and Integration (UDDI) Business Registry often contain massive amounts of readable information: business names (in every language imaginable), addresses, contacts, including all their e-mail addresses, phone, cell, and fax numbers. Having described their businesses, these service providers then add every service they offer. With all that information, it should be possible for companies to just sit back and watch the customers knock down their doors, flood their e-mail inboxes and watch their online orders just roll right in, right?
Not quite. These companies still need to take advantage of one of the most important features UDDI provides: categorization.
With all of the information companies typically enter, customers can still find them only through a "Find by name" search, which requires already knowing the name of the business or service. So the only people who are likely to find these businesses are those who still use the paper White and Yellow Pages in the phone book. It should be clear that in a global business directory, having a business name starting with "AAA" won't be sufficient for a business to be listed first or found most often.
UDDI provides a mechanism to include standard taxonomies that can be used to describe each entry using as many industry standard search terms as needed. Each business, service, or technical model can contain a "Category Bag" which holds keyed references (that is, categorization codes, locators, or keywords) that can specifically describe its type of business, physical location, and even the exact products and services it offers. These keyed references contain a reference to the classification system or taxonomy, a text field containing the value within that taxonomy and a text field for a human readable description. Using this method of categorization, the UDDI Inquiry API can quickly and efficiently connect businesses and services to exactly the customers that need them.
Using categorization effectively
UDDI currently supports several "built-in" categorization systems (see Resources for a link) that can be used in publication or inquiry:
- North American Industry Classification System (NAICS)
  - Business classification codes, for example, Pharmaceutical Manufacturing (3254), Optical Goods Stores (44613)
- Universal Standard Products and Services Classification (UNSPSC)
  - Product and Service codes, for example, Ultra Light Aircraft (25.13.20.05.00), Soil Pollution Advisory Services (77.12.16.04.00)
- Geographic Classification System (GCS) (based upon ISO 3166-1999)
  - Country, State, Province, and Region codes, for example, United States-Texas (US-TX), Denmark (DK)
- UDDI classifications
  - A classification of UDDI standards and standards UDDI uses/recognizes, for example, Wire/Transport Protocol (transport), Web service described in WSDL (wsdlSpec)
- General keywords
  - Free-form association of any value to a keyword, for example, "Store Location" (#102), "Automobile" (Toyota)
The first three of these taxonomies are standards for categorizing entries. The fourth taxonomy, UDDI classifications, is a taxonomy that was developed as part of the UDDI specification to provide useful values for categorizing the technical information of Web services. The last taxonomy is useful for associating keywords with an entry, especially those that are not part of the name of the entry. Each of these category systems is uniquely identified by a UDDI entry called a tModel (Technical Model) and can be referenced using its tModelKey. References to the above category information in the UDDI data structures would appear in XML as shown in Listing 1.
Listing 1. UDDI data structures in XML
<categoryBag>
  <keyedReference tModelKey="UUID:C0B9FE13-179F-413D-8A5B-5004DB8E5BB2" keyValue="3254" keyName="Pharmaceutical Manufacturing" />
  <keyedReference tModelKey="UUID:CD153257-086A-4237-B336-6BDCBDCC6634" keyValue="25.13.20.05.00" keyName="Ultra Light Aircraft" />
  <keyedReference tModelKey="UUID:4E49A8D6-D5A2-4FC2-93A0-0411D8D19E88" keyValue="US-TX" keyName="Texas" />
  <keyedReference tModelKey="UUID:C1ACF26D-9672-4404-9D70-39B756E62AB4" keyValue="wsdlSpec" keyName="Specification for a Web Service described in WSDL" />
  <keyedReference tModelKey="aaa" keyValue="102" keyName="Store Location" />
</categoryBag>
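The three parts of a keyed reference (taxonomy key, value, and readable name) map naturally onto a small data structure. The following is a minimal sketch, not part of any UDDI toolkit: the class name CategoryBagSketch and its methods are hypothetical and exist only to show how such references could be assembled and serialized into the XML form used above.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative model of a UDDI categoryBag; all names here are hypothetical. */
final class CategoryBagSketch {

    /** One keyedReference: taxonomy key, value within the taxonomy, readable name. */
    static final class KeyedReference {
        final String tModelKey;
        final String keyValue;
        final String keyName;

        KeyedReference(String tModelKey, String keyValue, String keyName) {
            this.tModelKey = tModelKey;
            this.keyValue = keyValue;
            this.keyName = keyName;
        }

        String toXml() {
            return "<keyedReference tModelKey=\"" + escape(tModelKey)
                    + "\" keyValue=\"" + escape(keyValue)
                    + "\" keyName=\"" + escape(keyName) + "\" />";
        }
    }

    private final List<KeyedReference> references = new ArrayList<>();

    /** Adds one categorization and returns this bag so calls can be chained. */
    CategoryBagSketch add(String tModelKey, String keyValue, String keyName) {
        references.add(new KeyedReference(tModelKey, keyValue, keyName));
        return this;
    }

    /** Serializes the bag, one keyedReference per line. */
    String toXml() {
        StringBuilder xml = new StringBuilder("<categoryBag>\n");
        for (KeyedReference reference : references) {
            xml.append("  ").append(reference.toXml()).append('\n');
        }
        return xml.append("</categoryBag>").toString();
    }

    /** Escapes the characters that are unsafe inside a double-quoted XML attribute. */
    static String escape(String text) {
        return text.replace("&", "&amp;").replace("<", "&lt;")
                   .replace(">", "&gt;").replace("\"", "&quot;");
    }

    public static void main(String[] args) {
        String xml = new CategoryBagSketch()
                .add("UUID:C0B9FE13-179F-413D-8A5B-5004DB8E5BB2", "3254", "Pharmaceutical Manufacturing")
                .add("UUID:4E49A8D6-D5A2-4FC2-93A0-0411D8D19E88", "US-TX", "Texas")
                .toXml();
        System.out.println(xml);
    }
}
```

A real publisher would more likely use a UDDI client library for this; the sketch only makes the shape of the data explicit.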
Besides these "built-in" value sets, most areas of government and business (for example, agriculture, automotive, computing) develop their own standards and classification systems, which could easily be published, described, and referenced within UDDI. For more information and code samples regarding publishing a new value set, see the UDDI4J section of the Web Services Publication and Discovery (see Resources for a link).
To accommodate other value sets, UDDI also provides the means for external providers of categorization, classification, and taxonomies to register their value sets to UDDI itself thereby enabling any business, service, or technical model to begin associating their values into their Category Bags.
UDDI further provides a mechanism for a value set provider to integrate with a UDDI registry to verify that businesses or services are associating and registering values accurately. The provider can deploy a validation service that UDDI calls before codes from the value set are published.
Specification for an externally checked category system
When a reference to a checked taxonomy with an external validation service is received by a UDDI registry, it will issue a validate_values message to the appropriate external service. This allows a given category system's use to be regulated by an external party. The normal use is to verify that specific category values (checking the keyValue attribute values supplied) exist within the given taxonomy. For certain categorizations and identifiers, the party providing the validation service may further restrict the use of a value to certain parties based on the identifiers passed in the message or any other type of contextual check that is possible using the passed data (see Resources for a link to the Programmer's API).
The UDDI registry implementation calling validate_values will pass a businessEntity, a businessService, or a tModel element as the sole argument to this call-out. This is the same data that is being passed within a save_business, save_service, or save_tModel API call. Multiple elements of the same type may be passed together.
From the UDDI V2 schema (see Resources for a link), the validate_values API syntax inside the SOAP:Body is as shown in Listing 2.
Listing 2. validate_values API syntax inside the SOAP:Body
<complexType name="validate_values">
  <choice>
    <element ref="uddi:businessEntity" minOccurs="0" maxOccurs="unbounded" />
    <element ref="uddi:businessService" minOccurs="0" maxOccurs="unbounded" />
    <element ref="uddi:tModel" minOccurs="0" maxOccurs="unbounded" />
  </choice>
  <attribute name="generic" type="string" use="required" />
</complexType>
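To make the call-out concrete, the following sketch builds the kind of SOAP message a registry might post to a validation service. It is an illustration under stated assumptions: the class name ValidateValuesMessageSketch is hypothetical, the namespace urn:uddi-org:api_v2 and the generic value 2.0 are taken from the UDDI V2 conventions, and the sample businessService (its name and empty serviceKey) is invented for the example.

```java
/** Illustrative builder for a validate_values SOAP message; names are hypothetical. */
final class ValidateValuesMessageSketch {

    static final String SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/";
    static final String UDDI_V2_NS = "urn:uddi-org:api_v2";

    /** Wraps already-serialized entity XML in a validate_values call inside a SOAP Body. */
    static String wrap(String entitiesXml) {
        return "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n"
             + "<soap:Envelope xmlns:soap=\"" + SOAP_NS + "\">\n"
             + "  <soap:Body>\n"
             + "    <validate_values generic=\"2.0\" xmlns=\"" + UDDI_V2_NS + "\">\n"
             + "      " + entitiesXml + "\n"
             + "    </validate_values>\n"
             + "  </soap:Body>\n"
             + "</soap:Envelope>";
    }

    public static void main(String[] args) {
        // Invented sample entity: a service being saved with one categorization.
        String service =
              "<businessService serviceKey=\"\">"
            + "<name>Example service</name>"
            + "<categoryBag>"
            + "<keyedReference tModelKey=\"UUID:4E49A8D6-D5A2-4FC2-93A0-0411D8D19E88\""
            + " keyValue=\"US-TX\" keyName=\"Texas\" />"
            + "</categoryBag>"
            + "</businessService>";
        System.out.println(wrap(service));
    }
}
```

Because the schema uses a choice, a single message carries elements of one type only, although several of that type may appear together.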
Speed-start validation service
The Speed-start validation service helps ensure that entities published to the IBM UDDI Test Registry contain accessible Web services. The IBM UDDI Test Registry invokes the Speed-start validation service every time it receives a request to publish an entity containing a category bag with a reference to the developerWorks categorization tModel. The validation service will then check that all SOAP endpoints and WSDL documents provided in an entity are accessible on the World Wide Web.
The Speed-start validation service is an implementation of the UDDI V2 validate_values API described in the UDDI V2 specifications.
Before a publication request containing a reference to the developerWorks category system can succeed, the entities in the request must first pass validation by the Speed-start validation service. The key for the developerWorks category system is built into IBM Web service tooling, or can be found by issuing a find_tModel request for category systems on the IBM UDDI Test Registry. A reference to the developerWorks category system would appear as the following XML fragment (see Listing 3) in an entity:
Listing 3. Reference to the developerWorks category system
<categoryBag>
  ...
  <keyedReference tModelKey="UUID:8F497C50-EB05-11D6-B618-000629DC0A53" keyValue="Speed Start" keyName="Web service information for the developerWorks Speed Start community" />
  ...
</categoryBag>
The test registry invokes the Speed-start validation service by posting a SOAP message containing a validate_values request. Those requests contain a list of businessEntity, businessService, or tModel entities. All entities containing a categorization to the Speed-start tModel are validated by the validation service.
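The selection step described above can be sketched with the DOM parser that ships with the JDK. This is only an illustration, not the Speed-start implementation: the class SpeedStartFilterSketch and its methods are hypothetical, the UDDI V2 namespace is assumed, and comparing tModelKey values without regard to case is an assumption about how keys are matched.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

/** Illustrative filter for entities categorized with the Speed-start tModel. */
final class SpeedStartFilterSketch {

    static final String SPEED_START_KEY = "UUID:8F497C50-EB05-11D6-B618-000629DC0A53";
    static final String UDDI_V2_NS = "urn:uddi-org:api_v2"; // assumed V2 namespace

    /** Parses a SOAP message into a namespace-aware DOM document. */
    static Document parse(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            // SOAP messages carry no DTD, so reject DOCTYPE declarations outright.
            factory.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
            return factory.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalArgumentException("Request is not well-formed XML", e);
        }
    }

    /** Returns the top-level entities of validate_values that reference Speed-start. */
    static List<Element> entitiesToValidate(Document message) {
        List<Element> selected = new ArrayList<>();
        NodeList calls = message.getElementsByTagNameNS(UDDI_V2_NS, "validate_values");
        if (calls.getLength() == 0) {
            return selected;
        }
        NodeList children = calls.item(0).getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            if (children.item(i) instanceof Element) {
                Element entity = (Element) children.item(i);
                if (referencesSpeedStart(entity)) {
                    selected.add(entity);
                }
            }
        }
        return selected;
    }

    /** True if any keyedReference below the entity points at the Speed-start tModel. */
    static boolean referencesSpeedStart(Element entity) {
        // Simplification: this also matches keyedReferences outside a categoryBag.
        NodeList references = entity.getElementsByTagNameNS(UDDI_V2_NS, "keyedReference");
        for (int i = 0; i < references.getLength(); i++) {
            Element reference = (Element) references.item(i);
            if (SPEED_START_KEY.equalsIgnoreCase(reference.getAttribute("tModelKey"))) {
                return true;
            }
        }
        return false;
    }
}
```

A stricter version would restrict the search to keyedReference elements whose parent is a categoryBag.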
An entity is considered valid if all URLs contained in the entity are accessible on the World Wide Web. If a URL is determined to be invalid, the validation service sends a response containing an error code and message back to the test registry. As a result, the UDDI publish request fails and returns the error code and message from validation to the publisher. If validation succeeds, a success response is sent to the test registry, and the UDDI publish request can proceed.
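The two possible responses can be sketched as SOAP messages carrying a UDDI dispositionReport. The class DispositionReportSketch is hypothetical, and the details are assumptions to check against the UDDI V2 specification: success is reported as E_success with errno 0, a rejected value as E_invalidValue with errno 20200 inside a SOAP Fault, and the operator attribute names the responding party.

```java
import java.nio.charset.StandardCharsets;

/** Illustrative builder for validate_values responses; names are hypothetical. */
final class DispositionReportSketch {

    static final String SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/";
    static final String UDDI_V2_NS = "urn:uddi-org:api_v2";

    /** Success: the dispositionReport sits directly in the SOAP Body. */
    static String success(String operator) {
        return envelope(report(operator, 0, "E_success", ""));
    }

    /** Failure: the dispositionReport is the detail of a SOAP Fault. */
    static String invalidValue(String operator, String message) {
        return envelope("<soap:Fault>"
                + "<faultcode>soap:Client</faultcode>"
                + "<faultstring>Client Error</faultstring>"
                + "<detail>" + report(operator, 20200, "E_invalidValue", message) + "</detail>"
                + "</soap:Fault>");
    }

    private static String report(String operator, int errno, String errCode, String text) {
        return "<dispositionReport generic=\"2.0\" operator=\"" + escape(operator)
                + "\" xmlns=\"" + UDDI_V2_NS + "\">"
                + "<result errno=\"" + errno + "\">"
                + "<errInfo errCode=\"" + errCode + "\">" + escape(text) + "</errInfo>"
                + "</result></dispositionReport>";
    }

    private static String envelope(String body) {
        return "<?xml version=\"1.0\" encoding=\"UTF-8\"?>"
                + "<soap:Envelope xmlns:soap=\"" + SOAP_NS + "\">"
                + "<soap:Body>" + body + "</soap:Body></soap:Envelope>";
    }

    /** Escapes text for use in element content or a double-quoted attribute. */
    static String escape(String text) {
        return text.replace("&", "&amp;").replace("<", "&lt;")
                   .replace(">", "&gt;").replace("\"", "&quot;");
    }

    /** Encodes a response for writing to an HTTP output stream. */
    static byte[] toUtf8(String response) {
        return response.getBytes(StandardCharsets.UTF_8);
    }
}
```

The registry passes the error code and message from the failure response back to the publisher, so the message text should name the URL that could not be reached.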
Validation service implementation
The Speed-start validation service is implemented as a Web service. A Web service is a self-describing, self-contained, modular unit of application logic that provides some functionality to other applications through an Internet connection. Applications access Web services via Web protocols and data formats, such as HTTP and XML, with no need to worry about how each Web service is implemented.
The Validation service implements the UDDI V2 validate_values API. This implementation uses a Java servlet to handle validate_values requests. A servlet is a Java program that runs within a Web server such as WebSphere Application Server. Servlets receive and respond to requests from Web clients, usually across HTTP.
There are several ways to implement a service offering the validate_values API. One is to use the WSDL2Java code generator from Axis, which generates the data types from the UDDI schema and an interface that can serve as the basis for a concrete validate_values implementation (see Resources for a link to information about WSDL2Java). Depending on the naming parameters passed to the code generator, the generated validate_values code should resemble Listing 4.
Listing 4. Generated validate values code
// An interface will be created, for which a concrete implementation can be written.
// Methods of a java.rmi.Remote interface must declare RemoteException.
public interface UDDI_ValueSetValidation_PortType extends java.rmi.Remote
{
    public uddi.DispositionReport validate_Values(uddi.Validate_Values body)
        throws java.rmi.RemoteException;
}
The benefit of using this tooling is that the data types referenced by the interface are all created as Java classes as part of the code generation. At runtime, these data classes are instantiated and populated by Axis every time the validate_values API is invoked. As an example of the Java data structures, Listing 5 shows the uddi.vs.Validate_Values object without its get/set methods.
Listing 5. Partial uddi.vs.Validate_Values object
public class Validate_Values implements java.io.Serializable {
    private uddi.AuthInfo authInfo;
    private uddi.BusinessEntity[] businessEntity;
    private uddi.BusinessService[] businessService;
    private uddi.BindingTemplate[] bindingTemplate;
    private uddi.TModel[] tModel;
    private uddi.PublisherAssertion[] publisherAssertion;
    ...
}
Using the WSDL2Java tool allows for rapid development of a service without explicit handling of SOAP, HTTP or datatype mapping. The concrete implementation of the validate_values service would step through each of the datatype arrays and search for the data that is to be validated in the context of the top level entities, businessEntity, businessService, bindingTemplate, tModel or publisherAssertion.
Another approach to implementing a validation service without using a full SOAP engine is to create a Java class which extends the abstract class HttpServlet from J2EE. Since UDDI APIs need only respond to HTTP Post requests, the Validation service only implements the doPost method. Listing 6 shows some sample code to demonstrate this.
Listing 6. Sample code for an alternative approach
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import javax.servlet.ServletException;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;
import org.apache.xerces.parsers.SAXParser;
import org.xml.sax.Attributes;
import org.xml.sax.ContentHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// HttpServlet is already the superclass, so the SAX callbacks come from the
// ContentHandler interface rather than from the DefaultHandler class.
public class ValidatorService extends HttpServlet
    implements ContentHandler
{
    // Parser state. One servlet instance serves concurrent requests, so a
    // production version would keep this state per request.
    protected boolean insideOverviewURL;
    protected StringBuffer overviewURL;

    protected void doPost(HttpServletRequest req,
                          HttpServletResponse resp)
        throws ServletException, IOException
    {
        // Get the input stream carrying the XML request
        InputStream input = req.getInputStream();
        boolean validURLs = true;
        try
        {
            // Parse the XML request using SAX and a registered listener
            // to examine specific elements, or use a DOM parser
            // to create a full document
            SAXParser parser = new SAXParser();
            parser.setContentHandler(this);
            InputSource is = new InputSource(input);
            parser.parse(is);
            // Perform some contextual validation
        }
        catch (SAXException e)
        {
            validURLs = false;
        }
        // The content type is set on the response, not on the stream
        resp.setContentType("text/xml;charset=UTF-8");
        // Get the output stream
        OutputStream output = resp.getOutputStream();
        if (validURLs)
            // Write the SOAP message with the E_success report
            output.write(UTF8_SUCCESS_RESPONSE_BYTES);
        else
            // Write the SOAP message with the E_invalidValue report
            output.write(UTF8_INVALID_RESPONSE_BYTES);
    }
    ...
    // ContentHandler methods for SAX parsing (remaining callbacks elided)
    public void startElement(String uri, String localName,
                             String qName, Attributes attributes)
        throws SAXException
    {
        if (localName.equals("overviewURL"))
        {
            insideOverviewURL = true;
            overviewURL = new StringBuffer();
        }
    }

    public void characters(char[] ch, int start, int length)
        throws SAXException
    {
        if (insideOverviewURL)
        {
            overviewURL.append(ch, start, length);
        }
    }

    public void endElement(String uri, String localName, String qName)
        throws SAXException
    {
        if (insideOverviewURL)
        {
            insideOverviewURL = false;
            // Throw an exception that will cause parsing to stop
            // on any inaccessible overviewURL
            if (!validateURL(overviewURL.toString()))
                throw new SAXException("Invalid overviewURL");
        }
    }
}
The service uses Xerces to parse the XML request and build a Java representation of the parts of the request that need to be verified. In the above example, only elements with the local name "overviewURL" are examined. In the full validation service, the whole message is searched for UDDI entities containing a categorization to the Speed-start tModel, and all of their URLs are verified. The code snippet in Listing 7 demonstrates how the URLs are validated as Internet accessible.
Listing 7. How URLs are validated
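// Requires java.io.IOException, java.net.MalformedURLException,
// java.net.URL, and java.net.URLConnection.
// Returns true when the URL is well formed and a connection to it can be opened.
private boolean validateURL( String urlString )
{
    boolean valid = true;
    try
    {
        URL url = new URL( urlString );
        URLConnection connection = url.openConnection();
        connection.connect();
    }
    catch ( MalformedURLException exception )
    {
        // The string is not a syntactically valid URL
        valid = false;
    }
    catch ( IOException exception )
    {
        // The host could not be reached or the connection failed
        valid = false;
    }
    return valid;
}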
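Opening a connection confirms only that the host answers. It does not confirm that the document exists, and it can block for a long time on an unresponsive host. A stricter variant could bound the wait with timeouts and inspect the HTTP status. The sketch below is one possible refinement and not the Speed-start implementation: the class UrlCheckSketch, the use of a HEAD request, and the rule that 2xx and 3xx statuses count as accessible are all assumptions.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URL;

/** Illustrative stricter URL check; names and acceptance rule are assumptions. */
final class UrlCheckSketch {

    /** Treats success (2xx) and redirect (3xx) statuses as accessible. */
    static boolean isAcceptableStatus(int status) {
        return status >= 200 && status < 400;
    }

    /** Returns true if an HTTP(S) URL answers a HEAD request within the timeout. */
    static boolean isReachable(String urlString, int timeoutMillis) {
        try {
            URL url = new URL(urlString);
            String protocol = url.getProtocol();
            if (!protocol.equals("http") && !protocol.equals("https")) {
                return false; // only Web URLs are checked in this sketch
            }
            HttpURLConnection connection = (HttpURLConnection) url.openConnection();
            connection.setRequestMethod("HEAD");
            connection.setConnectTimeout(timeoutMillis);
            connection.setReadTimeout(timeoutMillis);
            try {
                return isAcceptableStatus(connection.getResponseCode());
            } finally {
                connection.disconnect();
            }
        } catch (IOException e) {
            // Covers malformed URLs, unknown hosts, refused connections, and timeouts
            return false;
        }
    }
}
```

Some SOAP endpoints answer only POST requests and reject HEAD, so a check like this suits WSDL documents better than service endpoints.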
Since this implementation is invoked every time an entity includes a reference to the developerWorks taxonomy, an inquiry for services containing a reference to that taxonomy should yield a subset of the registry data which is part of the Internet accessible Speed-start community.
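Such an inquiry could be expressed as a find_service message whose category bag names the developerWorks category system. The sketch below builds that message as a string. The class FindServiceMessageSketch is hypothetical, and the message shape (the generic value 2.0, the urn:uddi-org:api_v2 namespace, and omitting the businessKey attribute to search across all businesses) reflects an assumed reading of the UDDI V2 API that should be confirmed against the specification.

```java
/** Illustrative builder for a category-based find_service inquiry. */
final class FindServiceMessageSketch {

    static final String SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/";
    static final String UDDI_V2_NS = "urn:uddi-org:api_v2";

    /** Builds a find_service request that matches one keyed reference. */
    static String findByCategory(String tModelKey, String keyValue) {
        return "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n"
             + "<soap:Envelope xmlns:soap=\"" + SOAP_NS + "\">\n"
             + "  <soap:Body>\n"
             + "    <find_service generic=\"2.0\" xmlns=\"" + UDDI_V2_NS + "\">\n"
             + "      <categoryBag>\n"
             + "        <keyedReference tModelKey=\"" + escape(tModelKey)
             + "\" keyValue=\"" + escape(keyValue) + "\" />\n"
             + "      </categoryBag>\n"
             + "    </find_service>\n"
             + "  </soap:Body>\n"
             + "</soap:Envelope>";
    }

    /** Escapes the characters that are unsafe inside a double-quoted XML attribute. */
    static String escape(String text) {
        return text.replace("&", "&amp;").replace("<", "&lt;")
                   .replace(">", "&gt;").replace("\"", "&quot;");
    }

    public static void main(String[] args) {
        // Key and value taken from the developerWorks category reference in Listing 3.
        System.out.println(findByCategory(
                "UUID:8F497C50-EB05-11D6-B618-000629DC0A53", "Speed Start"));
    }
}
```

The message would be posted to the inquiry URL of the registry; the response lists the matching services.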
Using WebSphere Studio Web Services Explorer to publish to and query from the Speed-start community
Publishing Web services to the Speed-start community can be done using any UDDI aware tooling that allows the developerWorks category system to be added as a property of a service. The Web Services Explorer in WebSphere Studio is pre-configured to include the developerWorks category system. From the perspective of the publisher or the inquirer, differentiating data using externally checked or validated category systems is the same as using any other category system in UDDI.
For example, consider the Magic 8 ball Web service highlighted in the "Ask the magic eight ball" article on developerWorks (see Resources). This Web service is hosted on the Internet as a demonstration and is described using WSDL at the following URL: http://dwdemos.alphaworks.ibm.com/axis/services/urn:EightBall?wsdl
To register this Web service as part of the Speed-start community, it is only necessary to know the URL for the WSDL file and that the WSDL file and service endpoint are already hosted on the Internet.
To publish using the Web Services Explorer, choose File => Export... in WebSphere Studio. On the first page of the dialog that appears, select Web Service as the export destination. On the second page, select the IBM test registry, as shown in Figure 1.
Figure 1. Select the IBM test registry

The next step is to create the information about the business (or provider) that is offering the service by selecting the publish button in the Actions section of the screen, as shown in Figure 2.
Figure 2. Select the publish button

On the page that appears, fill in the user account information (an account can be obtained at https://uddi.ibm.com/testregistry/registry.html) and a name and description of the business or service provider.
The next and final step is to provide the service information and add the category entry indicating that this should be validated as part of the Speed-start community. This is done by selecting the publish service button in the Actions section, as shown in Figure 3.
Figure 3. Select publish service

When the form appears, select the Advanced option, provide a name and the URL of the WSDL file as shown in Figure 4.
Figure 4. Select the advanced option

To indicate that the service is to be part of the Speed-start community, add a category: select dwCommunity from the type drop-down box, click Browse... (see Figure 5), and select Speed-start from the taxonomy browsing window. After this information has been entered, click Go. The service is published once the Speed-start validation service has verified that it is Internet accessible.
Figure 5. Add service to the Speed-start community

To use the explorer to find all of the services that have been validated by the Speed-start validation service, select the find action, as shown in Figure 6.
Figure 6. Find all services that have been validated

When the find page appears, select Services from the Search for box, select Advanced as the type of search, add the category value for Speed-start as shown in the publication step, and click Go.
The results on the left hand side will be a list of Web services submitted as part of the Speed-start community, as shown in Figure 7.
Figure 7. List of Web services

This article has provided a general overview of categorization and of how it can be combined with validation services called by UDDI registries to provide a community, or screened set of results, according to criteria specific to a category system. The Speed-start community is a simple example of the power of contextual validation services, which can greatly improve the quality of results returned for queries that reference a particular category or identifier system. Using the information in this article, it should be possible to develop services, such as service quality or reference services, that greatly enhance the results from UDDI registries.
Resources
- Get everything you need to create and deploy Web services from Speed-start Web services on developerWorks.
- Find more information about "built-in" categorization systems.
- Read the UDDI4J section of the Web Services Publication and Discovery (PDF).
- Get more information about the Programmer's API.
- Download the UDDI V2 schema.
- See Axis Reference Guide for details on WSDL2Java.
- Read Doug Tidwell's "Ask the magic eight ball" (developerWorks, January 2003).
Matt Rutkowski works in IBM's Emerging Technologies Group and is currently working on the Universal Description, Discovery and Integration (UDDI) Version 3 implementation for IBM. He is an aficionado of computer graphics and rendering and a proud alumnus of Ohio State (home of the 2002 National College Football Championship Buckeyes). Feel free to e-mail Matt at mrutkows@us.ibm.com.
Andrew Hately works in IBM's Emerging Technologies Group and is currently working on the implementation of UDDI Version 3 and the IBM UDDI Business Registry (http://uddi.ibm.com/). He also represents IBM in the OASIS UDDI Specification Technical Committee. You can contact Andrew at hately@us.ibm.com.
Robert Chumbley works in IBM's Emerging Technologies Group and is currently working on the Self-Voicing Development Kit (SVDK). He has spent the past couple years working on the development of the IBM UDDI Business Registry. Prior to that, he graduated from THE university in Texas, also known as Texas A&M. You can contact Robert at chumbley@us.ibm.com.



