Service Registry with Advanced Search Capability

Part 2: Implementation

Learn how to implement four core components


In Service-Oriented Architecture (SOA), a service registry is used to publish and discover services. In a classic registry, the name supplied in a search must exactly match the name stored in the registry in order to discover the service. Part 1 explained why this exact-match restriction should be removed, and why a registry therefore needs an advanced search capability that finds a service even when its exact name is not known. Part 1 also described the process and the components needed to provide such a capability. The four components identified there were the Classic Registry, Name Parser, Name Composer, and Dictionary.

In this second installment, you will learn how to implement these four components. Each implementation is described in turn below.

Classic Registry

The two most important functions that a classic registry provides are the publication of services and the search/discovery of published services. In this series we are interested only in the search/discovery feature. Almost all commercially available registries, including WebSphere Service Registry and Repository (WSRR), provide some form of search/discovery that returns the service description, which includes the service interface and implementation, when the user supplies the exact name or ID of the service. We call a registry with this limited search/discovery feature a classic registry. This functionality is summarized pictorially in Figure 1. Because all of the commercially available registries provide it, we do not discuss its implementation in this article. Instead, our focus is on how to build on top of this functionality to provide a registry with advanced search capability.

Figure 1. Search Functionality of a classic registry

Name Parser

The Name Parser component takes as input the name of a service, which is usually composed of several words, and returns a list of the words that constitute that name. For example, if the input name is GetCarPrice, the output is a list containing the words get, car, and price. This functionality of the Name Parser component is shown schematically in Figure 2.

Figure 2. The functionality of the Name Parser Component

The implementation of the Name Parser component exploits the conventions used in naming a service. For example, one common convention uses upper case letters to mark the start of each word in the name of the service. An example of such a service name is GetCarPrice. In this name, the three constituent words, get, car, and price, each begin with an upper case letter: G, C, and P. This convention closely follows the naming convention of the Java programming language. By exploiting such a convention, it is easy to parse the service name. For example, Listing 1 shows sample code that parses a service name that follows the Java naming convention and returns the constituent words as a list (an ArrayList).

Listing 1. Java code for parsing the name of a service
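import java.util.ArrayList;

public class NameParser
{
	// Splits a service name that follows the Java naming convention
	// (for example, GetCarPrice) into its constituent words.
	// The words keep the capitalization they have in the name.
	public static ArrayList<String> parseName (String serviceName)
	{
		ArrayList<String> list = new ArrayList<String> ();
		String constituentWord = "";

		for (int k = 0; k < serviceName.length (); k++)
		{
			char c = serviceName.charAt (k);

			if (!Character.isUpperCase (c) || k == 0)
			{
				// Not an upper case letter (or the very first character):
				// extend the current word.
				constituentWord = constituentWord + c;
			}
			else
			{
				// Upper case letter: store the finished word and start a new one.
				list.add (constituentWord);
				constituentWord = "" + c;
			}
		}

		// Store the last word.
		list.add (constituentWord);

		return list;
	}
}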

Another common convention for names follows the naming convention of the C/C++ programming languages. In this convention, the words in the name of a service are separated by underscores. Code very similar to that shown in Listing 1 can be used to parse names that use underscores to separate words. In fact, it is straightforward to generalize the code so that it handles both of these naming conventions as well as other similar conventions.
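One possible generalization is sketched below. It is not the authors' implementation: the class name GeneralNameParser is hypothetical, treating the hyphen as a separator is an assumption, and, unlike Listing 1, the words are converted to lower case so that the result matches the get, car, price form used in the text.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical generalization of Listing 1 (illustrative names only).
class GeneralNameParser {

    // Splits a service name at underscores, at hyphens (an assumed extra
    // separator), and before each upper case letter.
    static List<String> parseName(String serviceName) {
        List<String> words = new ArrayList<>();
        StringBuilder current = new StringBuilder();

        for (int k = 0; k < serviceName.length(); k++) {
            char c = serviceName.charAt(k);
            boolean separator = (c == '_' || c == '-');

            // A separator or an upper case letter closes the word collected so far.
            if ((separator || Character.isUpperCase(c)) && current.length() > 0) {
                words.add(current.toString().toLowerCase(Locale.ROOT));
                current.setLength(0);
            }
            // Separators are dropped; every other character joins the current word.
            if (!separator) {
                current.append(c);
            }
        }
        // Store the last word, if any.
        if (current.length() > 0) {
            words.add(current.toString().toLowerCase(Locale.ROOT));
        }
        return words;
    }
}
```

With this sketch, GetCarPrice, get_car_price, and Get_Car_Price all yield the same three words. Names that contain acronyms, such as getURL, would be split letter by letter, so a production parser would likely need additional rules.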

Name Composer

The basic functionality provided by the Name Composer component is the reverse of that of the Name Parser component. It takes as input a list of words and combines them in the proper order to produce a possible service name. For example, if the input is a list containing the words get, car, and price, the output is the possible service name GetCarPrice. This basic functionality is shown schematically in Figure 3. In practice, the Name Composer does a little more than this. Instead of taking a single list of words as input, it takes several lists of words, one for each position in the name, and returns a list of all possible service names.

Figure 3. The basic functionality of the Name Composer Component

Listing 2 shows example Java code that can be used to compose equivalent names of a service. The code takes up to three lists of words as input. Each list contains the synonyms of one word in the original name of the service. The output is a list (an ArrayList) of equivalent names of the service. Although this particular code is limited to a maximum of three words in the service name, it can easily be generalized to a larger number of constituent words.

Listing 2. Java code for composing equivalent names of a service
import java.util.ArrayList;

public class NameComposer
{
	// Combines up to three lists of synonyms into all possible service names.
	// A null or empty list2 (or list3) means that the name has fewer words.
	public ArrayList<String> composeNames (ArrayList<String> list1, ArrayList<String> list2, ArrayList<String> list3)
	{
		String serviceName = null;
		String word1 = null;
		String word2 = null;
		String word3 = null;
		int size1 = 0;
		int size2 = 0;
		int size3 = 0;
		ArrayList<String> serviceNames = new ArrayList<String> ();

		// A missing list is treated as an empty list.
		if ( list1 != null )
			size1 = list1.size ();

		if ( list2 != null )
			size2 = list2.size ();

		if ( list3 != null )
			size3 = list3.size ();


		for  (int i = 0; i < size1;  i++)
		{
			word1 = upperFirstChar (list1.get(i));

			// One-word names: no second list was supplied.
			if ( size2 == 0)
			{
				serviceNames.add (word1);
				continue;
			}

			for ( int j = 0;  j < size2;  j++)
			{
				word2 = upperFirstChar (list2.get(j));

				// Two-word names: no third list was supplied.
				if ( size3 == 0)
				{
					serviceNames.add (word1 + word2);
					continue;
				}
				for ( int k = 0;  k < size3;  k++)
				{
					word3 = upperFirstChar (list3.get(k));
					serviceName = word1 + word2 + word3;
					serviceNames.add (serviceName);
				}
			}
		}

		return serviceNames;
	}

	// Returns the word with its first character converted to upper case.
	private String upperFirstChar (String word)
	{
		if (word.isEmpty ())
			return word;

		char c = Character.toUpperCase (word.charAt (0));
		StringBuffer sbuff = new StringBuffer (word);
		sbuff.setCharAt (0, c);
		return sbuff.toString ();
	}
}
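
The generalization to any number of constituent words can be sketched as follows. This is an illustration rather than the authors' code: the class name GeneralNameComposer is hypothetical, and skipping null or empty lists is an assumption that differs slightly from Listing 2, which stops at the first empty list.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical generalization of Listing 2 (illustrative names only).
class GeneralNameComposer {

    // Takes one list of synonyms per word position and returns every
    // combination, in the same order that the nested loops of Listing 2 produce.
    static List<String> composeNames(List<List<String>> synonymLists) {
        List<String> names = new ArrayList<>();
        names.add("");                      // seed: a single empty prefix
        boolean anyListUsed = false;

        for (List<String> synonyms : synonymLists) {
            // Assumption: a null or empty list is skipped.
            if (synonyms == null || synonyms.isEmpty()) {
                continue;
            }
            // Extend every name built so far with every synonym for this position.
            List<String> extended = new ArrayList<>();
            for (String prefix : names) {
                for (String word : synonyms) {
                    extended.add(prefix + upperFirstChar(word));
                }
            }
            names = extended;
            anyListUsed = true;
        }

        // No usable input lists: return an empty result rather than the seed.
        if (!anyListUsed) {
            names.clear();
        }
        return names;
    }

    // Returns the word with its first character converted to upper case.
    private static String upperFirstChar(String word) {
        if (word.isEmpty()) {
            return word;
        }
        return Character.toUpperCase(word.charAt(0)) + word.substring(1);
    }
}
```

For the lists (get, fetch), (car, automobile), and (price), this sketch returns GetCarPrice, GetAutomobilePrice, FetchCarPrice, and FetchAutomobilePrice. The number of names is the product of the list sizes, so long names with many synonyms per word can produce a large number of candidates.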

Dictionary

A dictionary, in the present context, is a component that takes a word as input. Its output is a list of words that have the same or a similar meaning as the original word. For example, if the input is the word get, the output is a list containing the words get, fetch, and obtain. This functionality of the Dictionary component is shown schematically in Figure 4.

Figure 4. Functionality of Dictionary component

We can categorize dictionary components according to the contents of the dictionary:

  • a restricted dictionary that is specific to one organization, such as a large corporation or enterprise
  • a more general, industry-specific dictionary that caters to a particular type of industry, for example, automobile dealers
  • a very general dictionary that can be used across the board by any organization

There are two major ways of implementing a dictionary component: (1) a relational database based dictionary and (2) a file based dictionary. We now describe each of these two methods.

Relational Database Based Dictionary

A relational database stores information in tables. Each row in a table stores a set of related data, and the data stored in a row is called a record. To implement a dictionary for our system using a relational database, each word and its synonyms are stored in one row. For increased search performance, the first column is indexed. For example, to store the word car and its two synonyms, automobile and vehicle, the table contains three rows, as shown in Table 1. Each of these rows contains the same set of words, but each row starts with a different word. It is important to note one limitation of the database dictionary: the maximum number of synonyms must be specified when the table is created. As we will see later, the file-based dictionary overcomes this limitation.

Table 1. Relational Database Table containing words and their synonyms
Word         Synonym 1    Synonym 2
Car          Automobile   Vehicle
Automobile   Car          Vehicle
Vehicle      Automobile   Car

The working of this type of dictionary is best explained by the flow diagram shown in Figure 5. First, use SQL or an SQL-based tool to create a table with enough columns to hold the largest number of synonyms of any word. Each column must be assigned enough storage to hold the longest synonym. Then use SQL to enter each word and its synonyms into the table. The number of rows entered for a group of synonyms equals the number of words in the group (the word plus its synonyms), and each of these rows starts with a different word of the group. On receiving a request from a client for the synonyms of a given word, formulate an SQL query that searches for that word in the first column. After executing the query against the database, return the result as a list, or null if no synonym is found.

Figure 5. Process flow for database based dictionary
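
The steps above might be sketched with JDBC as follows. Everything named here is an assumption made for illustration: the class DatabaseDictionary, the table name SYNONYMS, the column names and their VARCHAR(64) size, and the limit of two synonyms per word taken from Table 1. The primary key on the first column provides the index mentioned earlier. The Connection is supplied by the caller, so no particular database product is implied.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical database-backed dictionary (illustrative names only).
class DatabaseDictionary {

    // Assumed schema mirroring Table 1: one word plus at most two synonyms.
    static final String CREATE_TABLE =
        "CREATE TABLE SYNONYMS (WORD VARCHAR(64) NOT NULL PRIMARY KEY, "
        + "SYNONYM1 VARCHAR(64), SYNONYM2 VARCHAR(64))";

    static final String LOOKUP =
        "SELECT WORD, SYNONYM1, SYNONYM2 FROM SYNONYMS WHERE WORD = ?";

    // Builds the rows to insert for one group of synonyms: one row per word,
    // each row starting with a different word of the group.
    static List<List<String>> rowsFor(List<String> group) {
        List<List<String>> rows = new ArrayList<>();
        for (String first : group) {
            List<String> row = new ArrayList<>();
            row.add(first);
            for (String other : group) {
                if (!other.equals(first)) {
                    row.add(other);
                }
            }
            rows.add(row);
        }
        return rows;
    }

    // Returns the word and its synonyms, or null if the word is not stored.
    static List<String> lookup(Connection connection, String word) throws SQLException {
        try (PreparedStatement statement = connection.prepareStatement(LOOKUP)) {
            statement.setString(1, word);
            try (ResultSet result = statement.executeQuery()) {
                if (!result.next()) {
                    return null;
                }
                List<String> synonyms = new ArrayList<>();
                for (int column = 1; column <= 3; column++) {
                    String value = result.getString(column);
                    if (value != null) {        // unused synonym columns are NULL
                        synonyms.add(value);
                    }
                }
                return synonyms;
            }
        }
    }
}
```

Whether the comparison in the WHERE clause is case sensitive depends on the database and its collation, so an implementation would probably store and query words in one agreed case.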

File Based Dictionary

In this implementation, the dictionary data is stored on the computer's hard disk as a file. Each line in the file contains a word and its synonyms, separated by a space character or some other known token. Thus, to store the word car and its two synonyms, three lines are used, each line starting with a different word, as shown in Listing 3.

Listing 3. Example file containing words and their synonyms
Car Automobile Vehicle
Automobile Car Vehicle
Vehicle Automobile Car
...
...

The working of this type of dictionary is best explained by the flow diagram shown in Figure 6. First, use a text editor to create a file that contains each common word and its synonyms on a single line, separated by a known token, and store the file on the computer's hard disk. Second, when the dictionary application or subprogram starts, read each line of the file into memory as a character string. Then parse each character string into individual words and store them in memory as a list. Copy these lists into a map, with each list identified by its first word (the key). On receiving a request for the synonyms of a given word, search the map using the given word as the key. If a list corresponding to the input word is found, return it to the client; otherwise, return null.

Figure 6. Process flow for file based dictionary
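
A minimal sketch of this flow is shown below. The class name FileDictionary is hypothetical, whitespace is assumed to be the separating token, and keys are converted to lower case (an assumption) so that a lookup for car finds the line that starts with Car.

```java
import java.io.BufferedReader;
import java.io.Reader;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical file-backed dictionary (illustrative names only).
class FileDictionary {

    // Maps the first word of each line (lower-cased) to the whole line as a list.
    private final Map<String, List<String>> synonymsByWord = new HashMap<>();

    // Reads one group of synonyms per line from the given source.
    // Read errors surface as java.io.UncheckedIOException.
    FileDictionary(Reader source) {
        new BufferedReader(source).lines()
            .map(String::trim)
            .filter(line -> !line.isEmpty())          // ignore blank lines
            .forEach(line -> {
                List<String> words = Arrays.asList(line.split("\\s+"));
                synonymsByWord.put(words.get(0).toLowerCase(Locale.ROOT), words);
            });
    }

    // Returns the word and its synonyms, or null if the word is not in the file.
    List<String> lookup(String word) {
        return synonymsByWord.get(word.toLowerCase(Locale.ROOT));
    }
}
```

A dictionary could then be created with, for example, new FileDictionary(java.nio.file.Files.newBufferedReader(java.nio.file.Paths.get("synonyms.txt"))), where synonyms.txt is a placeholder file name, and queried with lookup("car"). Because each line becomes a list of whatever length it happens to have, no maximum number of synonyms needs to be fixed in advance.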

Note that in the database dictionary, the maximum number of synonyms is fixed when the database schema is designed. No such restriction applies to the file-based dictionary, where any number of synonyms can be stored on a line. The length of each column in the database dictionary is also fixed at schema design time, which restricts the size of each synonym to that maximum. Again, no such restriction applies to file-based storage: each synonym can be as long as desired. Another difference is that the database dictionary needs a tool, possibly a custom one, to put the synonyms into the database. The file-based dictionary needs no such tool, because any text editor can be used.

Conclusion

In this installment we have described the implementations of the four core components of an SOA registry with advanced search capability. The advanced search capability allows a service's development-time and run-time information to be discovered in a registry even when the exact name of the service is not known. It is important to note that the concepts and the implementations of these core components are covered by two pending IBM patents. In the last installment, part 3 of this series, we will describe the various configurations that are possible for deploying these components as a single application or as multiple applications, running on a single server or on multiple servers. In part 3, we will also describe the implementation of a fifth component, called the Controller, which may be needed for some configurations. The Controller component controls how the four core components work together to provide the required advanced search capability.


ArticleTitle=Service Registry with Advanced Search Capability: Part 2: Implementation
publish-date=01242010