Level: Intermediate
Boris Lublinsky, Enterprise Architect, Freelance Consultant
15 May 2007
What a difference a few letters can make: Service repositories and service registries may sound alike, but each plays a very distinct role in an SOA implementation. In this article, discover the differences between the two and why your SOA should include both.
Introduction: The linchpin(s) of SOA implementation
Many recent publications [Longworth, Clement, Seeley] describe registry/repository functionality as a linchpin of any Service-Oriented Architecture (SOA) implementation. For example, David Longworth writes in his 2005 Loosely Coupled article: "How registry/repository is implemented can spell the difference between success and failure in an enterprise SOA" [Longworth]. He attributes the following capabilities to a registry/repository:
- Developer efficiency. It allows an organization to keep track of existing services and service-related artifacts and consequently allows for reuse of existing enterprise service-related assets.
- Runtime governance. It allows an organization to virtualize and centrally control service endpoint addresses.
According to Luc Clement (also writing in Loosely Coupled), omission of registry functionality typically causes the following problems [Clement]:
- Lack of application consistency and integrity, as well as the lack of governance that can ensure consistency and integrity of design and development of enterprise services
- Difficulty in relating and reusing application functionality, including the inability to locate existing services that solve specific business problems and support required processes
- Proprietary, difficult-to-maintain software, as well as direct service invocation that creates tight location coupling between service consumers and providers
Unfortunately, the majority of these publications mix the runtime and design aspects of the service repository and registry and treat them as a single system. In reality, the two play very different roles in SOA implementations.
In this article you learn about the roles of both the repository and registry and the way each of them is used in SOA implementations.
Service repository
The enterprise-wide nature of SOA [Lublinsky] requires a significant amount of collaboration between design and implementation teams.
Analysts designing business processes need to find services providing functionality required by these processes.
Finding a service best suited for solving a particular business problem is critical for successful process implementation.
However, functionality alone doesn't guarantee service applicability.
Questions about nonfunctional requirements, for example, service-level agreements (SLA) or security requirements, help potential service users assess whether a set of services picked for a given solution will work together.
Finally, service dependencies have to be addressed. These considerations allow the analyst to evaluate the solution's overall complexity and the impact of potential changes induced by using the solution.
Technical people are equally interested in the portfolio of existing services, although their interest comes from a different angle.
Application architects focus on service design and implementation, such as whether to build a particular service on top of an existing
application or to introduce a business rules engine for implementation.
Addressing these concerns enables service developers to build a more robust, easier to maintain service implementation.
Infrastructure architects focus on access and procedures, such as whether the service should be placed inside or outside the demilitarized zone (DMZ).
These considerations allow them to define the infrastructure requirements for the deployment of services and service consumers.
Another area of concern is service utilization, such as peak and average number of service invocations.
This information allows architects to proactively manage service capacity and make decisions for service retirement.
In summary, stakeholders throughout the enterprise, each with different perspectives and objectives, are interested in a wide range of service-related information. Ad hoc distribution schemes for this information (for example, spreadsheets) may suffice for managing a handful of services used by a small community. However, they do not scale to meet the objectives of SOA (such as reusability and consistency). Static HTML pages providing service information rapidly become out of date. Similarly, the "call the architect" approach to communicating knowledge about available and planned services turns critical resources into information bottlenecks.
In other words, all service-related artifacts in an enterprise-wide SOA implementation become enterprise-wide assets and require a centralized asset repository that stores all of the above information. The repository should also provide cooperation capabilities (the ability to search, modify, and so on) to all of the SOA stakeholders. Such a service repository integrates all of the sources of service-related information, including design artifacts, runtime topologies, information collected by service monitoring and management solutions, the service code repository, and so on. It provides a unified representation of all of this information, allowing all of the SOA stakeholders to access it centrally, based on their job functions.
See Figure 1.
Figure 1. Basic service repository interactions
A service repository provides the information required to support the complete service life cycle, from inception through design, implementation, deployment, usage, and maintenance, as illustrated in Figure 2.
Figure 2. Simplified service life cycle
During system decomposition, business analysis identifies the requirements for new services. These requirements are evaluated against the functionality of the existing ones and the new services are inserted in the repository. As these services get approved, appropriate service contracts are created and stored in the repository as well.
At this point the service moves into implementation. The development team becomes responsible for creating and maintaining service implementation artifacts and storing them in the repository. Once the implementation is completed and tested, the service is deployed into production. Next, the repository information is enhanced with the deployment information.
During normal service usage, utilization metrics, including information about service consumers and their locations, are periodically imported into the repository from service management and monitoring systems. Over time, service usage defects and additional requirements are identified and captured in the repository. After appropriate approval, they are translated into service enhancements or new service versions, which are captured in the repository as well.
The essential capabilities of a service repository include the following:
Service cataloging and discovery.
The main purpose of the service repository is to provide the ability to find artifacts, based on artifact-specific metadata. This metadata is typically contained in the artifacts themselves. Consequently, a service repository should automatically extract this metadata (based on the cataloging policies) whenever new artifacts are published to the repository. For example, the following information can be automatically captured during service definition cataloging:
- Links to the auxiliary documents imported by the service definition document (such as XML schema documents and messaging semantics definitions)
- XML namespaces used by the service contract documents
- Name and description of the interfaces and the XML types used by the service contract
- Links to the policies governing service invocation and execution
Implementation of cataloging requires definition of the metadata for every artifact placed in the repository. The metadata must be rich and flexible enough to support the different types of service artifacts stored in the repository, as well as the evolution of each artifact. The metadata stored in the service repository needs to include all the information that business analysts and service designers use to discover existing services and decide on their applicability to a given solution. Thus, the repository should provide discovery capabilities that are extensible and can accommodate a wide range of domain-specific discovery queries. This requirement typically translates into a repository's ability to support multiple business-related taxonomies.
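As a rough illustration of automatic metadata extraction, the following Python sketch catalogs a WSDL 1.1 document using only the standard library. The sample document, the `catalog_wsdl` function, and the shape of the returned dictionary are assumptions made for this example; a real repository would apply its own cataloging policies, and links to invocation policies are omitted here.

```python
import io
import xml.etree.ElementTree as ET

WSDL_NS = "http://schemas.xmlsoap.org/wsdl/"
XSD_NS = "http://www.w3.org/2001/XMLSchema"

# Hypothetical service definition used only for this illustration.
SAMPLE_WSDL = """<definitions name="QuoteService"
    targetNamespace="http://example.com/quote"
    xmlns="http://schemas.xmlsoap.org/wsdl/"
    xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <import namespace="http://example.com/policy" location="policy.wsdl"/>
  <types>
    <xsd:schema targetNamespace="http://example.com/quote/types">
      <xsd:import namespace="http://example.com/common"
                  schemaLocation="common.xsd"/>
      <xsd:element name="QuoteRequest" type="xsd:string"/>
    </xsd:schema>
  </types>
  <portType name="QuotePort">
    <documentation>Returns insurance quotes.</documentation>
    <operation name="getQuote"/>
  </portType>
</definitions>
"""


def catalog_wsdl(wsdl_text):
    """Extract catalog metadata from a WSDL 1.1 document (illustrative only)."""
    data = wsdl_text.encode("utf-8")

    # Namespace declarations are only exposed as parser events.
    events = ET.iterparse(io.BytesIO(data), events=("start-ns",))
    namespaces = sorted({uri for _, (_prefix, uri) in events})

    root = ET.fromstring(data)

    # Auxiliary documents imported by the definition (WSDL and XSD imports).
    imports = [e.get("location") for e in root.iter(f"{{{WSDL_NS}}}import")]
    imports += [e.get("schemaLocation") for e in root.iter(f"{{{XSD_NS}}}import")]

    # Interface (portType) names with their descriptions.
    interfaces = {}
    for port_type in root.iter(f"{{{WSDL_NS}}}portType"):
        text = port_type.findtext(f"{{{WSDL_NS}}}documentation", default="")
        interfaces[port_type.get("name")] = text.strip()

    # Names of the XML schema elements declared by the contract.
    schema_elements = [e.get("name") for e in root.iter(f"{{{XSD_NS}}}element")]

    return {
        "name": root.get("name"),
        "target_namespace": root.get("targetNamespace"),
        "namespaces": namespaces,
        "imports": imports,
        "interfaces": interfaces,
        "schema_elements": schema_elements,
    }


metadata = catalog_wsdl(SAMPLE_WSDL)
```

The resulting dictionary is the kind of record a repository could index so that analysts can search by interface name, namespace, or imported schema.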
Validation.
Finding artifacts that do not adhere to the enterprise standards does not provide any
value. As the point of access to service-related information, the service repository
should enforce organizational and domain-specific business rules, ensuring conformance of
these artifacts to the enterprise policies and standards. This ability to enforce
validation rules makes the repository a focal part of SOA governance.
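One way to picture validation at publishing time is a repository that runs every configured rule before it accepts an artifact. The sketch below is a minimal Python illustration; the `Repository` class, the rule helpers, and the metadata field names are all hypothetical and do not correspond to any particular product.

```python
class ValidationError(Exception):
    """Raised when an artifact violates one or more publishing rules."""


def require_fields(*fields):
    """Build a rule that rejects artifacts with missing or empty metadata fields."""
    def rule(artifact):
        missing = [f for f in fields if not artifact.get(f)]
        return f"missing metadata: {', '.join(missing)}" if missing else None
    return rule


def namespace_under(prefix):
    """Build a rule that enforces an enterprise namespace convention."""
    def rule(artifact):
        namespace = artifact.get("namespace", "")
        if namespace.startswith(prefix):
            return None
        return f"namespace must start with {prefix}"
    return rule


class Repository:
    """In-memory stand-in for a repository that validates on publish."""

    def __init__(self, rules):
        self.rules = list(rules)
        self.artifacts = {}

    def publish(self, artifact):
        # Each rule returns None on success or an error message on failure.
        errors = [msg for msg in (rule(artifact) for rule in self.rules) if msg]
        if errors:
            raise ValidationError("; ".join(errors))
        self.artifacts[artifact["name"]] = artifact


repository = Repository([
    require_fields("name", "owner"),
    namespace_under("http://example.com/"),
])
repository.publish({
    "name": "QuoteService",
    "owner": "claims-team",
    "namespace": "http://example.com/quote",
})
```

Keeping rules as small independent functions lets an organization add domain-specific checks without changing the publishing code.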
Dependency management.
Service-related information typically includes multiple interrelated artifacts, such as service interfaces, message schemas, implementation code, usage profiles, and so on. In addition, the services themselves can be reused by other services or business processes. As the number of services grows, tracking all these dependencies and evaluating the impact of changes becomes a difficult task.
The service repository can simplify this task by supporting the management of relationships between service artifacts. The repository should provide standard relationship types; it should also allow the organization to extend these types with additional ones based on its own requirements.
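Impact analysis over such relationships amounts to walking the dependency graph in reverse. The following Python sketch assumes a hypothetical `DependencyGraph` class and made-up relationship type names; it shows one possible way to answer "what is affected if this artifact changes?"

```python
from collections import defaultdict, deque


class DependencyGraph:
    """Typed relationships between artifacts, stored for reverse lookup."""

    def __init__(self):
        # target artifact -> set of (dependent artifact, relationship type)
        self._dependents = defaultdict(set)

    def relate(self, source, relationship, target):
        """Record that `source` depends on `target` through `relationship`."""
        self._dependents[target].add((source, relationship))

    def impacted_by(self, artifact):
        """Return every artifact that directly or transitively depends on `artifact`."""
        seen = set()
        queue = deque([artifact])
        while queue:
            current = queue.popleft()
            for source, _relationship in self._dependents.get(current, ()):
                if source not in seen:
                    seen.add(source)
                    queue.append(source)
        seen.discard(artifact)  # guard against cycles that lead back to the start
        return seen


graph = DependencyGraph()
graph.relate("QuoteService", "uses-schema", "quote.xsd")
graph.relate("QuoteProcess", "invokes", "QuoteService")
graph.relate("BillingService", "uses-schema", "billing.xsd")

affected = graph.impacted_by("quote.xsd")
```

Because the relationship type is just a label on each edge, an organization can introduce its own types without changing the traversal.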
Service evolution and versioning.
Once created, services typically evolve over time. This evolution can be caused by changes in the service functionality, messaging semantics, and implementation. Many of these changes will require creation and deployment of a new version of the service. In order to track all of this versioning information, the service repository should provide versioning capabilities for all service artifacts, regardless of their type.
Additionally, the service repository should provide a subscription-based change and versioning notification capability, allowing interested parties, such as service consumer development teams, to be notified about upcoming and current changes. Such a subscription mechanism should allow subscribers to specify the types of events that are of interest, which prevents them from being flooded with notifications.
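The filtering idea can be sketched in a few lines of Python. The `ChangeNotifier` class and the event type names below are hypothetical; the point is only that a subscriber registers for specific event types on a specific artifact and hears nothing else.

```python
from collections import defaultdict


class ChangeNotifier:
    """Delivers change events only to subscribers that asked for that event type."""

    def __init__(self):
        # (artifact name, event type) -> list of callbacks
        self._subscriptions = defaultdict(list)

    def subscribe(self, artifact, event_types, callback):
        for event_type in event_types:
            self._subscriptions[(artifact, event_type)].append(callback)

    def publish(self, artifact, event_type, detail):
        """Notify matching subscribers and return how many were notified."""
        callbacks = self._subscriptions.get((artifact, event_type), [])
        for callback in callbacks:
            callback(artifact, event_type, detail)
        return len(callbacks)


inbox = []
notifier = ChangeNotifier()
# A consumer team cares about new versions and retirements, not documentation edits.
notifier.subscribe(
    "QuoteService",
    {"new_version", "retirement"},
    lambda artifact, event_type, detail: inbox.append((artifact, event_type, detail)),
)

notifier.publish("QuoteService", "documentation_updated", "typo fix")  # filtered out
notifier.publish("QuoteService", "new_version", "2.0")                 # delivered
```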
Artifact publishing governance.
As the service repository becomes a centralized collection of all of the information about the enterprise's service-related assets, it requires the same governance as any other enterprise asset repository. This type of governance typically includes permissions for publishing service-related artifacts and approval processes for artifact publishing.
Support for multiple artifact types.
One of the main challenges in creating a service repository is the great diversity of service-related artifacts, including XML documents that define service interfaces and messaging schemas, implementation code, UML diagrams, and text documents. The use of a generic representation for the different asset types can significantly simplify the repository implementation.
(See "Reusable Asset Specification" below.)
Reusable Asset Specification
The OMG Reusable Asset Specification (RAS) [Larsen] covers the generalized representation of different asset types. RAS defines an asset as a collection of related artifacts that provide a solution to a problem.
An asset may represent a complete solution, including requirements, use cases, design models, component specifications, components, test cases, test drivers, and test data. Or, it may be just a set of use cases and their models and the rules for extending the use cases.
A good asset has the following characteristics:
- It should be easy to use, customize, and apply to another context.
- It should possess the characteristics of good software engineering: tight cohesion, loose coupling, and sufficient capabilities.
- Its purpose and intent should be easy to understand.
- It should be easy to conduct fit analysis to determine the asset's match to a particular context.
To achieve these goals, an asset needs to be more than just a collection of runtime artifacts (such as code and components); it should also include artifacts that explain goals, purpose, motivation, and assumptions. In many cases, these are best captured as subsets of the original requirements and the vision-related artifacts used in the creation of the asset’s runtime elements.
RAS describes assets using metadata captured in the form of an XML manifest provided as part of the asset’s packaging, shown in Figure 3.
The manifest contains at least the asset specification, including attributes such as name, version, and description. An asset specification can be extended through classification, expressed as a set of simple name/value descriptors, and through the declaration of contexts (such as a specific development or deployment context). The asset’s payload consists of a collection of artifacts addressing a particular problem. The usage section provides guidance on applying and customizing the asset. Finally, the related assets section defines the asset's relationships to other assets and helps to create collections or families of assets that form larger-grained solutions.
There are many types of assets, each represented by a different RAS profile. Asset types are extensible to support customization for particular needs. This customization is accomplished through profiles, which preserve the core structure of RAS but specify profile-specific extensions.
Search engines and repositories can use the manifest file to discover the contents of an asset, its classification, its related assets, and so on.
Service registry
A service is created in order to be invoked by consumers that require functionality implemented by that service.
Invoking the service requires knowledge of its location; that is, the service endpoint address.
In the simplest case, the endpoint address could be hard-coded in a service consumer’s implementation.
(This is how Web service consumers are generated in both Java™ and .NET® environments, based on the service’s Web Services Description Language (WSDL) definition. As a result, it should come as no surprise that many current systems use hard-coded endpoint addresses.)
This approach tightly couples the service consumer’s implementation and the service location (location coupling).
Figure 4. Direct invocation of services by consumers
Accommodating service endpoint address changes in such tightly coupled implementations (illustrated in Figure 4) requires modifications to the service consumers' implementations.
This is error prone and scales poorly as the number of services grows. Accounting for multiple environments (development, testing, quality assurance, production) only compounds the problem.
Externalizing the endpoint addresses into configuration files offers a potential improvement. This approach is more flexible because it removes endpoint addresses from the consumer’s code, which allows service consumers to accommodate address changes without code changes. However, this option also runs into scalability problems as the numbers of consumers and services (and consequently of configuration files) grow.
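A minimal sketch of the configuration-file approach, written in Python with the standard `configparser` module, might look like the following. The section names, service name, and URLs are invented for illustration, and the configuration is read from a string here only to keep the example self-contained.

```python
import configparser

# Hypothetical per-environment endpoint configuration; in practice this
# text would live in a file deployed alongside each consumer.
CONFIG_TEXT = """
[development]
QuoteService = http://dev.example.com/quote

[production]
QuoteService = https://services.example.com/quote
"""


def load_endpoints(config_text, environment):
    """Return a mapping of service name to endpoint address for one environment."""
    parser = configparser.ConfigParser()
    parser.optionxform = str  # keep service names case-sensitive
    parser.read_string(config_text)
    return dict(parser[environment])


endpoints = load_endpoints(CONFIG_TEXT, "production")
quote_address = endpoints["QuoteService"]
```

The consumer code no longer contains an address, but every consumer still carries its own copy of the mapping, which is the scalability problem described above.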
Using a component that specializes in dynamically resolving service queries into endpoint addresses and invocation policies (that is, a service registry) provides the most flexible and maintainable solution to this problem. A service registry, in this case, contains all the information about service deployments, their locations, and the policies associated with invocations at each location.
In his IEEE Internet Computing article, "The social side of services," Steve Vinoski points out that the usefulness of a service registry increases with the number of services [Vinoski]. He writes:
The technical line of reasoning for reaching and managing this critical mass typically goes like this:
- For services to operate as a collective, they have to know about each other.
- For services to know about each other, they must either be hard-wired together or be able to dynamically find one another.
- Hard-wiring would be a bad thing, as it implies high coupling and potential difficulties in replacing one service implementation with another later on.
- To facilitate dynamic discovery, then, services need a place that they can advertise themselves and meet other services.
- Which is, of course, a registry!
The notion of the service registry was initially introduced by the first Web services architecture group, which defined the Universal Description, Discovery, and Integration (UDDI) registry as a "matchmaker" (broker) between service consumers and providers. UDDI's responsibility was viewed as providing a dynamic choice of service provider based on the functionality required by the consumer, a role similar to that of the Yellow Pages. Despite the push from multiple vendors and standards bodies, UDDI usage for service matchmaking never took off. The majority of today’s UDDI usage is limited to storing service WSDL files, which are used at consumer design time.
A more practical usage of a service registry is runtime lookup of the service endpoint based on the service name and the policies that are required by the service consumer. In this case, service definitions are available at consumer development time, and registry usage is limited to the runtime resolution of service endpoint addresses and dynamic binding. The architecture for this approach is illustrated in Figure 5.
Figure 5. Basic service registry architecture
The architecture shown in Figure 5 is similar to the way a Java Naming and Directory Interface (JNDI) implementation is used for dynamic binding of Enterprise JavaBeans (EJB) components.
Late binding of the service endpoint address lowers the location coupling. This solution eliminates the need to hard-code service endpoint addresses in the service consumer’s implementation or to store them in configuration files. In addition, the registry allows for centralized management of the service endpoint addresses and the associated invocation policies.
Typical service registry implementations support one of the following two possible endpoint address resolution and routing models:
Figure 6. Direct routing using service registry
Direct routing.
In this model, illustrated in Figure 6, the information required to query the registry resides in the consumer. This information includes the set of supported and required policies. Once the registry finds the matching services, the consumer decides which service to use and routes the requests directly to it.
Figure 7. Intermediary-based routing using service registry
Intermediary-based routing.
This model, depicted in Figure 7, relies on an intermediary to handle the routing. The service consumer doesn’t interact directly with the service. Instead, all service requests are directed to an intermediary that queries the registry (with consumer-specific information), decides which service to use, and dynamically routes the requests to it. This model resembles enterprise application integration (EAI) brokers that receive messages and, based on an internal registry, route them to the appropriate destination. Many Enterprise Service Bus (ESB) implementations provide support for the context-based routing intermediary model.
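The difference between the two models comes down to where the registry query and the selection decision live. The Python sketch below contrasts them using in-memory stand-ins; the registry contents, the `send` function (which only simulates a network call), and the consumer identifiers are all hypothetical.

```python
# Hypothetical registry content: service name -> deployed endpoints.
REGISTRY = {
    "QuoteService": [
        {"address": "http://host-a.example.com/quote", "policies": set()},
        {"address": "https://host-b.example.com/quote", "policies": {"encrypted"}},
    ]
}


def lookup(registry, service, required_policies):
    """Return addresses of endpoints that support all required policies."""
    required = set(required_policies)
    return [
        endpoint["address"]
        for endpoint in registry.get(service, [])
        if required <= endpoint["policies"]
    ]


def send(address, request):
    """Stand-in for a network call; reports which endpoint handled the request."""
    return f"{address} handled {request}"


def invoke_direct(registry, service, required_policies, request):
    """Direct routing: the consumer queries the registry and calls the service itself."""
    candidates = lookup(registry, service, required_policies)
    if not candidates:
        raise LookupError(f"no endpoint for {service}")
    return send(candidates[0], request)  # selection rule lives in the consumer


class Intermediary:
    """Intermediary-based routing: the consumer only ever talks to this component."""

    def __init__(self, registry, consumer_policies):
        self.registry = registry
        self.consumer_policies = consumer_policies  # consumer id -> required policies

    def invoke(self, consumer_id, service, request):
        required = self.consumer_policies.get(consumer_id, ())
        candidates = lookup(self.registry, service, required)
        if not candidates:
            raise LookupError(f"no endpoint for {service}")
        return send(candidates[0], request)  # selection rule is centralized here


direct_result = invoke_direct(REGISTRY, "QuoteService", {"encrypted"}, "getQuote")

broker = Intermediary(REGISTRY, {"billing-app": {"encrypted"}})
routed_result = broker.invoke("billing-app", "QuoteService", "getQuote")
```

In the direct model each consumer carries its own policy requirements and selection logic; in the intermediary model that knowledge is held once, at the cost of an extra hop.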
Table 1 compares these two approaches.
Table 1. Comparison of routing approaches.
| | Direct routing | Intermediary-based routing |
|---|---|---|
| Advantages | Provides the best invocation performance. Provides minimal infrastructure overhead, especially in the case where message-oriented middleware (MOM) is used as a transport. | Provides a centralized point for deciding how to select between potential services, relieving service consumers from having to store and process invocation information. |
| Disadvantages | Depending on the consumer implementation, changing the consumer policy file might still require restarting/rebuilding the consumer. | In the case of different SLAs required for different consumers/services, the intermediary has to be able to support the strictest SLA. Overall invocation performance can suffer due to the additional network hop. The intermediary represents an additional (sometimes single) point of failure. Introduction of an intermediary usually requires additional infrastructure. |
Repository or registry?
As we have seen, both the service repository and the service registry have a place in the overall enterprise SOA implementation. A service repository is the foundation of enterprise SOA governance, supporting centralized management of all service-related information, including design, implementation, and usage artifacts. A service registry is the foundation of service location virtualization, allowing for centralized control over location and invocation policies for all of the enterprise services. The key differences between service repositories and registries are:
- The repository contains all of the design and development artifacts of services that the design tools may need at design and build time. In contrast, the registry contains the subset of this information that is required for runtime binding.
- The service repository is optimized to store large numbers of assets and to enable a large user population to make ad hoc queries to find these assets. The main design requirements, in this case, are flexible classification and query support, scalability of the repository itself, and a user-friendly interface. The service registry, on the other hand, is optimized for runtime lookups of service endpoint addresses. The main design requirements are lookup performance and high availability.
- Access to the repository takes place within the enterprise boundaries. In contrast, the registry often needs to be accessed from within and from the outside of these boundaries.
So, in the end, it's not a choice between service repository and registry: both are needed as parts of a successful SOA implementation. Instead, it's a matter of determining where to use which of the two. It is important to keep these concepts distinct, since they serve very different purposes.
About the author
Boris Lublinsky has more than 25 years of experience in software engineering and technical architecture. For the past several years he has focused on enterprise architecture, SOA, and process management. Dr. Lublinsky is a technical speaker and author, with more than 40 technical publications in different magazines, including Avtomatika i telemechanica, IEEE Transactions on Automatic Control, Distributed Computing, Nuclear Instruments and Methods, Java Developer's Journal, XML Journal, Web Services Journal, JavaPro Journal, Enterprise Architect Journal, and EAI Journal. Currently Dr. Lublinsky works for a large insurance company, where his responsibilities include developing and maintaining SOA strategy and frameworks. You can reach Boris at blublinsky@hotmail.com.