Information modeling is a vast discipline. This article focuses on information modeling in the context of a service-oriented architecture (SOA) and, more specifically, on the forces driving the design of data models used in interface specifications.
The approach to interface modeling has matured over time, following the evolution of the way of thinking about process and system integration. When the IT landscape of an organization was a set of disconnected and independent silos, interfaces could be designed and managed as something private to a single application. The adoption of integration hubs to implement enterprise application integration (EAI) solutions increased the focus on the use of canonical models as a lingua franca between multiple systems. Then, with SOA, there has been the realization of the importance of separating integration from business processes and creating services that could be used across the organization in the context of multiple processes, as shown in Figure 1.
Figure 1. With SOA, data modeling crosses the boundaries of business units
However, like any other enterprise-wide initiative, many SOA programs suffer from the conflict between tactical, project-focused constraints and enterprise-wide strategic aims. Anybody experienced in solution delivery knows that it is not possible to build reusable enterprise services based on the requirements of a single project. There are established architecture methodologies, including The Open Group Architecture Framework (TOGAF) and service-oriented modeling and architecture (SOMA), which can be used to identify an enterprise-wide service portfolio; but they stay at the level of abstraction required to address enterprise planning concerns and do not go to the level of detail necessary for implementation. The amount of analysis required for a full, top-down, enterprise-wide service specification would undermine the very business agility that SOA enables.
Industry models can fill this gap by providing analysis and design artifacts that we can use as blueprints and standards for SOA projects. They are created by industry bodies or single vendors with the aim of combining expertise and industry best practice in a usable form to accelerate the delivery of business solutions. Industry models benefit from the experience of hundreds of organizations and their years of development.
Examples of well-established models include the IBM Banking Industry Models, the ACORD and Origo standards, and the TM Forum Information Framework (SID) for Telco (see Resources).
Industry models tend to be generic and extensible so that they can satisfy the needs of multiple organizations and be reused across the different business functions of a single enterprise. They are a natural fit to model interface payloads in a service-oriented architecture. Will this then deliver enterprise-wide reusable services, exposed with standardized data models?
The answer cannot ignore the following two fundamental constraints of SOA:
- Reuse creates dependencies.
Even when service consumers have a high degree of decoupling from the provider, they still rely on the fact that a provider must be there to understand their requests and process them within the constraints of a service level agreement. The provider is also intrinsically dependent on the consumers: the more consumers reuse the service, the more disruption a change in the provider can create. In summary, the level of reuse that is truly beneficial to an organization depends on the ability to manage the governance challenges that come with it.
Figure 2. Increasing reuse creates more dependencies in the architecture
- Generic interfaces are difficult to consume.
The more requirements a data structure is designed to satisfy, the larger and more complex it is bound to be. Consider the example of an XML schema used to model an "Account" in banking. To capture the information required to cover all the possible scenarios in which an account is used, the schema will include a high number of attributes and will leverage other normalized data types ("Product", "Party", and so on) in an extended tree structure. In a single scenario, however, such as a balance transfer, a service request might need only a very small number of basic Account attributes. If the service operation uses the generic, standard object model, the consumer will have to deal with a complex, sparsely populated data structure, as shown in Figure 3. This makes the service contract very loose: the schema defining the data exchanged in an operation ends up being a generic container that does not specify the exact list of attributes required as input and returned as output.
Figure 3. A single service operation might require only a few attributes of a “standard” data structure
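The contrast between a generic "Account" structure and an operation-specific contract can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class names, the attributes, and the `populated_fields` helper are hypothetical and are not taken from any industry model.

```python
from dataclasses import dataclass
from decimal import Decimal
from typing import Optional


@dataclass
class Party:
    party_id: Optional[str] = None
    legal_name: Optional[str] = None
    tax_residency: Optional[str] = None


@dataclass
class Product:
    product_code: Optional[str] = None
    interest_rate: Optional[Decimal] = None


@dataclass
class GenericAccount:
    """Hypothetical enterprise-wide 'Account': every attribute is optional
    so that the same structure can serve any scenario."""
    account_id: Optional[str] = None
    currency: Optional[str] = None
    balance: Optional[Decimal] = None
    owner: Optional[Party] = None
    product: Optional[Product] = None
    branch_code: Optional[str] = None
    opened_on: Optional[str] = None


@dataclass(frozen=True)
class BalanceTransferRequest:
    """Hypothetical operation-specific contract: only what a transfer
    needs, and every attribute is required."""
    source_account_id: str
    target_account_id: str
    amount: Decimal
    currency: str


def populated_fields(obj) -> int:
    """Count the top-level attributes that actually carry a value."""
    return sum(1 for value in vars(obj).values() if value is not None)
```

For example, `GenericAccount(account_id="ACC-1", currency="EUR")` is a valid instance that populates 2 of its 7 top-level attributes, and nothing in the type tells the consumer which ones a given operation expects. `BalanceTransferRequest`, by contrast, cannot be built without all 4 of its attributes.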
The more generic and context-independent service contracts are, the more reusable they become, but only to a certain extent. The more generic a contract is, the harder it becomes to understand and use; consequently, other parties will be less interested in reusing it. Figure 4 graphically expresses this point. Note: This diagram is meant to communicate an idea and is not based on quantitative measurements.
Figure 4. Interfaces that are not usable will not be reused
Industry models are not out-of-the-box solutions: their adoption requires effective customization and, above all, a clear governance model. One of the most fundamental architecture and governance decisions to make is the scope of customization and reuse of data types. Is the "customer" definition in one service interface going to be different in other services, or the same for the whole portfolio? Answering this kind of question determines who can change the data definitions and what the lifecycle of that change can be. The scope of reuse of data types must be a conscious decision; there is no one-size-fits-all solution. However, a few different patterns are emerging in the field.
Pattern 1: One object model per service interface
Using an independent data model for every service interface, as shown in Figure 5, assures the highest level of decoupling between services. Because the owner of the service is in complete control of the interface, interface governance is simplified. Yet the lack of standardization across multiple interfaces creates additional costs throughout the lifecycle of the service. In particular, a consumer has to understand different representations of the same business entities across multiple services and has to cope with all the related data transformations. Nevertheless, this strategy can be very effective for coarse-grained service operations that do not exchange large amounts of data with consumers.
Figure 5. In Pattern 1 every service interface exposes an independent object model
Pattern 2: One object model per business domain
With this approach, services are organized in domains, every domain sharing the same object model. Domain boundaries are determined by business competencies, for example "Sales Planning" or "Product Fulfillment", and can be designed using business architecture techniques, for example the IBM Component Business Model shown in Figure 6.
Figure 6. IBM Component Business Model can be used to define service domains
This solution tries to strike a balance between the different forces discussed here: data types are reused within a group of services that tend to be naturally cohesive because they are related to the same business area (or "domain"), while the data structure used by unrelated services can evolve independently, as depicted by Figure 7.
Figure 7. Different service domains will expose different object models
The downside is that, once domains are defined, changing their boundaries can be costly. Additionally, there might be areas of the service portfolio where it is difficult to identify a clean separation between domains; you may have to organize services according to technical domains (for example, all the services implemented by platform A versus those implemented by platform B). This is not ideal; it is an "IT-centric" model that couples the characteristics of the service landscape to its technical implementation. However, it is useful to have a degree of pragmatism; if the overall service governance is strictly based on technology platform ownership, you may have to use the same criteria to partition your service domains. Also, a clear anti-pattern, and a very common one, is to identify domains with implementation projects. This is typical of organizations in which the scope of reuse of object models is not a conscious enterprise architecture decision but an afterthought left to delivery projects. In this context, domains pop up, evolve, overlap, and intertwine according to tactical decisions. Every generic, business-aligned object model promised by a single initiative is bound to turn into the legacy model that the next project will try to decouple from.
Pattern 3: A single object model for the whole enterprise
Many organizations look at SOA as a way to expose IT assets as business-aligned, standards-based services that are easily reusable across different departments. It can then appear natural to use industry models to define a single common set of data structures shared across all the enterprise service interfaces (see Figure 8). The amount of model customization is kept to a minimum and its management is centralized. This approach simplifies the design of data models at the enterprise architecture level and promises to maximize reuse.
However, it makes the challenges I described at the beginning of this article particularly relevant. For example, defining effective service contracts will require the augmentation of generic data type constraints with context-specific validation, as I describe in the patent application "Message Validation in a Service Oriented Architecture" (see Resources).
Figure 8. All the enterprise service interfaces might leverage a common object model
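To make the idea of context-specific validation more concrete, the following minimal sketch layers per-operation rules on top of a generic payload. It is not the method described in the patent application; the operation names, the attribute paths, and the `REQUIRED_BY_OPERATION` table are assumptions made purely for illustration.

```python
from collections.abc import Mapping
from typing import Any

# Hypothetical per-operation rules: which attributes of the generic
# "Account" structure must be present for each service operation.
REQUIRED_BY_OPERATION: dict[str, list[str]] = {
    "transferBalance": ["account_id", "currency", "balance"],
    "openAccount": ["owner.party_id", "product.product_code", "currency"],
}


def lookup(payload: Mapping[str, Any], path: str) -> Any:
    """Follow a dotted path through nested mappings; return None if absent."""
    current: Any = payload
    for key in path.split("."):
        if not isinstance(current, Mapping) or key not in current:
            return None
        current = current[key]
    return current


def missing_attributes(operation: str, payload: Mapping[str, Any]) -> list[str]:
    """Return the required paths that are absent (or None) for this operation."""
    return [
        path
        for path in REQUIRED_BY_OPERATION[operation]
        if lookup(payload, path) is None
    ]
```

The generic structure stays shared across the enterprise, while each operation states explicitly which parts of it must be populated. The same payload can therefore be valid in one context and incomplete in another.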
This governance pattern is effective only in environments that have very strong governance processes in place and that do not have to cope with a high rate of change. Many organizations start with this strategy but then move to the service domain approach described in Pattern 2 when they experience the challenges involved in managing it.
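One way to compare the three patterns is to ask which service interfaces are exposed to a change in a shared data type. The sketch below is a deliberate simplification: it assumes that a change to an object model touches every interface that shares it, and the service names and the `SERVICE_DOMAIN` mapping are hypothetical.

```python
# Hypothetical service portfolio: service name -> business domain.
SERVICE_DOMAIN = {
    "CustomerLookup": "Sales Planning",
    "QuoteService": "Sales Planning",
    "OrderTracking": "Product Fulfillment",
    "ShipmentService": "Product Fulfillment",
}


def affected_services(changed_service: str, pattern: int) -> set[str]:
    """Services sharing the object model that is changed for `changed_service`.

    pattern 1: one object model per service interface
    pattern 2: one object model per business domain
    pattern 3: a single object model for the whole enterprise
    """
    if pattern == 1:
        return {changed_service}
    if pattern == 2:
        domain = SERVICE_DOMAIN[changed_service]
        return {name for name, d in SERVICE_DOMAIN.items() if d == domain}
    if pattern == 3:
        return set(SERVICE_DOMAIN)
    raise ValueError(f"unknown pattern: {pattern}")
```

With this portfolio, a change made for `QuoteService` touches one interface under Pattern 1, the two "Sales Planning" interfaces under Pattern 2, and all four under Pattern 3. The governance effort grows with the size of that set, which is why the scope of reuse has to be a conscious decision.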
Some good practices are valid most of the time. Regardless of the governance model you adopt, you will always have to manage the fact that the object model will change and that the same information might have different representations.
This section lists a few key guidelines:
- Maintain an enterprise-wide common logical data dictionary and map every physical object model to it.
Using a common data dictionary removes ambiguity in the definition of business requirements and enables the translation between different data representations, as shown in Figure 9.
Figure 9. Different customizations of a common logical model might be required to satisfy different requirements
It is important to recognize that this standardization provides benefits in its own right, without the need to use the same physical object model everywhere. The logical model provides a core vocabulary that rationalizes what a company means by generic terms like "policy", "account", "customer", and "product" across different business domains. This will have benefits well beyond IT. By the same token, the initiative will struggle to succeed if its sponsors come only from the IT community.
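The role of the logical dictionary as a pivot between physical models can be illustrated with a small sketch. The logical terms, the two physical models ("crm" and "billing"), and their field names are invented for this example; a real dictionary would also need to cover type and structure conversions, which are left out here.

```python
# Hypothetical logical dictionary: logical term -> field name used
# by each physical object model.
DATA_DICTIONARY = {
    "customer.identifier": {"crm": "custId", "billing": "CUSTOMER_NO"},
    "customer.family_name": {"crm": "lastName", "billing": "SURNAME"},
    "account.balance": {"crm": "balance", "billing": "BAL_AMT"},
}


def translate(record: dict, source: str, target: str) -> dict:
    """Re-key a flat record from one physical model to another.

    Fields that have no entry in the dictionary are dropped, because
    there is no agreed logical meaning to map them through.
    """
    result = {}
    for names in DATA_DICTIONARY.values():
        source_name = names.get(source)
        target_name = names.get(target)
        if source_name in record and target_name is not None:
            result[target_name] = record[source_name]
    return result
```

For example, `translate({"custId": "C-7", "lastName": "Rossi"}, "crm", "billing")` returns `{"CUSTOMER_NO": "C-7", "SURNAME": "Rossi"}`. Each physical model is mapped once to the logical terms, rather than once to every other physical model.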
- Decouple the service exposition model from the service implementation model.
With "exposition model", we refer to the object model used in the definition of service interfaces, while the "implementation model" is the one used in the service implementation (see Figure 10).
Figure 10. Exposition Model versus Implementation Model
The first model is public, while the second should be kept private so that consumers remain decoupled from the service implementation. Even if the two models might look identical initially, their ownership and lifecycles are usually different. The implementation model is owned by the service delivery team and goes through frequent cycles of change during development. The owner of the service exposition model, by contrast, usually controls a portfolio of services and approves any modification only after considering the impact across the portfolio and on the consumers. Changes to the exposition model will then happen less frequently. Keeping the two models separate creates a "buffer" that allows them to evolve at different speeds.
One might argue that this decoupling is an unnecessary overhead for components whose implementation doesn't need to understand and manipulate the majority of the payload exposed by their interface. Those components certainly exist; however, in SOA they are typically enterprise service bus (ESB) elements and should not be referred to as "service implementations". Flurry and Clark explore the details of this distinction in their article "The Enterprise Service Bus, re-examined" (see Resources).
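The separation between an exposition model and an implementation model can be sketched as two distinct types joined by a single mapping function. All names below are hypothetical, and the conversion assumes a currency with two decimal places.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass(frozen=True)
class AccountSummary:
    """Exposition model: public, changed only through portfolio governance."""
    account_id: str
    currency: str
    available_balance: Decimal


@dataclass
class AccountRecord:
    """Implementation model: private to the delivery team, free to change."""
    pk: int
    iban: str
    ccy: str
    ledger_balance_minor: int  # balance in minor units, for example cents
    pending_debits_minor: int
    shard: str


def to_exposition(record: AccountRecord) -> AccountSummary:
    """The only place where the two models meet."""
    available = record.ledger_balance_minor - record.pending_debits_minor
    return AccountSummary(
        account_id=record.iban,
        currency=record.ccy,
        # Assumes two decimal places; a real mapper would look up the
        # number of minor units for each currency.
        available_balance=Decimal(available) / Decimal(100),
    )
```

The delivery team can rename `shard`, split `AccountRecord` into several tables, or change how balances are stored, and consumers are unaffected as long as `to_exposition` keeps producing the same `AccountSummary`.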
- Create interface specifications that are complete and easy to
The interface specification is the only thing that you want consumers to know about your service.
It is outside the scope of this article to go into the details of the debate of strongly typed versus weakly typed interfaces; however, it is useful to point out the fairly common misconception that a change in the data exchanged by a service doesn't affect the consumer as long as the signature of the interface is not modified. Take, for example, the technique, commonly used in industry models, of including lists of key-value pairs in an interface payload with the aim of minimizing the impact of a customization. There are some circumstances where this is appropriate, but it is often a false economy because it "hides" details that can be vital for service consumers. What is the information held in the list? Has anything been added or removed? What are the keys? How are they spelled?
The more difficult it is for the consumer to get these details, the more costly accessing the service is going to be. For similar reasons, advanced, rarely used data and protocol constructs should be avoided. The more simply you speak, the more likely you are to be understood.
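The cost of hiding details in key-value pairs can be shown with a short sketch. The two customer types and the "loyaltyTier" key are assumptions made for illustration; the point is only what happens when a consumer guesses the wrong spelling.

```python
from dataclasses import dataclass, field


@dataclass
class CustomerLoose:
    """Extension list of key-value pairs: the signature never changes."""
    customer_id: str
    extensions: dict[str, str] = field(default_factory=dict)


@dataclass
class CustomerExplicit:
    """Explicit attributes: every change is visible in the contract."""
    customer_id: str
    loyalty_tier: str


loose = CustomerLoose("C-7", {"loyaltyTier": "GOLD"})

# A consumer that guesses the key spelling gets no error, only a
# silently missing value.
tier_guess = loose.extensions.get("loyalty_tier")

# With the explicit contract, the same mistake fails immediately.
try:
    CustomerExplicit("C-7", loyaltyTier="GOLD")
    misspelling_detected = False
except TypeError:
    misspelling_detected = True
```

Here `tier_guess` is `None` and `misspelling_detected` is `True`. The loose variant pushes the discovery of keys and spellings onto every consumer, whereas the explicit variant documents them in the interface itself.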
This article highlighted how industry models can help architects in the difficult task of defining enterprise service interfaces and reflected on the need for striking a balance between competing design principles, in particular "reuse" and "loose coupling". We learned how the type of SOA governance model embraced by an organization is in itself an important element to take into account for an effective service design.
Special thanks to Kim Clark and Scott Glen for their contributions and reviews of this article.
- The Open Group Architecture Framework (TOGAF): Learn more about this high level and holistic approach to architectures.
- Service-oriented modeling and architecture (developerWorks, 2004): Review highlights of service-oriented modeling and architecture and the key activities that you need for its analysis and design.
- The IBM Information Framework for the banking industry: Explore this blueprint for SOA and data warehousing solution success.
- Building Service Oriented Banking Solution with IBM Banking Industry Models: Read this IBM Redpaper that provides important positioning information and detailed tooling guidance.
- ACORD Standards Organization: Visit the home page.
- Anatomy of the ACORD TXLife XML standard (developerWorks, 2011): Learn about the structure of ACORD TXLife messages, the challenges implementers face, and the tools and techniques to use it successfully.
- Origo Standards: Find more about these standards that are designed to meet the requirements of specific business processes, supporting a range of retirement, investment and protection products.
- TM Forum Information Framework (SID): Read more about this industry-agreed definition for information that flows through the enterprise and between service providers and their business partners.
- IBM Component Business Model: Find more information on this component-based approach to strategic change.
- The Enterprise Service Bus, re-examined (developerWorks, 2011): Examine the architectural role of the enterprise service bus in a service-oriented architecture.
- Message Validation in a Service Oriented Architecture: Read this patent application to gain a better understanding of why defining effective service contracts will require the augmentation of generic data type constraints with context specific validation.