The need for introducing state to the service integration layer
Most enterprises serve several channels as entry points into their business. A bank offers ATMs, branches, Internet, and telephone banking as ways for customers to access their accounts. A retail chain has stores and a retail website on the Internet. A government agency offers its services over the Internet, through a call center, or at its public offices. In all of these cases, the offered information and related functionality should be consistent across all channels, which makes a case for establishing a common layer, the integration layer, that is located between various back-end systems and channel-specific front-end logic.
This integration layer contains a service bus with mediations. These mediations support non-semantic aspects of exposing existing provider logic. They also contain a service creation component in which existing providers are semantically composed into new services. For a detailed description of these concepts, I strongly recommend this article by Greg Flurry and Kim Clark. Figure 1 shows the structure of the integration layer as presented in that article.
Figure 1. The integration layer
In order to support requirements related to speed and agility, especially when focusing on customer-centric business demands, the need arises to push information closer to the channels that require it. In other words, the integration layer cannot rely entirely on existing back-end systems to offer information; it must be able to deliver important data very quickly, and that is often only possible if it stores this data locally. And that, in turn, requires the integration layer to be stateful. Given the customer-focused nature of many of the related business initiatives, what is most often stored locally, in the broadest sense, is customer-related information.
Hence, we need to support pushing (customer) data closer to its consumers whenever possible. And that comes with a number of challenges which we will analyze more closely below.
Different types of cache
There are two types of caching that play a role in the service integration layer. One is response caching. In a nutshell, response caching is the ability to predict the outcome of a service invocation without actually executing the service provider. This can be most useful in cases where the exact same request message is sent to a service many times, and where each of these requests predictably leads to the same response message. It makes sense to establish this type of caching in the service bus, and it does not really violate the statelessness of the bus. Most products that support hosting services and service bus mediations will offer this type of caching. Again, an environment will only benefit from response caching if service invocations are repeated frequently, and if you can guarantee the accuracy of the information stored in the cache.
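As a minimal sketch of the idea, the following memoizes service responses keyed by a canonicalized form of the request message, so that logically identical requests hit the cache rather than the provider. The class and field names are illustrative, not any particular product's API, and it deliberately omits the expiration logic a real response cache would need:

```python
import hashlib
import json

class ResponseCache:
    """Memoize service responses, keyed by the full request message."""

    def __init__(self, backend_service):
        self._backend = backend_service  # callable: request dict -> response
        self._cache = {}

    def _key(self, request):
        # Canonicalize the request so logically equal messages share a key.
        canonical = json.dumps(request, sort_keys=True)
        return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

    def invoke(self, request):
        key = self._key(request)
        if key not in self._cache:
            # Cache miss: execute the actual service provider once.
            self._cache[key] = self._backend(request)
        return self._cache[key]
```

Because the key is built from the sorted request fields, `{"op": "getBalance", "account": "123"}` and `{"account": "123", "op": "getBalance"}` resolve to the same cached response.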
A different type of caching is what I will call operational caching. Here, information is used not by bus mediations, but by the provider creation part of the integration layer. Figure 2, based on a figure from the ESB article mentioned earlier, shows the different components of the integration layer, plus the two caching components.
Figure 2. Caches in the integration layer
Various architectural decisions have to be made when deploying an operational cache:
- The structure of the data that is cached. Typically, the data in the cache will be accessible by simple key; that is, no elaborate query capabilities are supported. If sets of business objects are stored, they can be put in the cache as single entities or be broken down into smaller pieces that have to be reassembled as appropriate by the calling logic.
Related to this, indices might have to be built that allow navigating relationships between entities that are stored separately in the cache. For example, in a banking scenario, if customer and account data is cached, the relationship between customers and their accounts might have to be navigable in both directions. That means it should be possible to retrieve all the accounts for a given customer and to retrieve the customer owning a given account. Index tables can help with this, but in any case, this requirement often means that retrieving a piece of information takes two steps: an index lookup followed by the actual retrieval by key.
- The lifetime of data in the cache. The lifetime of the data is influenced by multiple criteria, for example:
- Are all updates to the underlying back-end data store going through the integration layer, or can updates occur otherwise? If data is only updated through the integration layer that hosts the cache, it is easier to keep the cached data accurate.
- Is it acceptable that data in the cache might not be completely synchronized with the back end? Based on consumer requirements, changes to a business object that originate elsewhere in the system might not have to be reflected in the integration layer cache right away.
Different caching technologies can offer different types of lifetime support for cached content. The content could be removed from the cache based on last usage, based on when it was loaded, or be removed explicitly by a client application.
- The approach for loading data into the cache. There are two main patterns for loading data into a cache on demand, plus an offline option for static content:
- Side cache: Here, the client logic using the cache is also responsible for loading content. The logic will look for data, and, if it is not found, will retrieve it from the back end and store it in the cache for future use.
- Inline cache: In this case, the cache contains logic that is capable of retrieving data from the appropriate back end. The client logic attempts to retrieve data from the cache, and, if it is not there, the cache itself will load it.
- Client loader: Content that is very static can also be loaded into the cache offline. For example, cross-reference tables that only get updated occasionally can be pre-loaded into the cache by separate applications, so that no data has to be loaded into the cache during normal production use.
- Access modes and update scenarios. A cache is not a system of record. And it is (usually) not a transactional resource; it cannot participate in a distributed transaction. This means that if the content of the cache is updated, the related update to the back end – to the system of record – runs independently, and potential exceptions have to be handled manually. For example, if an update to a remote back end times out, you don’t know if the related transaction completed or not. This means that the content of the cache might have to be manually synchronized with the back end, or be removed altogether.
Moreover, if multiple concurrent threads potentially access the same information in the cache, it has to be defined whether any type of lock can be obtained on cache content, or if and how optimistic locking conflicts would be resolved.
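The index-table idea described above, with key-based access and bidirectional navigation between customers and accounts, can be sketched as follows. The entry keys, the index structures, and the banking entities are illustrative assumptions; real data-grid products expose this differently:

```python
class OperationalCache:
    """Key-value cache with index tables for navigating relationships.

    Entries are stored under simple keys such as ('customer', id) or
    ('account', id); no query capabilities beyond key lookup exist.
    """

    def __init__(self):
        self._entries = {}
        self._accounts_by_customer = {}  # index: customer id -> set of account ids
        self._customer_by_account = {}   # index: account id -> customer id

    def put_customer(self, customer_id, data):
        self._entries[("customer", customer_id)] = data
        self._accounts_by_customer.setdefault(customer_id, set())

    def put_account(self, account_id, customer_id, data):
        self._entries[("account", account_id)] = data
        # Maintain both index directions when an account is cached.
        self._accounts_by_customer.setdefault(customer_id, set()).add(account_id)
        self._customer_by_account[account_id] = customer_id

    def accounts_of(self, customer_id):
        # Two steps: consult the index, then fetch each entry by its key.
        return [self._entries[("account", a)]
                for a in self._accounts_by_customer.get(customer_id, ())]

    def owner_of(self, account_id):
        customer_id = self._customer_by_account.get(account_id)
        return self._entries.get(("customer", customer_id))
```

Note how `accounts_of` makes the two-step retrieval explicit: an index lookup followed by the actual key-based reads.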
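The side-cache and inline-cache loading patterns described above differ only in where the miss-handling logic lives. A minimal sketch of both, with hypothetical names and a stand-in `loader` for the back-end call:

```python
class SideCache:
    """Side cache: a passive store; the *client* handles misses."""
    def __init__(self):
        self.store = {}

def read_with_side_cache(cache, key, load_from_backend):
    value = cache.store.get(key)
    if value is None:
        value = load_from_backend(key)  # client logic loads from the back end...
        cache.store[key] = value        # ...and populates the cache itself
    return value

class InlineCache:
    """Inline cache: the *cache* knows how to reach the back end."""
    def __init__(self, loader):
        self._loader = loader
        self._store = {}

    def get(self, key):
        if key not in self._store:
            # The cache loads missing content transparently; the client
            # only ever talks to the cache.
            self._store[key] = self._loader(key)
        return self._store[key]
```

With a side cache, every consumer must reimplement (or share) the miss-handling logic; with an inline cache, that logic is written once, behind the cache interface.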
I don’t believe that any of these considerations are specific to a service integration layer, or to the concept of SOA. The same set of questions will most likely have to be answered in any scenario where caching is assessed for better performance of a solution. Again, what makes this particularly interesting, and the context in which I encounter it more and more often, is the desire to move data closer to its ultimate consumer, and the service integration layer is one of the places where that can happen. But keep in mind that based on your specific use case and requirements, neither response caching nor operational caching might be possible or beneficial.
Another aspect that is increasingly represented as a separate component in the integration layer is a decision-making part: a rules engine. As Greg Flurry and Kim Clark point out in their article, the integration layer could well include business logic, or rather, logic that touches on semantic aspects and which is owned and driven by business goals. This type of logic lives in the provider creation part of the integration layer.
Much of this type of logic is related to how existing providers are composed in order to create a new service provider, and within these compositions, business decisions have to be made. Traditionally, the decisions that are part of the “business integration” logic were implemented within the provider logic. But in cases where the criteria leading to a decision outcome change frequently, or at least regularly, it makes sense to delegate the decision-making logic to a business rules engine, which uses business rules that can be developed and maintained even by business people, without the need to engage IT.
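To make the separation concrete, here is a minimal sketch of externalized decision logic. The rules live in a data structure that can be changed without touching the composition code that calls `decide()`; the rule conditions, outcome names, and approval scenario are all hypothetical, and a real rules engine would offer far richer authoring and evaluation:

```python
# A data-driven rule set: each rule pairs a condition with an outcome.
# Rules are evaluated in order; the first match wins.
RULES = [
    (lambda c: c["segment"] == "premium" and c["amount"] <= 10000, "auto_approve"),
    (lambda c: c["amount"] <= 1000,                                "auto_approve"),
    (lambda c: c["amount"] > 50000,                                "escalate"),
]

def decide(case, rules=RULES, default="manual_review"):
    """Evaluate the externalized rules against a case; fall back to a default."""
    for condition, outcome in rules:
        if condition(case):
            return outcome
    return default
```

The composition logic only ever calls `decide()`; when the business changes its approval thresholds, only the `RULES` table changes, which mirrors the division of labor between IT and business users that a rules engine provides.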
Again, this logic belongs in the provider creation part of the integration layer and is thus not part of the service bus. Figure 3 shows the additional component we have added.
Figure 3. Adding a rules engine to the integration layer
Of course, this is not to say that the rules engine only exists in the integration layer. It can be leveraged elsewhere in the overall architecture. Here, it handles (business) decision making aspects of integration as a means of promoting additional separation of concerns as a core architectural principle.
Staying within the earlier theme of increased business focus on the customer, enterprises not only want to have customer information readily available to all of their respective channels, they also want to be able to react and adapt quickly to customer behavior, market needs, and rapidly changing business goals.
To an IT organization, this means introducing the ability to consume relevant business events, correlating them as appropriate, and triggering resulting actions. The event sources span the entire IT landscape, from back-end systems all the way to front-end systems.
The one place that is aware of both the front-end and channel systems as well as the core back-end environments is the integration layer. Thus, it often makes sense to add the event handling component to that layer.
Still, you could argue that in a well structured architecture, such an event handling component does not belong in the integration layer. That argument often holds; however, I have seen several concrete cases where the relationship between events and services is so close that positioning both in the same layer (the integration layer) makes sense.
To my knowledge, the terminology describing the relationship (and possibly overlap) between service orientation and event processing is not very well defined. Each of them represents a distinct software engineering discipline with its own terms and concepts. I won’t attempt to solve this here. My discussion is limited to the handling of business events in the integration layer, as a separate architectural component besides the service-related components that exist in that layer. To be sure, both interact with each other: events trigger services, and services trigger events.
Moreover, some of the elements of a service bus can be reused when dealing with events: event data might have to be transformed to support canonical models, event consumers and producers might only communicate via specific protocols, or the service bus might handle routing event data to and from the right places. Thus, putting the service bus in front of the event processing component is often efficient, which is yet another argument for placing the event handling into the integration layer (which, of course, contains the service bus).
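One of those reused mediation capabilities, transforming event data into a canonical model before it reaches the event processor, can be sketched as follows. The source field names (`evtType`, `custNo`) and the canonical shape are assumptions for the sketch, standing in for whatever source-specific and canonical models an actual environment defines:

```python
def to_canonical(raw_event):
    """Map a source-specific payload onto an assumed canonical event model."""
    return {
        "type": raw_event["evtType"].lower(),
        "customer_id": str(raw_event["custNo"]),
        "payload": raw_event.get("data", {}),
    }

def mediate(raw_event, event_processor):
    """Bus mediation in front of the event processor: transform, then forward."""
    event_processor(to_canonical(raw_event))
```

The event processor then only ever sees canonical events, regardless of which channel or back-end system produced them, which is exactly the decoupling the service bus already provides for service messages.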
The relationship between business event handling, decision making, and caching
Now that we have added three new components to the integration layer, namely the cache, the rules engine, and the event handler, a question to consider is whether and how these new components interact with each other.
- Caching and events
The cache might want to publish interesting events; for example, when content is added to or removed from the cache, or when it changes. However, in my opinion, those are not really business events, and monitoring the state of the cache through events is usually not warranted.
But utilizing the operational cache makes a lot of sense for the event handler, especially considering cases in which events have to be correlated in order to trigger the right action. Data stemming from multiple events, possibly received over the course of a long time period, must be stored so that it can be put into a common context. Moreover, the data contained in the actual event might not be sufficient to detect the appropriate correlation; additional data might be required that either already exists in the cache or is stored there for later use.
- Caching and rules
The relationship between caching and the rules component is very similar. Since the caching component is normally a passive element (that is, it is asked to look for data, or it is told to store data), it will rarely, if ever, delegate any decision making to a business rules processing engine.
However, if a rules engine has to rely on state data to make certain decisions, then this state data might well be contained in the cache.
- Rules and events
This combination of components makes a lot of sense. Event correlation, and the need to trigger appropriate action, is all about decision making. The decisions are based not on one set of input data that is available all at once, but instead on a set of events that happen independently from each other.
The most common use case that I find in these types of environments is that both the event handler and the decision engine depend on the cache to store data they need, and that the event handler delegates at least some of its processing to the rules engine, especially in cases where the rules are created and maintained by business personnel.
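That most common use case, an event handler that parks partial event state in the operational cache and delegates the final decision to a rule, can be sketched as follows. The correlation key, the fraud scenario, and all names are illustrative assumptions:

```python
class Correlator:
    """Event handler that stores partial event state in a cache and
    delegates the correlated decision to a rule function."""

    def __init__(self, cache, decide):
        self._cache = cache    # dict-like operational cache
        self._decide = decide  # rule: list of correlated events -> action or None

    def on_event(self, event):
        # Correlate events per customer; earlier events may have arrived
        # long ago, so they live in the cache rather than in memory here.
        key = ("events", event["customer_id"])
        seen = self._cache.setdefault(key, [])
        seen.append(event)
        # Delegate the decision itself to the (business-maintained) rule.
        return self._decide(seen)

def two_large_withdrawals(events):
    """Illustrative rule: alert once two large withdrawals are correlated."""
    large = [e for e in events if e["type"] == "withdrawal" and e["amount"] > 1000]
    return "alert_fraud_team" if len(large) >= 2 else None
```

The first large withdrawal triggers no action; only when the second one is correlated against the cached first event does the rule return an action, which is the interplay between event handler, cache, and rules engine described above.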
The need for an integration layer that includes both a service bus with mediations and a service creation layer that enables creating new services from existing providers is increasing. And, in many cases, there is an additional need for decision-making and event correlation, paired with the realization that relevant state must be cached in the integration layer to be available more quickly and more easily.
Hence, we have enhanced the architectural model by adding a caching component, a decision making component, and a business event handling component. Enterprises have begun to build systems that include these new components, and I expect this trend to become even more commonplace in the future.