Achieving complex event processing with Active Correlation Technology

Rule your domains with rules to trigger automated processes

Active Correlation Technology (ACT) rules can turn low-level events into high-level (complex) events that help you spot business opportunities or better understand problems. ACT's complex event processing can also free up personnel by triggering automated processes.

Ana Biazetti (abiazett@us.ibm.com), Event Management and Correlation Architecture, IBM

Ana Biazetti is a member of the Event Correlation and Automation Architecture team, where she serves as Chief Designer for the ACT component. She has worked for IBM since 1992, with various assignments in network management, telecommunications management, and event management. Ana holds a B.S. degree in Software Engineering, and her interests include event management, event correlation, complex event processing, security, and Web services.



Kim Gajda (kgajda@us.ibm.com), Chief Programmer, IBM

Kim Gajda is the Chief Programmer for the ACT component. She has worked for IBM since 1989, with various assignments in retail store solutions application development, security application development (including identity manager and risk manager), and event correlation. Kim received an M.S. degree in Information Systems Design in 1989, and her interests include complex event processing, event correlation, and database application design and development.



15 November 2005

Introduction

Today's diverse interconnected e-business components typically come with a lot of event information generated by touchpoints through log files or event emitters. Correlating event information to derive symptoms, or higher level business conclusions, is fundamental to identifying critical situations that need to be corrected. This article describes the IBM Active Correlation Technology (ACT), which provides built-in patterns that support event correlation and complex event processing.

ACT is a technology in development at IBM, and it will appear in our products in the future. At this point, ACT is not available to be embedded into your own applications. Still, understanding the benefits this new technology provides will help you see the direction in which autonomic computing technology is headed. Read this article for a sneak peek at the types of functions you'll be seeing in the future. As always, we like hearing what you think; chime in with your thoughts on the autonomic computing discussion forum in the Resources section of the article.

The article provides a brief overview of ACT, which is a set of modular event correlation components that deliver complex event processing functions, such as:

  • Aggregating and filtering events
  • Correlating and associating events for problem determination and detection of business situations
  • Triggering automatic actions in response to events that cause situations
  • Associating events with business information

ACT includes support for events that conform to the Common Base Event specification and other messaging formats. ACT is a technology that is being embedded in different IBM products and offerings.

Benefits

Any customer with a data center, trying to manage a complex IT infrastructure, can benefit from a solution or product that embeds ACT. By using ACT to detect symptoms, customers can:

  • Reduce the number of events their operation staff needs to handle by filtering out spurious events, removing duplicates, and summarizing a collection of events
  • Correlate lower-level events into a meaningful diagnosed symptom that provides higher level or better information for problem determination
  • Gain the ability to take autonomic actions and solve the original problem using corrective actions

Architecture

ACT is a software development kit that includes code libraries, APIs, plug-ins, and documentation to help you embed correlation technology into your application solutions. ACT is not a product. It provides cross-domain correlation through a run-time environment and tools that help you develop and execute rules for correlating and filtering events across many different environments.

ACT can provide complex event processing to derive high-level, or complex, events from the analysis, correlation, and summarization of low-level events in event-driven systems. The complex events are suitable for notifying people of business opportunities, or problems, in easy-to-understand terms, or for triggering automated processes.

Figure 1 shows the overall architecture of ACT.

Figure 1. ACT architecture

You can create correlation rules, specified in terms of the supported patterns, by using the ACT rule builder tool. The rules are then loaded into the ACT engine through the ACT run-time environment. As incoming events arrive at different times, each event is matched against the patterns, and one or more rules are triggered. When a rule is triggered (for example, by a timeout or by the receipt of an event), a response occurs that includes the execution of actions.

The ACT subcomponents are described below.

Rule builder
A graphical user interface (GUI) that lets you write correlation rules in the ACT rule language. Figure 2 shows the ACT rule builder, which lets you easily define a rule set consisting of rule blocks and rules.

The input to the ACT rule builder includes event definition information that's used to select events to be processed by the ACT rules. The ACT rule builder also lets you incorporate snippets of code and specific actions into rules.

The output of the ACT rule builder is a rule set, in an XML document, which defines the rules that are the basis for event correlation. Within the rule definitions in the XML document, actions are defined to indicate what is to be done as a result of correlation activity.
Figure 2. Rule builder overview
Rule language
An XML-based language that lets users specify rules based on common correlation patterns. Rules created using the ACT rule language can be deployed to ACT run-time environments.
Run-time environment
The subcomponent or application that embeds the ACT engine and, typically, the ACT compiler. It enables the ACT component to work properly in an application solution. Different applications use ACT in different environments, and listen for and receive events in different circumstances. For example, a solution may drive the ACT engine with events from queues, event logs, or a Java™ Message Service (JMS) subscription.
Engine
Provides the core service of receiving events and processing them against loaded rules or correlation patterns. The engine is embedded as part of an application (run-time environment). More than one instance of an ACT engine can be embedded and controlled independently. The ACT engine depends, indirectly, on the ACT compiler, and supports all rule patterns defined in the ACT rule language.

As shown in Figure 3, within the ACT engine the correlation rules determine the specific event patterns to be detected, and the time during which to look for the event patterns. As events are processed, they are selected to participate in rules according to selection criteria. This is represented in the events selector block of a rule. The events fill in the patterns specified in the pattern condition. When a pattern is completed the rule is triggered, and actions are taken in response.

ACT provides basic actions, and you can also include customized actions.
Figure 3. ACT engine
Compiler
Compiles ACT language files into a data structure that can be understood by the ACT engine. The compiler usually resides in the same run-time environment as the ACT engine, but it can also reside in a different environment or system, where its serialized output can be transferred to the system on which the engine runs.

ACT rule patterns

ACT provides built-in patterns that support event correlation and complex event processing. ACT supports:

  • The correlation of events in both stateless and stateful (or temporal) modes
  • The specification of rules that permit input events that sequentially arrive at different points in time to be correlated according to well-defined patterns
  • A conclusion with one or more responses to be generated based on the outcome of the correlation

The simplest type of correlation is a stateless rule, also called a filter or match, where a single event that passes a filter condition immediately generates a response. On the other hand, the stateful rules provide correlation across multiple events occurring within a given time interval.

The set of base correlation patterns supported by ACT has been proven, by experience over the years, to cover most of the event correlation problems that IBM customers need to address. These are the built-in capabilities that must be available in the event correlation environment, and that provide the basic building blocks on top of which more complex correlation can be created (for instance, by using composite rules). The base patterns are designed to be simple and abstract for the user, and to deliver high-performance processing at run time.

Filter pattern

The filter pattern checks each event to determine if it matches an event selector. If a match is found (the expression is true), an action may be taken as specified in the rule. Actions taken might include filtering events in and out of the event input stream.

Figure 4 shows a flow of events that occur during a time period. The event selector box indicates the event type that triggers the filter rule. When the rule is triggered by the event, the onDetection response is executed.

Figure 4. Filter pattern

For example, you could use the filter pattern for a rule that pages an Administrator if a ServerStatus event indicates a serverLoad greater than 95%.
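The paging example can be sketched in a few lines of Python. This is only an illustration of the stateless filter idea, not ACT's rule language or API: the FilterRule class, the dictionary event layout, and the host field are all assumptions made for the sketch.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class FilterRule:
    """Stateless rule: each event is checked on its own against a selector."""
    selector: Callable[[dict], bool]      # hypothetical stand-in for an event selector
    on_detection: Callable[[dict], None]  # response run for every matching event

    def process(self, event: dict) -> bool:
        """Run the response if the event matches; report whether it matched."""
        if self.selector(event):
            self.on_detection(event)
            return True
        return False

pages = []  # stands in for a real paging system

rule = FilterRule(
    selector=lambda e: e.get("type") == "ServerStatus" and e.get("serverLoad", 0) > 95,
    on_detection=lambda e: pages.append(
        f"Page Administrator: {e['host']} load {e['serverLoad']}%"
    ),
)

for event in [
    {"type": "ServerStatus", "host": "app01", "serverLoad": 72},
    {"type": "ServerStatus", "host": "app02", "serverLoad": 97},
    {"type": "DiskStatus", "host": "app02", "serverLoad": 99},
]:
    rule.process(event)

print(pages)
```

Only the second event both has the right type and exceeds the load limit, so it is the only one that produces a page. No state is carried from one event to the next, which is what makes the rule stateless.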

Collection pattern

The collection pattern is an example of stateful correlation. In this pattern, events are collected over a time period. At the end of the time period, the events are available for use within the onTimeWindowComplete response action.

Figure 5. Collection pattern

A common use for this pattern is to collect events matching a specific event selector and to summarize them into a single event containing the total count of events, including characteristic information about the events summarized.
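A rough Python sketch of that summarizing use follows. It is not ACT code: the CollectionRule class, the explicit now timestamps (in seconds), and the choice to open the time window at the first matching event are assumptions made to keep the example small and deterministic.

```python
from collections import Counter

class CollectionRule:
    """Stateful rule: gathers matching events until the time window ends."""

    def __init__(self, selector, window_seconds, on_time_window_complete):
        self.selector = selector
        self.window_seconds = window_seconds
        self.on_time_window_complete = on_time_window_complete
        self.window_start = None
        self.collected = []

    def process(self, event, now):
        """Collect a matching event that arrives at time `now`."""
        self.tick(now)
        if not self.selector(event):
            return
        if self.window_start is None:
            self.window_start = now  # assumption: the first match opens the window
        self.collected.append(event)

    def tick(self, now):
        """Close the window and hand over the events if it has expired."""
        if self.window_start is not None and now - self.window_start >= self.window_seconds:
            events, self.collected, self.window_start = self.collected, [], None
            self.on_time_window_complete(events)

summaries = []

def summarize(events):
    # One summary event replaces the whole collection.
    summaries.append({
        "type": "LoginFailureSummary",
        "count": len(events),
        "by_host": dict(Counter(e["host"] for e in events)),
    })

rule = CollectionRule(lambda e: e["type"] == "LoginFailure", 60, summarize)
rule.process({"type": "LoginFailure", "host": "web1"}, now=0)
rule.process({"type": "LoginFailure", "host": "web2"}, now=10)
rule.process({"type": "Heartbeat", "host": "web1"}, now=20)
rule.process({"type": "LoginFailure", "host": "web1"}, now=45)
rule.tick(now=60)  # the window ends; the summary is produced
print(summaries)
```

The three matching events are reduced to a single summary carrying the total count and a per-host breakdown, while the unrelated Heartbeat event is ignored.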

Duplicate pattern

The duplicate pattern is a special form of the collection pattern that's used to detect duplicate events. The first event received is processed by the engine in a normal fashion with the detection response; it is saved, and passed to the other rules. All subsequent matching events that occur during a specific time window are processed by the onNextEvent response. They are not saved in the rule, and they only increment the duplicate count. Then an implicit exit rule set is performed, causing the events not to be processed further. Further actions may be taken at the end of the time window in the onTimeWindowComplete response.
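The duplicate-handling behavior might be sketched as follows. The DuplicateRule class, the key function that decides when two events count as duplicates, and the explicit timestamps are assumptions for illustration; the end-of-window response is left out to keep the sketch short.

```python
class DuplicateRule:
    """Lets the first event through and only counts its duplicates."""

    def __init__(self, key, window_seconds):
        self.key = key                  # hypothetical: returns an event's identity
        self.window_seconds = window_seconds
        self.open = {}                  # identity -> [window_start, duplicate_count]

    def process(self, event, now):
        """Return True if the event should continue on to other rules."""
        k = self.key(event)
        entry = self.open.get(k)
        if entry is not None and now - entry[0] < self.window_seconds:
            entry[1] += 1               # like onNextEvent: count it, do not save it
            return False                # the duplicate is not processed further
        self.open[k] = [now, 0]         # first event, or the earlier window expired
        return True

    def duplicate_count(self, event):
        return self.open[self.key(event)][1]

rule = DuplicateRule(key=lambda e: (e["host"], e["msg"]), window_seconds=300)
stream = [
    (0,   {"host": "db1", "msg": "disk full"}),
    (30,  {"host": "db1", "msg": "disk full"}),
    (90,  {"host": "db2", "msg": "disk full"}),
    (120, {"host": "db1", "msg": "disk full"}),
]
forwarded = [event for t, event in stream if rule.process(event, now=t)]
print(len(forwarded), rule.duplicate_count(stream[0][1]))
```

Of the four events, only the first one from each host is forwarded; the two repeats from db1 are absorbed and show up only as a duplicate count.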

Computation pattern

The computation pattern is another specialized form of the collection pattern. In this pattern, a computation function is executed every time an event that matches the event selector is received. For example, if ACT is processing customer order events, each time an event is received the total value of the order is added to the total value of all orders that occurred during the time window specified. Actions may be taken at the end of the time window in the onTimeWindowComplete response.
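The customer order example could look like the sketch below. As before, this is a hypothetical Python illustration rather than ACT's implementation: the ComputationRule class, the orderTotal field, and the window that opens on the first matching event are assumptions.

```python
class ComputationRule:
    """Keeps a running result instead of storing every collected event."""

    def __init__(self, selector, compute, initial, window_seconds, on_time_window_complete):
        self.selector = selector
        self.compute = compute          # folds one event into the running value
        self.initial = initial
        self.window_seconds = window_seconds
        self.on_time_window_complete = on_time_window_complete
        self.value = initial
        self.window_start = None

    def process(self, event, now):
        self.tick(now)
        if not self.selector(event):
            return
        if self.window_start is None:
            self.window_start = now     # assumption: the first match opens the window
        self.value = self.compute(self.value, event)

    def tick(self, now):
        """Report the result and reset once the window has expired."""
        if self.window_start is not None and now - self.window_start >= self.window_seconds:
            result = self.value
            self.value, self.window_start = self.initial, None
            self.on_time_window_complete(result)

hourly_totals = []
rule = ComputationRule(
    selector=lambda e: e["type"] == "CustomerOrder",
    compute=lambda total, e: total + e["orderTotal"],
    initial=0,
    window_seconds=3600,
    on_time_window_complete=hourly_totals.append,
)

rule.process({"type": "CustomerOrder", "orderTotal": 120}, now=0)
rule.process({"type": "CustomerOrder", "orderTotal": 80}, now=600)
rule.process({"type": "CustomerOrder", "orderTotal": 50}, now=1800)
rule.process({"type": "CustomerOrder", "orderTotal": 200}, now=4000)  # starts a new window
print(hourly_totals, rule.value)
```

The first three orders fall in one window and are reported as a single total; the fourth arrives after the window has ended and begins a new running total.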

Threshold pattern

The threshold pattern is a stateful pattern. As events are received, a threshold is evaluated based either on event count or on a computation across all collected events. The threshold is evaluated in a time window, which can be fixed or sliding.

Figure 6. Threshold pattern

An example of using the threshold rule is to execute an action to check the status of a router if more than four "server unreachable" events from a subnetwork happen in a sliding window of 30 seconds.
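A count-based threshold over a sliding window can be sketched like this. The SlidingCountThreshold class and the event fields are hypothetical, and clearing the window after the rule fires is an assumption of the sketch, not documented ACT behavior.

```python
from collections import deque

class SlidingCountThreshold:
    """Fires when more than `threshold` matching events fall inside the window."""

    def __init__(self, selector, threshold, window_seconds, on_threshold):
        self.selector = selector
        self.threshold = threshold
        self.window_seconds = window_seconds
        self.on_threshold = on_threshold
        self.times = deque()            # arrival times still inside the window

    def process(self, event, now):
        if not self.selector(event):
            return False
        self.times.append(now)
        # Slide the window: drop arrivals that are now too old.
        while self.times and now - self.times[0] >= self.window_seconds:
            self.times.popleft()
        if len(self.times) > self.threshold:
            self.times.clear()          # assumption: start counting afresh after firing
            self.on_threshold(event, now)
            return True
        return False

router_checks = []
rule = SlidingCountThreshold(
    selector=lambda e: e["msg"] == "server unreachable" and e["subnet"] == "10.1.2.0/24",
    threshold=4,
    window_seconds=30,
    on_threshold=lambda e, now: router_checks.append(now),
)

for t in [0, 5, 12, 20, 40, 45, 50, 55, 58]:
    rule.process({"msg": "server unreachable", "subnet": "10.1.2.0/24"}, now=t)

print(router_checks)
```

The first four events never exceed the limit, and by the time later events arrive the earliest ones have slid out of the 30-second window. The router check runs only at time 58, when five events (40, 45, 50, 55, 58) sit inside the window together.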

Sequence pattern

The sequence pattern is used to detect the presence or absence of an ordered or unordered sequence of events within a period of time.

Figure 7. Sequence pattern

Sequence detection is useful when you want to act on the root cause of a set of events. For example, in an IT environment an administrator would want to reset the DB2® heapsize if both the WebSphere Application Server Resource Allocation Exception and the DB2 ERROR SQL0954C "Not enough heap to process statement" were encountered. A rule is written to watch for those two events. When they are encountered within the specified time interval, an action is executed that increases the DB2 heapsize and restarts the database manager.

To understand the process for identifying a treatable symptom, or cause of a problem, and its flow in the Monitor, Analyze, Plan, Execute (MAPE) loop, let's use another example. The following events are emitted by the WebSphere Application Server (Application Server) and a Cisco router. An application running inside an Application Server instance uses a table stored in DB2. The communication between the application on Application Server and the DB2 server is through a Cisco router. At some point during operation, the link between the router and the DB2 server goes down. After that, the application on Application Server tries to communicate with DB2 and gets a failure. The symptoms identified are:

WAS_CONNECT_CAPTURE
Identifies that the Application Server application cannot connect to the DB2 server
CISCO_AVAIL_CAPTURE
Identifies that a link on the router is unavailable
WAS_CISCO_CAPTURE
Identifies that both of the previous symptoms have occurred (our sequence rule)
FIX_SUCCESS_CAPTURE
Identifies that resolution has occurred successfully
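
The way the two lower-level symptoms combine into the third can be sketched in Python. This is an illustration only: the SequenceRule class is hypothetical, the symptom names are reused here as plain event-type labels, the 120-second window is an arbitrary choice, and only the presence of a sequence is modeled, not its absence.

```python
class SequenceRule:
    """Detects a set of distinct event types arriving within one time window."""

    def __init__(self, expected, window_seconds, ordered, on_detection):
        self.expected = list(expected)  # assumed to contain distinct types
        self.window_seconds = window_seconds
        self.ordered = ordered
        self.on_detection = on_detection
        self._reset()

    def _reset(self):
        self.seen = []
        self.window_start = None

    def process(self, event_type, now):
        # Abandon a partial match whose time window has run out.
        if self.window_start is not None and now - self.window_start > self.window_seconds:
            self._reset()
        if event_type not in self.expected:
            return False
        if self.ordered:
            if event_type != self.expected[len(self.seen)]:
                return False            # not the next event in the required order
        elif event_type in self.seen:
            return False                # already have this one
        if self.window_start is None:
            self.window_start = now
        self.seen.append(event_type)
        if len(self.seen) == len(self.expected):
            self._reset()
            self.on_detection()
            return True
        return False

symptoms = []
rule = SequenceRule(
    expected=["WAS_CONNECT_CAPTURE", "CISCO_AVAIL_CAPTURE"],
    window_seconds=120,
    ordered=False,
    on_detection=lambda: symptoms.append("WAS_CISCO_CAPTURE"),
)

rule.process("CISCO_AVAIL_CAPTURE", now=0)
rule.process("WAS_CONNECT_CAPTURE", now=15)   # both seen within 120 s: detected
rule.process("WAS_CONNECT_CAPTURE", now=500)
rule.process("CISCO_AVAIL_CAPTURE", now=900)  # too far apart: no detection
print(symptoms)
```

Because the rule is unordered, it does not matter that the router symptom arrives before the Application Server symptom. The second pair is 400 seconds apart, so the partial match is discarded and nothing is detected.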

Timer pattern

The timer pattern provides a simple timer that goes off at the end of the timeWindow, at which point the onTimeWindowComplete response is executed. The timer always repeats unless the repeat attribute is set to false using the onTimeWindowComplete response. The timer pattern can be used to implement cleanup rules. For example, every 30 minutes, execute an action that cleans up harmless and informational events that have been open longer than 48 hours.
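
The cleanup example can be sketched as follows. The TimerRule class, the tick method driven by explicit timestamps in seconds, and the layout of the open events are all assumptions made for illustration; they are not part of ACT.

```python
class TimerRule:
    """Runs a response each time the time window completes."""

    def __init__(self, time_window, on_time_window_complete, repeat=True):
        self.time_window = time_window
        self.on_time_window_complete = on_time_window_complete
        self.repeat = repeat            # the response may set this to False
        self.next_fire = time_window    # assumption: the timer starts at time 0
        self.active = True

    def tick(self, now):
        """Run the response for every window that has completed by `now`."""
        while self.active and now >= self.next_fire:
            fire_time = self.next_fire
            self.on_time_window_complete(self, fire_time)
            if self.repeat:
                self.next_fire += self.time_window
            else:
                self.active = False

HOUR = 3600
open_events = [
    {"id": 1, "severity": "harmless", "opened_at": 0},
    {"id": 2, "severity": "critical", "opened_at": 0},
    {"id": 3, "severity": "informational", "opened_at": 47 * HOUR},
]

def cleanup(rule, now):
    # Drop harmless and informational events that have been open longer than 48 hours.
    open_events[:] = [
        e for e in open_events
        if not (e["severity"] in ("harmless", "informational")
                and now - e["opened_at"] > 48 * HOUR)
    ]

timer = TimerRule(time_window=30 * 60, on_time_window_complete=cleanup)
timer.tick(now=49 * HOUR)
print([e["id"] for e in open_events])
```

After 49 hours of 30-minute timer firings, the old harmless event has been cleaned up. The critical event is kept regardless of age, and the informational event is kept because it has been open for only two hours.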


Summary

This article provided an architectural overview of the IBM Active Correlation Technology and each of its components. We described the built-in correlation patterns, which cover most of the event correlation problems our customers need to address. Some examples showed how customers could typically use ACT for symptoms detection.

Expect this component to become a key integrating technology for applications and solutions that need to correlate Common Base Events and other event formats.

Resources
