Strategies for managing reference data in a business rules application using IBM Operational Decision Manager

This article describes common strategies for reference data management in rule authoring and rule execution when using IBM® Operational Decision Manager. It describes the advantages and disadvantages of each approach to help you determine the best one for your business rule application. This content is part of the IBM Business Process Management Journal.

Jerome Boyer (boyerje@us.ibm.com), Senior Technical Staff Member, IBM

Jerome Boyer is an IBM expert on Enterprise Business Rule Management Systems in BPM, SOA and Complex Event Processing deployments. As an STSM, Jerome is the lead BRMS BPM solution architect in IBM Software Services for WebSphere (ISSW). Jerome is the author of "Agile Business Rule Development", published by Springer, 2011.



Zhuo (Joe) Fu (jfu@ca.ibm.com), Senior Managing Consultant, IBM

Zhuo (Joe) Fu is a senior managing consultant and solution architect for Business Rule Management Systems. Zhuo is an IBM expert on WODM technologies and the lead consultant in delivering BRMS solutions and services in North America, with more than 15 years of experience in the architecture, design and implementation of business rules applications, ranging from single dedicated applications to large-scale, enterprise-wide IT solutions for Telecom, Finance, Insurance, Manufacturing and Government.



13 February 2013


Overview

All business rule applications use reference data, which is at its simplest a list of values but can also include code, value, type, subtype, effective dates, and so on. The code element is used in the condition or in the action part of the rule. When using IBM Operational Decision Manager (ODM), reference data integration must be done in the authoring components, but may also be done in the runtime. Addressing this integration is an important concern during design and implementation. In this article, you'll learn about the most commonly used strategies for reference data management for both rule authoring and rule execution when using ODM.

The key technical design and implementation considerations presented in this article are based on IBM ODM V8, but also apply to WebSphere® ILOG JRules V7.x. The approaches presented in this article are product version agnostic. We assume the readers have a good knowledge of basic ODM concepts.

Reference data used in business rule applications is usually in the format of a list of single values, such as lists of product types, marketing categories, address types, medical procedure codes, adjustment codes, and so on. Those lists are enumerated. Sometimes the reference data is a combination of value, description, and effective dates. For example, location list, U.S. state list, geography region, and error codes may fit into the format:

  • Code
  • Description
  • Effective date
  • Expiration date

Most lists of values are static and do not change over time. However, some do change and could even be considered very volatile. Some lists have timestamp-based validity. More complex structure may be required for these lists. The list of requirements for managing reference data in a business rule application can be long and includes items such as:

  • How and where to persist data
  • What is the usage pattern within rules
  • Is there any business logic associated with defining the list of values
  • What is the data update strategy
  • Who is the owner
  • What is the cost of development and maintenance of such values

All of these constraints have to be evaluated and designed for at both the rule execution and rule authoring levels. There is no unique best approach to addressing reference data integration within a BRMS. This article describes each of the different solutions.


Reference data use cases in rule processing

Reference data is used in the rule authoring environment to control the list of possible values the rule author can use. It is part of the rule vocabulary, defined as domain values. Figure 1 illustrates a condition statement on an expense type. Because the expense type is not strongly typed, users can enter any String value.

Figure 1. Loosely typed attribute - no domain value

Figure 2 shows the expense type with a controlled set of possible values: Hotel, Airfare, Meal, or Taxi.

Figure 2. Strongly typed attribute - domain value set

In a strongly-typed attribute, the user is constrained to the choices offered. There are different requirements and consequently different use scenarios for such a list of values, including:

  • Simple pick lists, such as lists of states, product types, or expense types.
  • Simple but long pick lists that may require hierarchical navigation or sorting and filtering mechanisms (for example, ZIP codes or medical procedure codes).
  • Combinations of name and value, where only the name is visible inside the rule but the value is tightly coupled with rule execution. For example, an error code value is set in the business object, whereas its description is used in the rules.
  • Reference data that is also used in other components and might need to be maintained separately from the business logic.
  • Reference data maintained by business users, preferably within the same authoring environment.
  • Real-time updates, to support a high frequency of change and to address special environment setups, for example to support localization.
  • Immediate runtime access, to support high performance constraints.
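The name-and-value scenario in the list above can be sketched in Java as an enum that pairs the label a rule author sees with the code stored in the business object. The expense types come from Figure 2; the code values ("EXP-01" and so on) and the method names are invented for the illustration.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical XOM enum: label shown in rules, code carried by the business object. */
enum ExpenseType {
    HOTEL("Hotel", "EXP-01"),
    AIRFARE("Airfare", "EXP-02"),
    MEAL("Meal", "EXP-03"),
    TAXI("Taxi", "EXP-04");

    private static final Map<String, ExpenseType> BY_CODE = new HashMap<>();
    static {
        // Index the constants by code once, so lookups at rule execution are in-memory
        for (ExpenseType type : values()) {
            BY_CODE.put(type.code, type);
        }
    }

    private final String label;
    private final String code;

    ExpenseType(String label, String code) {
        this.label = label;
        this.code = code;
    }

    String label() { return label; }

    String code() { return code; }

    /** Returns the matching type, or null when the code is unknown. */
    static ExpenseType fromCode(String code) {
        return BY_CODE.get(code);
    }
}
```

An enum like this fits only lists that rarely change, because every new value means a code change and a redeployment.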

Environments that support reference data management

Three different environments can be used to manage reference data in the context of a business rule application. The first one is the rule authoring environment, where the rule developer defining the rule vocabulary can use a static domain, a dynamic domain, or a custom value editor. The second environment is the data source used to save the list of values. It can include database tables and outside files such as XML documents, tabular worksheets, comma-separated values (CSV) files, and so on. The final environment supports the reference data exposure within the rule processing runtime. Different integration approaches may be appropriate when taking into account the caching techniques, the access protocol, the data availability, the refresh mechanism, the data size, the business requirements and the usage patterns. In many business rule applications or decision service implementations, all three environments may be used. Let's take a look at each environment.

Rule authoring approach

In the rule authoring environment (for example, Rule Designer or Decision Center), the rule author selects a value as part of the list of values to complete the condition or action in the rule language. The reference data or pick lists are defined, in IBM ODM, as domains:

Static domains
The list of values is defined inside a Business Object Model (BOM). Whenever changes are required, the XOM/BOM needs to be adapted using Rule Designer. Static domains are suitable when updating is rare.
Dynamic domains
The list of values is defined as BOM domain values but the data source is external. The BOM domain may be updated by the rule author at design time inside Decision Center or Rule Designer. This approach is suitable when changes are common and the BOM can be changed easily. No XOM change is needed because the attribute stays within a primitive type like String, int, and so on.
Value editor
The editor is defined for a BOM attribute. When a rule author clicks within a vocabulary token, the editor accesses the list of values in memory. The access can be done directly from the data source or using cache in the Decision Center or Rule Designer JVM. Any changes to the data source are immediately reflected within the rule vocabulary. This option is suitable when changes to the reference data are done very frequently. In Figure 3, the promotional offer has a value editor defined for it.
Figure 3. Value editor for BOM elements

Both the dynamic domain and value editor approaches require an external data source to manage the reference data. Table 1 summarizes the pros and cons for each approach for reference data management.

Table 1. Comparison of the different reference data support approaches
(+ marks an advantage, - marks a drawback)

Reusability
  • Static domain (-): Maintained inside BOM, no reuse with external application.
  • Dynamic domain (+): Maintained inside BOM with access to remote data source. Reference data can be reused and is synchronized with BOM at design time. No XOM change.
  • Value editor (+): Used for a real-time access to values during authoring. The BOM does not hold any values. Excellent reuse. No XOM change.

Implementation effort
  • Static domain (+): Light effort, limited to changing the BOM.
  • Dynamic domain (-): Requires developing Java classes to access the data source. JDBC coding (not complex). Once done the refresh is very easy.
  • Value editor (-): Requires multiple Java classes and configuration, including JDBC and UI code.

Maintenance effort
  • Static domain (-): Changes done inside BOM and Rule Designer.
  • Dynamic domain (-): Changes done in data source and in the BOM using synchronization function.
  • Value editor (+): Changes done in unique place, the database.

Authoring performance
  • Static domain (+): No need for data population. No performance impact at runtime or authoring time.
  • Dynamic domain (+): Data population and refresh are on demand, and there are no performance issues.
  • Value editor (-): Data is populated in real time. Data access and filtering may add performance overhead. Caching techniques or tools can be used.

Impact to deployment and execution
  • Static domain (+): Data is held inside BOM, change to the value enforces deploying the ruleApp.
  • Dynamic domain (+): Data is held inside BOM, change to the value enforces deploying the ruleApp.
  • Value editor (-): Data is not fully held inside BOM. Whenever a domain changes, the rule execution could fail.

Data source approach

Reference data, or lists of values, are shareable, reusable components. Any architecture should support centralized definition and management of such data. Master Data Management (MDM) products support this centralized management and offer services to access the data from anywhere.

When not using MDM, a database or external property files can be used to persist data, but it's also possible to leverage the ODM rule repository. In that case, the reference data is defined in decision tables, and the BOM uses special dynamic domain providers (see Reference data implementation). The advantage of this approach is that no additional effort is required to maintain a separate data source and data access layer. However, the implementation of the Java® value provider is more complex and uses the ODM product API.

External database tables are the most common approach to enforce reference data reusability across multiple applications. This data source type is applicable for either the dynamic domain or custom value editor approach. Data can be easily shared with other applications using JDBC or, better, by exposing a data access layer using SOAP over HTTP or RESTful services. In most cases, the database approach requires the most development effort and maintenance control.

External files, such as XML documents or Microsoft® Excel®-based worksheets, are often used for ad hoc reference data management. This option is applicable for either the dynamic domain or custom value editor approach. Common Office®-based tools are used to manage the data, so no specific user interface is needed and business users can update the values easily. However, data consistency is a real maintenance challenge, which is often not acceptable for a long-term sustainable architecture.

Table 2 provides an assessment of the different data source options.

Table 2. Comparison of reference data source options
(+ marks an advantage, - marks a drawback)

Reusability
  • Separate DB table (+): Easy to change the data. Controlled central management, which can easily be reused by multiple applications.
  • File-based, especially Excel files (-): Version management is a real challenge, as are sharing and governance.
  • Decision table inside ODM (+): Easy to change the data and maintain it in the same way and same place with business rules.

Implementation effort
  • Separate DB table (-): DB table definition combined with custom UI for maintaining table content.
  • File-based, especially Excel files (+): No implementation effort because it is supported by ODM features.
  • Decision table inside ODM (-): Requires specific implementation effort leveraging ODM API.

Maintenance effort
  • Separate DB table (-): To support business users maintaining reference data, UI must be developed.
  • File-based, especially Excel files (+): No maintenance cost.
  • Decision table inside ODM (+): No maintenance cost.

Authoring performance and execution deployment are not impacted by the data source type: the lists of values are defined as a BOM domain, so they are deployed as part of the ruleApp. One key aspect of selecting the best reference data approach is to address who is going to do the maintenance, particularly whether it is the same group of people as the one maintaining the rules. In that case, defining the data in the same user interface as the rules, Decision Center, makes sense. Finally, the solution has to support the change velocity for both data and rules: the two need to change in step.

Rule execution approach

When the reference data is declared inside either rules or BOM attributes, there is no additional integration to do because the rule at runtime contains the translated value. The following snippet is an ILOG Rule Language (IRL) example showing a test on a string, where the string constant was automatically set by a BOM domain translation.

when {
    ….
    evaluate (ReferenceDataManager.inAdjustmentCodes(line.adjustmentCode)
    …);
} then { …

As a best practice, a helper such as ReferenceDataManager needs to hold its data locally. For obvious performance reasons, it is not practical to make a remote call in the condition or action part of a rule. The fundamental approach is to make all the data available before the rule execution, using data caching techniques in the XOM Java object. If there is a cache, you need to assess the cache update strategy:

  • With server and application start up
  • Update on demand via a session API, an MBean, a message, or a product such as WebSphere eXtreme Scale
  • Update automatically via daemon processing or other clever mechanisms
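The three update strategies above can be sketched in plain Java as follows. This is a minimal illustration, not ODM code: the class name RefreshableCodeCache, the Supplier-based loader, and the single-threaded scheduler are all assumptions. The cache swaps in a complete new snapshot atomically, so a rule that is executing never sees a half-loaded list.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Supplier;

/** Hypothetical local cache of code -> description pairs with three refresh strategies. */
final class RefreshableCodeCache {
    private final Supplier<Map<String, String>> loader;
    private final AtomicReference<Map<String, String>> snapshot =
            new AtomicReference<>(Collections.<String, String>emptyMap());

    RefreshableCodeCache(Supplier<Map<String, String>> loader) {
        this.loader = loader;
        refresh(); // strategy 1: load when the server or application starts
    }

    /** Strategy 2: on-demand refresh, for example called from an MBean or a message listener. */
    void refresh() {
        Map<String, String> fresh = new HashMap<>(loader.get());
        snapshot.set(Collections.unmodifiableMap(fresh)); // atomic swap of the whole list
    }

    /** Strategy 3: automatic refresh at a fixed period; the caller shuts the scheduler down. */
    ScheduledExecutorService scheduleRefresh(long period, TimeUnit unit) {
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        scheduler.scheduleAtFixedRate(() -> {
            try {
                refresh();
            } catch (RuntimeException e) {
                // Keep serving the previous snapshot if the data source is unavailable
            }
        }, period, period, unit);
        return scheduler;
    }

    /** In-memory membership test, cheap enough to call from a rule condition. */
    boolean contains(String code) {
        return snapshot.get().containsKey(code);
    }
}
```

Catching the exception inside the scheduled task matters because scheduleAtFixedRate stops rescheduling a task once it throws.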

An alternative approach is to use the ODM BOM2XOM (B2X) mapping layer. For example, to check whether a skill code belongs to a skill code group, the skill code domain could be defined by a B2X mapping, as shown in Figure 4.

Figure 4. A sample B2X mapping for domain value

Reference data implementation

In this section, we'll focus on the dynamic domain and custom value editor implementations because they represent the more complex integration. We'll consider the following requirements:

  • All the reference data should be stored and maintained in a centralized data source.
  • The data includes different lists of values, with different requirements for change management: from rare to very frequent.
  • Although a common BOM project can be used to support multiple rule projects, some reference data might be specific to individual rule projects, which are maintained by a specific group of users.
  • Reference data needs to support localization; in particular, the description visible inside the pick list within the rule authoring needs to be localized.
  • Some reference data is maintained by the rule authors and supports the same deployment model as the rules.

Dynamic domains

Dynamic domains are the domains in which values are stored in an outside data source and set dynamically by the execution of Java-based data access code called a domain value provider. A dynamic domain is only dynamic in the sense that you can dynamically synchronize the domain with the defined data source. After the synchronization, the domain becomes static inside the BOM and requires a new update if any changes occur within the data source. To implement a dynamic domain in ODM, you need to implement the value provider interface IlrBOMDomainValueProvider. Figure 5 illustrates the dynamic domain mechanism.

Figure 5. Dynamic domain mechanism in ODM

The ilog.rules.shared.bom.IlrBOMDomainValueProvider implementation class is referenced as the domainValueProvider custom property of the BOM class, as shown in Figure 6.

Figure 6. Reference the implementation of the value provider

It's interesting to note that the BOM class could be virtual and mapped to a java.lang.String; there is no need to have an XOM class. The value provider is invoked when a synchronization request is triggered for the related domain. The implementation class needs to address how to connect to the external data source and how to load and cache the data, and it should implement callback methods such as getBOM2XOMMapping(String name) and getDisplayText(…), which are used by rule editors and translators.

When deployed in Rule Designer, the code needs to be packaged as an Eclipse plugin. You can get a BOM populate sample by importing the plugin called ilog.rules.studio.samples.bomdomainpopulate.

ODM has a set of predefined features to support dynamic domains, like a value provider to load domain data from an Excel file. The worksheet includes a column for the values, one for the label, and one for the BOM to XOM mapping.
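To make the three-column layout concrete, here is a sketch of the data a domain value provider has to serve, read from comma-separated lines instead of a worksheet. It deliberately does not implement IlrBOMDomainValueProvider: the class DomainRows is hypothetical, its method names only mirror the callbacks named above, and the exact interface signatures should be taken from the ODM API documentation. The B2X column content shown in the usage note (a return statement yielding the XOM value) is also an assumption.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical holder for domain rows: value name, display label, B2X mapping. */
final class DomainRows {
    // value name -> {label, B2X mapping}, kept in source order
    private final Map<String, String[]> rows = new LinkedHashMap<>();

    /** Each line is "value,label,b2x"; values and labels must not contain commas (sketch limitation). */
    DomainRows(List<String> lines) {
        for (String line : lines) {
            String[] cols = line.split(",", 3);
            if (cols.length == 3) {
                rows.put(cols[0].trim(), new String[] { cols[1].trim(), cols[2].trim() });
            }
        }
    }

    /** Names of all domain values, in source order. */
    Collection<String> getValues() {
        return new ArrayList<>(rows.keySet());
    }

    /** Label shown to the rule author, or null when the value is unknown. */
    String getDisplayText(String valueName) {
        String[] row = rows.get(valueName);
        return row == null ? null : row[0];
    }

    /** B2X mapping body for the value, or null when the value is unknown. */
    String getBOM2XOMMapping(String valueName) {
        String[] row = rows.get(valueName);
        return row == null ? null : row[1];
    }
}
```

For example, the line `Hotel,Hotel stay,return "HTL";` would yield a domain value named Hotel, displayed as "Hotel stay" and translated to the string "HTL" at execution time.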

Custom value editor

You should consider a custom value editor when reference data updates are required without triggering on-demand manual updates, when the list of values is huge and it doesn't make sense to overload the BOM size, or to improve usability (for example, sorting and filtering capabilities), or when the list is a configurable pick list (that is, the pick list is dynamically populated based on selection criteria, such as a specific user profile or a specific language locale). Implementation of a custom value editor should follow these guidelines:

  • A generic design is recommended to handle different list types without implementing multiple value editors. Data for these domains might come from different data sources and table schemas, with value and label columns, localization, effective date, and expiration date.
  • Common usability support such as filtering can be designed and integrated with all the value editors.
  • Use an external configuration file to define data source connection properties and any other properties.
  • Assess how to access the data using pure JDBC code, Java persistence API (JPA) implementation, or web service protocols.
  • Clearly separate the access to data from the different packaging and integration patterns offered by ODM: Decision Center and Rule Designer Eclipse plugin.
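The guideline on common usability support can be illustrated with a small filtering helper that every value editor could share. This is a sketch only: the class PickListFilter and its behavior (case-insensitive "contains" match, alphabetical order) are assumptions, and it knows nothing about the ODM editor API, which keeps it reusable from both Rule Designer and Decision Center.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Locale;

/** Hypothetical helper shared by all value editors: narrows a pick list as the author types. */
final class PickListFilter {
    private PickListFilter() { }

    /** Returns the labels containing the typed text (case-insensitive), sorted alphabetically. */
    static List<String> filter(List<String> labels, String typed) {
        String needle = typed == null ? "" : typed.trim().toLowerCase(Locale.ROOT);
        List<String> matches = new ArrayList<>();
        for (String label : labels) {
            if (label.toLowerCase(Locale.ROOT).contains(needle)) {
                matches.add(label);
            }
        }
        Collections.sort(matches, String.CASE_INSENSITIVE_ORDER);
        return matches;
    }
}
```

Because the helper works on plain strings, the same class can sit in the shared data access layer and be called from either packaging.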

The following diagram presents the different components of the value editor in both environments:

Figure 7. Custom value editor design guidelines

ODM Decision Center

Decision Center supports almost the same tooling for value providers and value editors. As illustrated in Figure 8, it is possible to reuse the same data access layer and logic developed for Rule Designer, but with an implementation against a different API.

Figure 8. Value editor component in the ODM Decision Center

The ODM Decision Center (or Rule Team Server) supports two important extensions you can leverage to implement reference data management in Decision Center: you can add a custom tab in the user interface and you can update the dynamic domain using an API. The typical approach for managing reference data inside the ODM Decision Center is to add a set of JSPs to implement reading, adding, and updating values in a data source. The JSPs are presented as a tab extension, and the teamserver.war is repackaged with those pages.

It's possible to expose such a tab or function only to specific user roles. The rule writers with "rtsConfigurator" role would be able to access the new Reference Data Management tab, shown in Figure 9.

Figure 9. Reference Data Management tab

The basic components needed for this approach are a BOM definition supporting the different value provider references for the different BOM domains, the code for accessing the data source and the code to present the value, code, and B2X mapping for each list of values, and finally the set of JSPs to support the end user interface.

Figure 10 illustrates the basic component view for ODM Decision Center, which enables you to manage reference data and business rules in a single central front-end.

Figure 10. Custom UI with BOM link

Figure 11 presents a possible user interface layout, where three buttons are offered to load a list of values, add a new value, or edit an existing value. The domain name field defines the name of the list of values, where the domain type could specify the basic primitive type of the domain like string or decimal. The DomainList table includes the definition of domain names and types, and references the table name used to save other values. For example, the table DM_DS_EXP_TYPE includes a definition of the expense type domain.

Figure 11. Basic UI, table schema, and value provider

The bottom of the screen enables the user to enter new values. The number of columns (Name, Value) may differ according to the type selected. A mapper component will save the values in one of the DB tables. If a business user defines a new domain not specified in the current DB, you can provide a dynamic structure to support adding new domains, and save it. The Submit button calls the Decision Center API to update the domain values of the corresponding BOM classes.

Figure 12 shows an example of a more sophisticated user interface that supports several different types of data that could require quite different variations due to the nature of the data and maintainability requirements.

Figure 12. Reference data management tab design sample

With this set of capabilities, reference data management can support a rich set of features like full synchronization between domain values update and rule vocabulary, custom attributes like effective date and localization, security control access to the data, and multi-tenancy (that is, data grouping and structure per tenant). As illustrated in Figures 11 and 12, Decision Center supports adding a new tab, which may include sub-tabs, each tab responsible for managing one set of reference data. The implementation could leverage JSF-managed beans to clearly separate the user interface component from the data access. The data access object is integrated in the managed beans. The class diagram in Figure 13 illustrates such a design.

Figure 13. JSF-managed bean design for reference data tab

Rule execution considerations

The best practice for data access in rule execution is to make all the data available before the rule execution occurs; that is, to pass all data into the Decision Server upon execution. This allows for the rule engine to act in a stateless manner, for which it is best suited. Similarly, the same principle must be followed for reference data processing in rule execution. The discussion here focuses on the case where the rule checks whether an attribute is part of a list of values. The ReferenceDataManager class, shown below, caches the list of values and offers a static search operation for each type of value (such as adjustment).

evaluate
(ReferenceDataManager.getInstance().inAdjustmentCodes(line.adjustmentReasonCode));

To avoid any real-time data access, all the reference data required by the rule execution is available via the data access object (DAO) and cached within the Decision Service in a simple Java Map object. Loading the lists of values is done before the rule execution occurs and centrally managed in a ReferenceDataManager class, which follows the singleton pattern, in which the same Java objects are shared by multiple concurrent rule executions (running in multiple rule engines within the same JVM).
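A plain-Java version of this arrangement might look like the sketch below. To avoid confusion with the article's ReferenceDataManager, the class is called LocalReferenceData; it, the AdjustmentCodeDao interface, and the load method are hypothetical names. The map is replaced as a whole and is read-only once published, so concurrent rule engines in the same JVM can share it safely.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical DAO: returns every adjustment code with its description. */
interface AdjustmentCodeDao {
    Map<String, String> loadAdjustmentCodes();
}

/** Hypothetical singleton holding reference data in memory for rule conditions. */
final class LocalReferenceData {
    private static final LocalReferenceData INSTANCE = new LocalReferenceData();

    // volatile so that a reload is visible to all rule engine threads
    private volatile Map<String, String> adjustmentCodes = Collections.emptyMap();

    private LocalReferenceData() { }

    static LocalReferenceData getInstance() {
        return INSTANCE;
    }

    /** Call before any ruleset execution, for example at application start-up. */
    void load(AdjustmentCodeDao dao) {
        adjustmentCodes = Collections.unmodifiableMap(new HashMap<>(dao.loadAdjustmentCodes()));
    }

    /** Pure in-memory test; no data source access happens during rule execution. */
    boolean inAdjustmentCodes(String code) {
        return code != null && adjustmentCodes.containsKey(code);
    }
}
```

Passing the DAO into load, rather than creating it inside the singleton, keeps the data access replaceable by a stub in unit tests of the rules.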

Combining event processing

One of the challenges is to update the cache when the data source is modified. The simplest solution is to let the decision service implementation load the data at startup, and enforce restarting it when the data changes. Obviously this may not satisfy requirements for more frequent updates and high availability; therefore, because ODM includes an event runtime engine and an Event Designer, you can leverage these technologies to define the following:

  • A scheduled event in the event runtime
  • A business object to support <code,value> definitions
  • Web service integration to remotely access the central reference data manager
  • A local service in RES to offer update cache operations

Figure 14 illustrates such an approach using both the Event Server and the Rule Server.

Figure 14. Scheduled reference data access and update

The key activities for the scheduled data access and update are illustrated in the sequence diagram shown in Figure 15. A scheduled trigger event kicks off the data access and update once a day (for example, at midnight). The associated action loads the reference data via a web service call, updates the reference data held as business objects in the Event Server, and then calls the RefData Update operation in RES, which stores the data as Java objects inside a static singleton object.

Figure 15. Scheduled Reference Data Update Sequence Diagram

It's important to note that the call to the Ref Data Create/Update web service needs to be carried out for all of the physical rule execution servers within the environment. This is because the web service is dealing with the Java object cache, which is required to be available in each JVM of the Rule Server. In this case, the approach must address how the multiple calls can be executed, and whether there is any synchronization required among these separate calls. Figure 16 illustrates the actual web service call between the Event Server and Rule Servers in a production environment.

Figure 16. Reference data update in a production environment
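The need to reach every rule execution server can be sketched as a simple fan-out loop. The class CacheUpdateBroadcaster is hypothetical, and the actual web service invocation is abstracted behind a Consumer so that the sketch stays independent of any particular client stack. One design choice is assumed here: a failure on one server does not stop the updates to the others, and the failed servers are returned so the caller can retry or raise an alert.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

/** Hypothetical fan-out of the cache update call to each rule execution server. */
final class CacheUpdateBroadcaster {
    private CacheUpdateBroadcaster() { }

    /**
     * Invokes updateCall once per server URL.
     * Returns the URLs whose call failed; an empty list means every cache was updated.
     */
    static List<String> updateAll(List<String> serverUrls, Consumer<String> updateCall) {
        List<String> failed = new ArrayList<>();
        for (String url : serverUrls) {
            try {
                updateCall.accept(url);
            } catch (RuntimeException e) {
                failed.add(url); // keep going so the remaining servers still get the update
            }
        }
        return failed;
    }
}
```

Until the failed servers are updated, different JVMs may answer the same rule with different reference data, which is the synchronization question raised above.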

Leveraging WebSphere eXtreme Scale

An alternative solution to reference data management is to leverage WebSphere eXtreme Scale to manage the reference data cache. eXtreme Scale provides a data grid, in the form of <key,value> maps, to store any data across multiple servers, which is important when using multiple decision services for rule processing. There are multiple usage patterns for eXtreme Scale, but for our case, we'll consider using the inline caching pattern, in which the decision service and the authoring environment load data entries by key as they are requested. This inline caching pattern should support most of the requirements for reference data access, and provides powerful capabilities to scale the rule processing architecture. The grid topology can be adapted with distributed caches, where data stored in the grid is spread across all the JVMs.
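The inline caching idea, where entries are loaded by key as they are requested, can be shown with the JDK alone. The sketch below is not eXtreme Scale code: ReadThroughCache is a hypothetical single-JVM class that only illustrates the access pattern, whereas the grid additionally distributes and manages the entries across servers.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.function.Function;

/** Hypothetical single-JVM illustration of inline (read-through) caching. */
final class ReadThroughCache<K, V> {
    private final ConcurrentMap<K, V> entries = new ConcurrentHashMap<>();
    private final Function<K, V> loader;

    /** The loader fetches one entry from the data source, or returns null if it does not exist. */
    ReadThroughCache(Function<K, V> loader) {
        this.loader = loader;
    }

    /** Returns the cached value, loading it on the first request for that key. */
    V get(K key) {
        return entries.computeIfAbsent(key, loader);
    }

    /** Drops one entry so that the next request reloads it from the data source. */
    void invalidate(K key) {
        entries.remove(key);
    }
}
```

Only the first request for a key pays the data access cost; later requests, including those from rule conditions, are served from memory.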

The eXtreme Scale API is very simple to use, and can be integrated in the decision service Java implementation, as shown in Figure 17. The ObjectMap is the main class used to store reference data maps, and it is well adapted to a non-hierarchical model like a list of values.

Figure 17. eXtreme Scale integration with a decision service

The data loading can be done in the same JVM as the decision service or remotely in another grid container (ObjectGrid) using any JDBC code. The BackingMap contains cached objects that have been stored in the grid. An ObjectMap and a BackingMap are related through a grid session. The reference data manager uses the object map and session to access the different maps of reference data. When the singleton object is initialized, the code connects to the grid catalog first, then obtains the grid, and then opens a session to access all the deployed maps. This singleton class is defined in the ODM XOM, and the method inAdjustmentCodes is verbalized in the BOM, as shown in the code snippet below.

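import com.ibm.websphere.objectgrid.ClientClusterContext;
import com.ibm.websphere.objectgrid.ObjectGrid;
import com.ibm.websphere.objectgrid.ObjectGridException;
import com.ibm.websphere.objectgrid.ObjectGridManager;
import com.ibm.websphere.objectgrid.ObjectGridManagerFactory;
import com.ibm.websphere.objectgrid.ObjectMap;
import com.ibm.websphere.objectgrid.Session;

// ReferenceCode is the application's own value class (not shown in this article)
public class ReferenceDataManager {
    private static final ReferenceDataManager instance = new ReferenceDataManager();

    // Each reference list of values is a map
    private ObjectMap adjustmentCodes;
    private ClientClusterContext ccc;
    private ObjectGrid grid;
    private Session session;

    private ReferenceDataManager() {
        try {
            ObjectGridManager manager = ObjectGridManagerFactory.getObjectGridManager();
            // Connect to the catalog server
            ccc = manager.connect("cataloghostname:2809", null, null);
            // Retrieve the ObjectGrid client connection
            grid = manager.getObjectGrid(ccc, "refdatagrid");
            session = grid.getSession();
            // Get the ObjectMap(s)
            adjustmentCodes = session.getMap("AdjustmentCodeMap");
            // Add more maps if needed …
        } catch (ObjectGridException e) {
            // The original listing elides this handler; failing fast here is an assumption
            throw new IllegalStateException("Cannot connect to the reference data grid", e);
        }
    }

    // Accessor used by the rule condition shown earlier
    public static ReferenceDataManager getInstance() {
        return instance;
    }

    // Returns the reference code stored under the given key, or null if there is none
    public ReferenceCode searchAdjustmentCodes(String code) {
        ReferenceCode rc = null;
        try {
            rc = (ReferenceCode) adjustmentCodes.get(code);
        } catch (ObjectGridException e) {
            e.printStackTrace();
        }
        return rc;
    }

    // Verbalized in the BOM; true when the code belongs to the adjustment code list
    public boolean inAdjustmentCodes(String code) {
        return searchAdjustmentCodes(code) != null;
    }
}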

The code above is only an example and should not be considered production code. It can be improved by externalizing the different string parameters used, and by doing better exception management.


Conclusion

IBM Operational Decision Manager offers a nice set of features to define domain values as part of the rule vocabulary, from static enumerated domains to dynamic ones, complete with rich value editors. The domains are defined in the BOM definition or when using the value editor, and values are tested in the rules. In some cases, lists of values are part of an external data source, and rule conditions look at a value as part of a list. This pattern needs to be evaluated very early in the development process because it may require complex caching and cache management use cases to support it. Reference data management is always an element of business rule application development, so failure to consider it during the analysis and design of the ruleset will impact rule maintenance over time.
