Use Drools and JPA for continuous, real-time data profiling

Program JPA POJOs as facts in your Drools 5 working memory

You can integrate Drools with JPA and Spring-based application code, and you can do it without resorting to intrusive imperative-style programming. Learn how to cost-effectively program business requirements into your real-time system monitoring and continuous data profiling process with POJOs. Author Xinyu Liu shares his expertise in Java™ persistence and business integration technologies, including advanced tips for using Drools 5 to build applications that are memory efficient and bulletproof.

Xinyu Liu (dr.xyliu@gmail.com), VP Product Development, eHealthObjects

As a Sun Microsystems certified enterprise architect, Xinyu Liu has intensive application design and development experience with cutting-edge server-side technologies. He took his graduate degree from George Washington University and currently serves as VP Product Development for eHealthObjects, a healthcare technology company providing innovative healthcare products, solutions, services, and exchange platforms that can seamlessly be integrated with other systems, services, and applications. Dr. Liu has written for Java.net, JavaWorld.com, and IBM developerWorks on topics such as JSF, Spring Security, Hibernate Search, Spring Web Flow, and the Servlet 3.0 specification. He also has worked for Packt Publishing reviewing the books Spring Web Flow 2 Web Development, Grails 1.1 Web Application Development, and Application Development for IBM WebSphere Process Server 7 and Enterprise Service Bus 7.



05 July 2012


Enterprise developers tasked with managing complex workflows, business rules, and business intelligence are quick to realize the value of an enterprise platform that integrates a workflow engine, enterprise service bus (ESB), and rule engine. Until now, this sweet spot has been filled by commercial products like IBM WebSphere® Process Server/WebSphere Enterprise Service Bus (see Resources) and Oracle SOA Suite. Drools 5, from JBoss Community, is an open source alternative that seamlessly integrates a jBPM workflow engine and rule engine via a set of unified APIs and a shared, stateful knowledge session.

For beginners

Note that this article assumes that you are familiar with the Spring platform, Java Persistence API, and the fundamentals of Drools. See Resources for more introductory articles on any of these topics.

The Drools 5 Business Logic Integration Platform consists primarily of Drools Expert and Drools Fusion, which together comprise the platform's rule engine and infrastructure for complex event processing/temporal reasoning. The sample application for this article is built from these core features. See Resources to learn more about additional packages available in Drools 5.

POJOs in Drools 5

Plain old Java objects (POJOs) were first notably implemented in the Spring framework. POJOs, along with dependency injection (DI) and aspect-oriented programming (AOP), marked a return to simplicity that effectively boosted Spring as an industry standard for developing web applications. POJO adoption flowed from Spring into EJB 3.0 and JPA, and from there into XML-to-Java binding technologies like JAXB and XStream. More recently, POJOs were integrated into Lucene, the full-text search engine, via Hibernate Search (see Resources).

Today, as a result of these incremental advances, an application's POJO data model can be propagated over multiple tiers and exposed directly via web pages or SOAP/REST web service endpoints. As a programming model, POJO is both cost-effective and non-intrusive, saving developers time while simplifying enterprise architectures.

Now Drools 5 has taken POJO programming simplicity to its next level by allowing programmers to insert POJOs as facts directly into a knowledge session, or what a rule engine terms "working memory." This article introduces a cost-effective and non-intrusive approach that manipulates JPA entities as facts in Drools working memory. Continuous, real-time data profiling has never been so easy.

A Drools programming challenge

Many health care providers use a case-management system as a cost-effective way to keep track of medical records such as care, prescriptions, and evaluations. Our example program, based on such a system, has the following flow and requirements:

  • Cases are circulated among all of the clinicians in the system.
  • Clinicians are accountable for at least one evaluation task per week, or a notification is sent to the clinician's supervisor.
  • The system automatically schedules evaluation tasks for clinicians.
  • If a case has not been evaluated for more than 30 days, then a reminder is sent to all clinicians in the case group.
  • If there is no response, the system will take actions defined by the system's business rules, such as notifying the clinician group of the issue and proposing another schedule.

Choosing a business process management (BPM) workflow and rule engine for this use case makes sense: the system uses data profiling/analysis rules (italicized in the above list), each case could be treated as a long-running process/workflow in jBPM, and we could use Drools Planner for the automatic-scheduling requirement. For the purpose of this article, we'll just focus on the program's business rules. Let's also say that the system demands that reminders and notifications be generated instantaneously in real-time when a rule condition is met. It is thus a continuous, real-time data-profiling use case.

Listing 1 shows the three entity classes that are declared in our system: MemberCase, Clinician, and CaseSupervision:

Listing 1. Entity classes
@Entity
@EntityListeners({DefaultWorkingMemoryPartitionEntityListener.class})
public class MemberCase implements Serializable 
{
  private Long id; // pk
  private Date startDtm;
  private Date endDtm;
  private Member member; // not null (memberId)
  private List<CaseSupervision> caseSupervisions = new ArrayList<CaseSupervision>();
  //...
}
 
@Entity
@EntityListeners({DefaultWorkingMemoryPartitionEntityListener.class})
public class Clinician implements Serializable 
{ 
  private Long id; // pk
  private Boolean active;
  private List<CaseSupervision> caseSupervisions = new ArrayList<CaseSupervision>();
  //...
}

@Entity
@EntityListeners({SupervisionStreamWorkingMemoryPartitionEntityListener.class})
public class CaseSupervision implements Serializable 
{ 
  private Long id; // pk
  private Date entryDtm;
  private MemberCase memberCase;
  private Clinician clinician;
  //...
}

Each instance of MemberCase represents a patient case. Clinician represents the clinicians in the facility. A CaseSupervision record is created each time a case evaluation is conducted by a clinician. Together, these three entities are the fact types in the business rules to be defined. Also note that CaseSupervision above is declared as an event type in Drools.

From an application perspective, we could modify the entities of the three types from anywhere in the system, on different screens, and in different workflows. We could even use a tool like Spring Batch to batch-update the entities. For the sake of this example, however, let's assume that we'll update the entities solely through the JPA persistence context.

Note that the sample application is a Spring-Drools integration that uses Maven for builds. We'll look at some of the configuration details later in the article, but you can download the source zip anytime. For now, let's consider some conceptual features of working with Drools 5.

Fact and FactHandle

A general concept of rule engines is that facts are data objects that rules reason on. In Drools, facts are arbitrary Java beans that you take from the application and assert into the engine's working memory. Or, as it's written in the JBoss Drools reference manual:

The rule engine does not "clone" facts at all, it is all references/pointers at the end of the day. Facts are your applications data. Strings and other classes without getters and setters are not valid Facts and can't be used with Field Constraints which rely on the JavaBean standard of getters and setters to interact with the object.

Unless you have specified the keyword no-loop or lock-on-active on top of the rule, rules in the Drools rule engine will be re-evaluated any time a fact change occurs in working memory. You can also use @PropertyReactive and @watch annotations to specify fact properties that Drools should watch for changes. Drools will ignore updates on all other properties of the fact.

For the sake of truth maintenance, there are three ways to safely update a fact in Drools working memory:

  1. Inside a modify block on the right-hand side (RHS) of a rule, which in Drools syntax is the rule's action/consequence section. Use this approach when changing a fact as the consequence of a rule being activated.
  2. Externally through a FactHandle in a Java class; use this for fact changes made by application Java code.
  3. Having the fact class expose property-change notifications through PropertyChangeSupport, as defined by the JavaBeans specification; use this to register Drools as a PropertyChangeListener on the fact object.

Being quiet observers, our rules will not update any JPA entity facts in Drools working memory; rather, they will generate logical facts as reasoning results. (See Listing 6 below.) Updating JPA entities in rules needs extra attention, however, as the updated entities may be in a detached state, or it might be that no transaction or a read-only transaction is associated with the current thread. As a result, the changes made on the entities won't be saved to the database.

Although fact objects are pass-by-reference, Drools (unlike JPA/Hibernate) is incapable of tracking fact changes made outside of rules. You can avoid inconsistent rule-reasoning results by using FactHandle to notify Drools about fact changes made in the application Java code. Drools will then re-evaluate the rules appropriately. FactHandle is the token representing your asserted fact object in working memory. It is how you will normally interact with the working memory when you wish to modify or retract a fact. In our sample application (Listing 2 and Listing 3), we use FactHandle to manipulate entity facts in working memory.

You could work around Drools's inability to track fact changes by having fact classes publish their changes through PropertyChangeSupport (which captures every change made to a bean's properties). Keep in mind, though, that you would then have to address the performance hit of frequent rule re-evaluations.
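To make the PropertyChangeSupport option concrete, here is a minimal sketch that uses only the standard java.beans package. The ObservableCase class and the counting listener are hypothetical and are not part of the sample application; the listener merely stands in for whatever the rule engine would register on the fact.

```java
import java.beans.PropertyChangeListener;
import java.beans.PropertyChangeSupport;
import java.util.Date;

// Hypothetical fact class; not part of the sample application.
class ObservableCase {
  private final PropertyChangeSupport changes = new PropertyChangeSupport(this);
  private Date endDtm;

  public void addPropertyChangeListener(PropertyChangeListener listener) {
    changes.addPropertyChangeListener(listener);
  }

  public void removePropertyChangeListener(PropertyChangeListener listener) {
    changes.removePropertyChangeListener(listener);
  }

  public Date getEndDtm() { return endDtm; }

  public void setEndDtm(Date endDtm) {
    Date old = this.endDtm;
    this.endDtm = endDtm;
    // An event is fired unless old and new values are equal and non-null.
    changes.firePropertyChange("endDtm", old, endDtm);
  }
}

class ObservableCaseDemo {
  // Counts notifications; the lambda stands in for the rule engine's listener.
  static int countNotifications() {
    ObservableCase fact = new ObservableCase();
    int[] count = {0};
    fact.addPropertyChangeListener(evt -> count[0]++);
    fact.setEndDtm(new Date(1000L));
    fact.setEndDtm(new Date(2000L));
    fact.setEndDtm(new Date(2000L)); // equal value: no event is fired
    return count[0];
  }

  public static void main(String[] args) {
    System.out.println("notifications: " + countNotifications());
  }
}
```

Every effective setter call reaches the listener, which is exactly why this option can trigger far more rule re-evaluations than an explicit update through a FactHandle.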

Using JPA entities as facts

You can insert JPA entities as domain data objects, via POJO facts, into Drools's working memory. Doing so allows you to avoid data modeling for the Value Object/DTO layer, as well as the corresponding transformation layer between JPA entities and DTOs.

While using entities as facts will simplify your application code, you will have to pay extra attention to entity-life cycle phases. Entity facts should be held in either managed (persistent) or detached state. You should never insert transient entities into the Drools working memory because they are not saved in the database yet. Likewise, you should retract removed entities from working memory. Otherwise your application database and rule engine's working memory will get out of sync.

So that brings us to the million-dollar question: How should we efficiently notify the rule engine about entity changes made in the application code through FactHandle?

Imperative programming versus AOP

If we tried to meet this challenge with an imperative programming mindset, we would end up invoking the insert(), update(), and retract() methods on the knowledge session next to the corresponding JPA API methods. This approach would be an invasive use of the Drools APIs, and it would leave spaghetti code in the application. To make things worse, updated (dirty) entities in JPA are automatically synchronized with the database at the end of a read/write transaction, without any explicit invocation to the persistence context. How would we intercept and notify Drools upon these changes? Another option, polling entity changes in a separate process, like typical business intelligence (BI) tools do, would keep core business functions clean, but it would be difficult, costly to implement, and the results would not be instantaneous.

The JPA EntityListener is a kind of AOP interceptor that is well suited to our use case. In Listing 2, we'll define two EntityListeners that intercept all changes made to the three types of entities in the application. This approach keeps an entity's life cycle in JPA constantly synchronized with its life cycle in Drools.

Within the entity life cycle callback methods, we look up a FactHandle for the given entity instance, then update or retract the fact through the returned FactHandle depending on the JPA life cycle phase. If the FactHandle is missing on an entity update or persist, the entity is inserted into working memory as a new fact. If the FactHandle is missing on a JPA delete, the entity doesn't exist in working memory, so there is nothing to retract. The two JPA EntityListeners shown in Listing 2 are for two different entry points, or partitions, to the working memory. The first entry point is shared between MemberCase and Clinician, and the second one is for the CaseSupervision event type.

Listing 2. EntityListeners
@Configurable
public class DefaultWorkingMemoryPartitionEntityListener 
{
  @Value("#{ksession}") //unable to make @Configurable with compile time weaving work here
  private StatefulKnowledgeSession ksession;   
   
  @PostPersist
  @PostUpdate
  public void updateFact(Object entity)
  {       
    FactHandle factHandle = getKsession().getFactHandle(entity);
    if(factHandle == null)
      getKsession().insert(entity);
    else
      getKsession().update(factHandle, entity);
  }        
		   
  @PostRemove
  public void retractFact(Object entity)
  {
    FactHandle factHandle = getKsession().getFactHandle(entity);
    if(factHandle != null)
      getKsession().retract(factHandle);
  }
 
  public StatefulKnowledgeSession getKsession() 
  {
    if(ksession != null)
    {
      return ksession;
    }
    else
    {
      // a workaround for @Configurable
      setKsession(ApplicationContextProvider.getApplicationContext()
        .getBean("ksession", StatefulKnowledgeSession.class));
      return ksession;
    }
  }
  //...
}
 
@Configurable
public class SupervisionStreamWorkingMemoryPartitionEntityListener
{ 
  @Value("#{ksession}")  
  private StatefulKnowledgeSession ksession;   
	
  @PostPersist 
  // CaseSupervision is an immutable event, 
  // thus we don’t provide @PostUpdate and @PostRemove implementations.
  public void insertFact(Object entity)
  {   
    WorkingMemoryEntryPoint entryPoint = getKsession()
      .getWorkingMemoryEntryPoint("SupervisionStream");
    entryPoint.insert(entity);
  }        
  //...
}

Like AOP, the EntityListener approach in Listing 2 keeps the system's core business logic clean. Note that this approach requires one or more Drools global knowledge sessions to be injected into the two EntityListeners. We'll declare a knowledge session as a singleton Spring bean later in the article.

Tip: Shared global knowledge sessions

The shared global knowledge session essentially makes the EntityListener approach suitable for system-wide, BI data profiling and analysis requirements. It isn't as well suited to user-specific processes and rule executions like those used for online shopping systems, where the knowledge session would typically be generated on the fly to process user-specific data and then be disposed of.

Initializing working memory

When the application is launched, all the existing records of the three entity types are pre-loaded from the database to the working memory for rule evaluation, as shown in Listing 3. From then on, working memory will be notified of any changes made to the entities through the two EntityListeners.

Listing 3. Initializing working memory and running Drools queries
@Service("droolsService")
@Lazy(false)
@Transactional
public class DroolsServiceImpl 
{
  @Value("#{droolsServiceUtil}")
  private DroolsServiceUtil droolsServiceUtil;
    
  @PostConstruct
  public void launchRules()
  {
    droolsServiceUtil.initializeKnowledgeSession();
    droolsServiceUtil.fireRulesUtilHalt();    
  }
   
  public Collection<TransientReminder> findCaseReminders()
  {
    return droolsServiceUtil.droolsQuery("CaseReminderQuery", 
      "caseReminder", TransientReminder.class, null);
  }
   
  public Collection<TransientReminder> findClinicianReminders()
  {
    return droolsServiceUtil.droolsQuery("ClinicianReminderQuery", 
      "clinicianReminder", TransientReminder.class, null);
  }
}  
 
@Service
public class DroolsServiceUtil
{
  @Value("#{ksession}")
  private StatefulKnowledgeSession ksession;
            
  @Async
  public void fireRulesUtilHalt()
  {
    try{
      getKsession().fireUntilHalt(); 
    }catch(ConsequenceException e) 
    {
      throw e;
    }
  }
   
  public void initializeKnowledgeSession()
  {  
    getKsession().setGlobal("droolsServiceUtil", this);
    syncFactsWithDatabase();
  }

  @Transactional //a transaction-scoped persistence context
  public void syncFactsWithDatabase()
  {
    synchronized(ksession)
    {       
      // Reset all the facts in the working memory
      Collection<FactHandle> factHandles = getKsession().getFactHandles(
        new ObjectFilter(){public boolean accept(Object object)
        {
          if(object instanceof MemberCase)
            return true;
          return false;
        }
      });
      for(FactHandle factHandle : factHandles)
      {
        getKsession().retract(factHandle);
      }

      factHandles = getKsession().getFactHandles(
        new ObjectFilter(){public boolean accept(Object object)
        {
          if(object instanceof Clinician)
            return true;
          return false;
        }
      });
      for(FactHandle factHandle : factHandles)
      {
        getKsession().retract(factHandle);
      }           

      WorkingMemoryEntryPoint entryPoint = getKsession()
        .getWorkingMemoryEntryPoint("SupervisionStream");
      factHandles = entryPoint.getFactHandles();
      for(FactHandle factHandle : factHandles)
      {
        entryPoint.retract(factHandle);
      }               

      List<Command> commands = new ArrayList<Command>();
      commands.add(CommandFactory.newInsertElements(getMemberCaseService().findAll()));
      getKsession().execute(CommandFactory.newBatchExecution(commands));

      commands = new ArrayList<Command>();
      commands.add(CommandFactory.newInsertElements(getClinicianService().findAll()));
      getKsession().execute(CommandFactory.newBatchExecution(commands));    
	 
      for(CaseSupervision caseSupervision : getCaseSupervisionService().findAll())
      {
        entryPoint.insert(caseSupervision);
      }  
           
    }
  }
 
  public <T> Collection<T> droolsQuery(String query, String variable, 
    Class<T> c, Object... args)
  {
    synchronized(ksession)
    {       
      Collection<T> results = new ArrayList<T>();
      QueryResults qResults = getKsession().getQueryResults(query, args);  
      for(QueryResultsRow qrr : qResults)
      {
        T result = (T) qrr.get("$"+variable);
        results.add(result);
      }       
      return results;
    }
  }
}

A note about fireAllRules()

Note that we had the option to call fireAllRules() within each EntityListener's callback methods. In Listing 3, I simplified this by invoking the fireUntilHalt() method just once inside an eager-loaded Spring bean's @PostConstruct method. The fireUntilHalt() method is supposed to be invoked once in a separate thread (check out Spring's @Async annotation), after which it keeps firing rule activations until a halt is called. If there is no activation to fire, fireUntilHalt() will wait for an activation to be added to an active agenda group or rule-flow group.

I could have chosen to fire rules or even start processes in the application's Spring XML configuration file (shown below). However, I detected a possible thread-handling issue on the fireUntilHalt() method when I tried configuring it. The result was a "database connection closed error" when lazy-loading entity relationships during rule evaluation (see advanced topics).

Spring-Drools integration

Now let's take a moment to look at some configuration details of the Spring-Drools integration. Listing 4 is a snippet of the application's Maven pom.xml that includes dependencies for the Drools core, Drools compiler, and Drools Spring integration package:

Listing 4. Part of Maven pom.xml
<dependency>
  <groupId>org.drools</groupId>
  <artifactId>drools-core</artifactId>
  <version>5.4.0.Final</version>
  <type>jar</type>
</dependency>               
<dependency>
  <groupId>org.drools</groupId>
  <artifactId>drools-compiler</artifactId>
  <version>5.4.0.Final</version>
  <type>jar</type>
</dependency>
<dependency> 
  <groupId>org.drools</groupId> 
  <artifactId>drools-spring</artifactId> 
  <version>5.4.0.Final</version> 
  <type>jar</type> 
  <exclusions>
    <!-- The dependency pom includes spring and hibernate dependencies by mistake. -->	
  </exclusions>
</dependency>

Identity versus equality

In Listing 5, I configure a global stateful knowledge session as a singleton Spring bean. (A stateless knowledge session wouldn't work as a long-lasting session because it doesn't keep its state during iterative invocations.) An important setting to notice in Listing 5 is <drools:assert-behavior mode="EQUALITY" />.

In JPA/Hibernate, managed entities are compared by identity, while detached entities are compared by equality. Entities inserted into a stateful knowledge session quickly become detached from the JPA perspective, since a transaction-scoped persistence context, or even an "extended" or "flow-scoped" persistence context (see Resources), is ephemeral compared with the lifespan of the singleton stateful knowledge session. The same entity fetched through different persistence-context objects is a different Java object each time. By default, Drools uses identity comparison. Accordingly, when you look up the FactHandle of an existing entity fact in working memory through ksession.getFactHandle(entity), most likely Drools won't find a match. In order to match detached entities, we have to choose EQUALITY in the configuration file.

Listing 5. Part of the Spring applicationContext.xml
<drools:kbase id="kbase">
  <drools:resources>
    <drools:resource  type="DRL" source="classpath:drools/rules.drl" />
  </drools:resources>
  <drools:configuration>
    <drools:mbeans enabled="true" />
    <drools:event-processing-mode mode="STREAM" />
    <drools:assert-behavior mode="EQUALITY" />
  </drools:configuration>
</drools:kbase>
<drools:ksession id="ksession" type="stateful" name="ksession" kbase="kbase" />

See the application source code for more complete configuration details.
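The identity-versus-equality distinction can be illustrated with plain Java collections. The sketch below is only an analogy: CaseRecord is a hypothetical stand-in for a JPA entity, and the two maps are not how Drools stores its fact handles. It assumes the entity defines equals() and hashCode() on its primary key, which equality-based matching depends on.

```java
import java.util.HashMap;
import java.util.IdentityHashMap;
import java.util.Map;
import java.util.Objects;

// Hypothetical stand-in for a JPA entity; equality is based on the primary key.
class CaseRecord {
  private final Long id;

  CaseRecord(Long id) { this.id = id; }

  @Override
  public boolean equals(Object other) {
    if (this == other) return true;
    if (!(other instanceof CaseRecord)) return false;
    return Objects.equals(id, ((CaseRecord) other).id);
  }

  @Override
  public int hashCode() { return Objects.hashCode(id); }
}

class AssertBehaviorDemo {
  // Looks up 'probe' in a handle table keyed by object identity (==).
  static boolean foundByIdentity(CaseRecord stored, CaseRecord probe) {
    Map<CaseRecord, String> handles = new IdentityHashMap<>();
    handles.put(stored, "handle-1");
    return handles.containsKey(probe);
  }

  // Looks up 'probe' in a handle table keyed by equals()/hashCode().
  static boolean foundByEquality(CaseRecord stored, CaseRecord probe) {
    Map<CaseRecord, String> handles = new HashMap<>();
    handles.put(stored, "handle-1");
    return handles.containsKey(probe);
  }

  public static void main(String[] args) {
    CaseRecord first = new CaseRecord(42L);  // loaded by one persistence context
    CaseRecord second = new CaseRecord(42L); // same row, loaded by another
    System.out.println(foundByIdentity(first, second)); // false
    System.out.println(foundByEquality(first, second)); // true
  }
}
```

The two objects represent the same database row, yet only the equality-based lookup finds the existing entry, which mirrors why the EQUALITY assert behavior is needed for detached entities.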

Drools rules

Listing 6 defines two complex event processing (CEP) rules. Besides the two fact types as JPA entities, MemberCase and Clinician, the CaseSupervision entity class is declared as an event. Each case-evaluation task by a clinician generates a CaseSupervision record. Once created, the record is unlikely to experience any ongoing changes.

The condition of the rule Case Supervision in Listing 6 tests whether there has been a case supervision on the case in the last 30 days. If not, the consequence/action part of the rule generates a TransientReminder fact (defined in Listing 7) and logically inserts the fact into working memory. The rule Clinician Supervision dictates that a clinician should have completed at least one case supervision in the last seven days; if not, the consequence/action part of the rule generates a similar TransientReminder fact, which (likewise) is logically inserted into working memory.

Listing 6. Case supervision rules
package ibm.developerworks.article.drools;

import ibm.developerworks.article.drools.service.*
import ibm.developerworks.article.drools.domain.*
 
global DroolsServiceUtil droolsServiceUtil;

declare Today
  @role(event)
  @expires(24h)
end

declare CaseSupervision
  @role(event)
  @timestamp(entryDtm)
end

rule "Set Today"
  timer (cron: 0 0 0 * * ?)
  salience 99999  // optional
  no-loop
  when
  then
    insert(new Today()); 
end

rule "Case Supervision"
  dialect "mvel"
  when
    $today : Today()
    $memberCase : MemberCase(endDtm == null, startDtm before[30d] $today)
    not CaseSupervision(memberCase == $memberCase)
      over window:time(30d) from entry-point SupervisionStream
  then
    insertLogical(new TransientReminder($memberCase, (Clinician)null,
      "CaseReminder", "No supervision on the case in last 30 days."));
end
 
query "CaseReminderQuery"
  $caseReminder : TransientReminder(reminderTypeCd == "CaseReminder")
end
 
rule "Clinician Supervision"
  dialect "mvel"
  when
    $clinician : Clinician()
    not CaseSupervision(clinician == $clinician) 
      over window:time(7d) from entry-point SupervisionStream
  then
    insertLogical(new TransientReminder((MemberCase)null, $clinician, 
      "ClinicianReminder", "Clinician completed no evaluation in last 7 days."));
end
 
query "ClinicianReminderQuery"
  $clinicianReminder : TransientReminder(reminderTypeCd == "ClinicianReminder")
end

Note that the TransientReminder fact shown in Listing 7 is not a JPA entity but a regular POJO.

Listing 7. TransientReminder
public class TransientReminder implements Comparable, Serializable
{			
  private MemberCase memberCase;
  private Clinician clinician;
  private String reminderTypeCd;
  private String description;

  public String toString() 
  {
    return ReflectionToStringBuilder.toString(this);
  }

  public boolean equals(Object pObject) 
  {
    return EqualsBuilder.reflectionEquals(this, pObject);
  }

  public int compareTo(Object pObject) 
  {
    return CompareToBuilder.reflectionCompare(this, pObject);
  }

  public int hashCode() 
  {
    return HashCodeBuilder.reflectionHashCode(this);
  } 	
}

Facts versus events

Events are decorated facts with temporal metadata such as @timestamp, @duration, and @expires. The most significant difference between facts and events is that events are immutable in the context of Drools. If an event is subject to changes, the changes (described as "event data enrichment") should not affect the results of a rule execution. That is why we only monitor the @PostPersist entity life cycle phase in the EntityListener of CaseSupervision (see Listing 2).

Drools's support for sliding windows makes events especially powerful for temporal reasoning. A sliding window is a way to scope events of interest as if they belong to a window that is constantly moving. The two most common sliding-window implementations are time-based windows and length-based windows.

In the sample rules shown in Listing 6, over window:time(30d) specifies that only CaseSupervision events created in the last 30 days are evaluated by the rule engine. Once 30 days have passed, the immutable events will never enter the window again, so Drools automatically retracts them from working memory and re-evaluates the rules accordingly. Because events are immutable, Drools can manage the event life cycle automatically. Events are thus more memory-efficient than facts. (Note, however, that you must set the event-processing mode to STREAM in a Drools-Spring configuration; otherwise temporal operators like sliding windows will not work.)
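As a rough mental model of a time-based window, consider the following sketch. SlidingTimeWindow is a hypothetical class written for this illustration; it is not Drools code, it assumes events arrive in timestamp order, and its boundary handling (an event exactly 30 days old still counts) is an assumption rather than a statement about Drools semantics.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical, simplified model of a time-based sliding window.
class SlidingTimeWindow {
  private final Duration length;
  private final Deque<Instant> timestamps = new ArrayDeque<>();

  SlidingTimeWindow(Duration length) { this.length = length; }

  // Events are assumed to arrive in timestamp order.
  void add(Instant eventTime) { timestamps.addLast(eventTime); }

  // Drops events that have slid out of the window as of 'now'.
  void expire(Instant now) {
    Instant cutoff = now.minus(length);
    while (!timestamps.isEmpty() && timestamps.peekFirst().isBefore(cutoff)) {
      timestamps.removeFirst();
    }
  }

  // True when no event falls inside the window, that is, a reminder is due.
  boolean isEmpty(Instant now) {
    expire(now);
    return timestamps.isEmpty();
  }
}

class SlidingTimeWindowDemo {
  public static void main(String[] args) {
    Instant start = Instant.parse("2012-06-01T00:00:00Z");
    SlidingTimeWindow window = new SlidingTimeWindow(Duration.ofDays(30));
    window.add(start); // one supervision event
    System.out.println(window.isEmpty(start.plus(Duration.ofDays(10)))); // false
    System.out.println(window.isEmpty(start.plus(Duration.ofDays(31)))); // true
  }
}
```

Expired events are discarded for good because they can never re-enter the window, which is the property that lets a rule engine reclaim their memory.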

Working with declared types

Something else to note in Listing 6 is that the MemberCase fact (which is not event-typed) is also evaluated against a time constraint, as we only evaluate cases created more than 30 days previously. A case may be 29 days old today, but 30 days old tomorrow, which implies the Case Supervision rule must be re-evaluated at the beginning of each day. Sadly, Drools doesn't offer a sliding "today" variable. So, as a workaround, I added an event type called Today; this is a Drools declared type, or a data construct declared in the rule language rather than in Java code.

This special event type declares no explicit attributes at all, except an implicit @timestamp metadata tag, which is auto-populated at the moment a Today event is asserted into working memory. Another metadata tag, @expires(24h), specifies that a Today event expires 24 hours after assertion.

To reset Today at the beginning of each day, I also added a timer on top of the Set Today rule. This rule is activated and fired at the beginning of each day to insert a fresh Today event, which replaces the one that just expired. Subsequently, the fresh Today event triggers the re-evaluation of the Case Supervision rule. Note, too, that without fact changes in a rule's conditions, the timer by itself cannot trigger a re-evaluation of the rule. The timer does not re-evaluate functions or inline evals either, because Drools treats the return values of these constructs as time-constant and caches them.
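The reason a daily refresh is needed can be shown with simple date arithmetic. In the sketch below, CaseAge is a hypothetical helper, and it assumes the rule's date test means "started at least 30 days before today." The case itself never changes; only the reference date does.

```java
import java.time.LocalDate;
import java.time.temporal.ChronoUnit;

// Hypothetical helper that mirrors the date test in the Case Supervision rule.
class CaseAge {
  // True when 'start' lies at least 'days' days before 'today'.
  static boolean atLeastDaysOld(LocalDate start, LocalDate today, long days) {
    return ChronoUnit.DAYS.between(start, today) >= days;
  }

  public static void main(String[] args) {
    LocalDate start = LocalDate.of(2012, 6, 1);
    LocalDate today = start.plusDays(29);
    System.out.println(atLeastDaysOld(start, today, 30));             // false
    System.out.println(atLeastDaysOld(start, today.plusDays(1), 30)); // true
  }
}
```

Because the outcome flips without any change to the MemberCase fact, something else in the rule's conditions has to change each day, and that is the role the fresh Today event plays.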

When to use facts versus events

Understanding the difference between facts and events helps us more easily decide when to use each type:

  • Use events when the data represents an immutable snapshot of the system state at a point in time or over a duration, when it is time sensitive and will quickly expire, or when the amount of data is predicted to grow rapidly and continuously.
  • Use facts when the data is more vital to the business domain, and when the data will experience ongoing changes that require constant re-evaluation of the rules.

Drools queries

Our next step is to extract the rule-execution results, which we do by querying the facts in working memory. (An alternative approach would be to have the rule engine pass the results to the application by calling methods on globals from the right-hand side of the rules.) In this example, fact assertions and rule firing all happen instantaneously without any delay, ensuring that our queries in Listing 6 will return real-time reports. Because the TransientReminder facts are logically asserted, the rule engine will automatically retract them from working memory when their conditions are no longer met.

Say that a reminder was generated on a specific case by the rule engine this morning. Subsequently, we executed the query "CaseReminderQuery" in Java code as shown in Listing 3, so that a reminder was returned and displayed to all the clinicians in the system. If in the afternoon a clinician completed an evaluation on the case and generated a new case-supervision record, this event would break the conditions for the reminder fact. Drools would then automatically retract it. We could confirm that the reminder fact was gone by running the same query immediately after the case evaluation was complete. Logical assertion keeps your reasoning results up-to-date and the rule engine running in a memory-efficient mode, much like events do.

The logical fact counter

Note that each logically asserted fact is accompanied by a counter, which is incremented every time an equal fact is asserted. Whenever one of the rules that asserted the equal fact no longer holds, the counter is decremented. When the counter reaches zero, the fact is automatically retracted.
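The counting behavior can be modeled in a few lines of plain Java. LogicalFactStore and its method names are hypothetical and written only for this illustration; they are not Drools classes and make no claim about how Drools implements logical assertions internally.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical model of counted logical assertions; not Drools' implementation.
class LogicalFactStore<T> {
  private final Map<T, Integer> counters = new HashMap<>();

  // Called when a rule logically asserts a fact equal to 'fact'.
  void insertLogical(T fact) {
    counters.merge(fact, 1, Integer::sum);
  }

  // Called when one rule that supported the fact no longer holds.
  void withdrawSupport(T fact) {
    // Returning null from the remapping function removes the entry,
    // which models the automatic retraction at a count of zero.
    counters.computeIfPresent(fact, (key, count) -> count > 1 ? count - 1 : null);
  }

  boolean contains(T fact) { return counters.containsKey(fact); }

  public static void main(String[] args) {
    LogicalFactStore<String> store = new LogicalFactStore<>();
    store.insertLogical("reminder"); // asserted by one rule
    store.insertLogical("reminder"); // an equal fact asserted by another rule
    store.withdrawSupport("reminder");
    System.out.println(store.contains("reminder")); // true: one supporter left
    store.withdrawSupport("reminder");
    System.out.println(store.contains("reminder")); // false: retracted
  }
}
```

Equality of facts drives the counter, which is why TransientReminder in Listing 7 overrides equals() and hashCode().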

Live queries put the icing on the cake. A live query stays open, creating a view of query results and publishing change events for the contents of the given view. That means a live query needs to run exactly once, and the resulting view is automatically updated with the ongoing changes published by the rule engine.

So far, you've seen that with just a little background in Drools, JPA, and Spring, it's not difficult to implement a continuous, real-time data profiling application. We'll conclude with some advanced programming steps that will harden our case management solution.

Advanced Drools programming

Managing relationships

An interesting constraint of FactHandle is that it is tied only to the current fact, not to the fact's nested relationships. When we call getKsession().update(factHandle, memberCase), Drools is informed about changes made to MemberCase's id (though this never happens, because the primary key is immutable), startDtm, or endDtm. It is not informed about changes to the member and caseSupervisions properties by the same call, however.

Likewise, EntityListeners in JPA are not notified about changes to one-to-many and many-to-many relationships, because the foreign key resides in the related table or the link table.

In order to connect to these relationships as updated facts, we could build recursive logic to get the FactHandle of each nested relationship. A better solution would be to place EntityListeners on all the entities, including link tables, that are engaged in the rule's conditions. We did this with Member and CaseSupervision, where changes are handled by each entity's own EntityListener and FactHandle (see Listing 2 and Listing 3).
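The per-entity approach can be sketched with a toy session in plain Java. Everything here is hypothetical and simplified for illustration: ToySession is not the Drools session, the integer handle stands in for a FactHandle, and SupervisionListener imitates a JPA entity listener without the real annotations. The point is only that the nested entity reports its own changes through its own handle.

```java
import java.util.ArrayList;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Map;

/** Toy session: one handle per inserted object; nested objects are not tracked implicitly. */
class ToySession {
    private final Map<Object, Integer> handles = new IdentityHashMap<>();
    final List<Object> reevaluated = new ArrayList<>();
    private int nextHandle = 0;

    int insert(Object fact) {
        handles.put(fact, nextHandle);
        return nextHandle++;
    }

    /** Re-evaluates only the fact that owns the handle, never its relationships. */
    void update(Object fact) {
        if (!handles.containsKey(fact)) {
            throw new IllegalStateException("not a fact in this session");
        }
        reevaluated.add(fact);
    }
}

/** Simplified stand-ins for the CaseSupervision and MemberCase entities. */
class SupervisionModel {
    String note;
}

class MemberCaseModel {
    final List<SupervisionModel> supervisions = new ArrayList<>();
}

/** Hypothetical listener: the nested entity pushes its own life cycle events. */
class SupervisionListener {
    private final ToySession session;

    SupervisionListener(ToySession session) {
        this.session = session;
    }

    void postPersist(SupervisionModel s) {
        session.insert(s);
    }

    void postUpdate(SupervisionModel s) {
        session.update(s);
    }
}
```

With a listener per entity, no recursive walk over the case's relationships is needed; each change reaches the session through the entity that actually changed.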

Entity lazy loading during rule evaluation

Unless we have specified a knowledge-base partition (that is, parallel processing), rules are evaluated in the same thread where ksession.insert(), ksession.update(), or ksession.retract() is invoked. The fact assertions in Listing 2 and Listing 3 both occur in a transaction context, where a transaction-scoped JPA persistence context (Hibernate session) is available. This allows the rule engine to evaluate across lazy-loaded entity relationships. If a knowledge-base partition is enabled, we have to configure the entity relationships as eagerly loaded to prevent a Hibernate LazyInitializationException.

Enabling transactions

By default, Drools doesn't support transactions because it doesn't keep any historical snapshots of the data in working memory. This is an issue for our EntityListeners because life cycle callback methods are invoked after a database flush but before the transaction commit. What if a transaction were rolled back? In that case the entities in the JPA persistence context would become detached and inconsistent with the rows in the database tables, and so would the facts in working memory. The rule engine's reasoning results would no longer be trustworthy.

Enabling transactions will make our case management system bullet-proof, by ensuring that the data in working memory and the application database are always synchronized, and that rule-reasoning results are always accurate. In Drools, with JPA and JTA implementations in place and a "drools-jpa-persistence" package in the classpath, a JPAKnowledgeService (see Resources) can be configured to create our stateful knowledge session. The entire stateful knowledge session with process instances, variables, and fact objects is mapped as the binary column of a row in the table "SessionInfo" with ksessionId as the primary key.

When we specify transaction boundaries in our application code through annotations or XML, the application-initiated transaction propagates to the rule engine. Whenever a transaction rollback occurs, the stateful knowledge session is restored to the previous state saved in the database. That maintains the consistency and integration between the application database and the Drools database. The singleton stateful knowledge session in memory should behave like REPEATABLE READ when accessed simultaneously from multiple JTA transactions; otherwise, the single SessionInfo entity instance could have blended state changes made from different transactions, breaking transaction demarcation. Note that as of this writing it is unconfirmed whether REPEATABLE READ is implemented by the transaction manager of the drools-jpa-persistence package.
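The restore-on-rollback behavior can be illustrated with a toy session that saves its whole state as one binary blob, much as the session is stored in a binary column. This is a minimal sketch using standard Java serialization; SnapshotSession and its methods are hypothetical and say nothing about how Drools actually marshals a session or talks to JTA.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

/** Toy session whose state is snapshotted on commit and reloaded on rollback. */
class SnapshotSession {
    private ArrayList<String> facts = new ArrayList<>();
    private byte[] committed; // stands in for the binary column of the persisted session row

    SnapshotSession() {
        commit(); // the empty session is the first restore point
    }

    void insert(String fact) {
        facts.add(fact);
    }

    List<String> facts() {
        return new ArrayList<>(facts);
    }

    /** On commit, serialize the current state as the new restore point. */
    void commit() {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(facts);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        committed = bytes.toByteArray();
    }

    /** On rollback, discard in-memory changes and reload the last committed state. */
    @SuppressWarnings("unchecked")
    void rollback() {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(committed))) {
            facts = (ArrayList<String>) in.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

A fact inserted after the last commit vanishes on rollback, so working memory ends up matching what the database transaction left behind.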

Clustering

If our application were to run in a clustered environment, the previously described approach would quickly fail. Each instance of the embedded rule engine would receive only the entity events taking place on its own node, causing the working memories on different nodes to fall out of sync. We could remedy this problem with a universal remote Drools server (see Resources). Entity listeners on different nodes would publish all their events to the centralized Drools server through REST/SOAP web services, and the application could then subscribe to the reasoning results from the Drools server. Note that the Apache CXF implementation of SOAP in Drools server doesn't currently support WS-Transaction. I hope that it will be available soon, given the obligatory transaction requirements outlined for this real-world use case.

In conclusion

In this article you have had the opportunity to bring together some of what you already knew about POJO programming in Spring and JPA with some new features available in Drools 5. I have demonstrated how to make smart use of EntityListeners, a global Drools session, and the fireUntilHalt() method to develop a POJO-based continuous, real-time data profiling application. You've learned core Drools concepts, such as treating subjects as facts versus events and writing logical assertions, as well as more advanced topics such as transaction management and extending a Drools implementation into a clustered environment. Please see the application source code to learn more about Drools 5.


Download

Description                    Name               Size
Sample code for this article   j-drools5-src.zip  5KB

Resources

Learn

Get products and technologies

  • Download Drools 5: Drools Expert and Drools Fusion, used in this article, implement the rule engine and CEP framework.

Discuss

  • Get involved in the My developerWorks community. Connect with other developerWorks users while exploring the developer-driven blogs, forums, groups, and wikis.
