Level: Advanced
Srinivasan Muralidharan (muralisr@us.ibm.com), Advisory Engineer, IBM
David Z. Maze (dmaze@us.ibm.com), Senior Engineer, IBM
08 May 2008

The IBM® WebSphere® DataPower® SOA Appliances multistep processing policy system
is a key part of appliance configuration. Version 3.6.1 of the firmware includes a
number of enhancements to multistep that provide functionality familiar to
programmers, including loops of actions, conditional execution of actions, and the
ability to execute actions in parallel. Explore how you can combine the new
features in multistep 3 to build an RSS feed aggregator.
Overview of the RSS feed aggregator service (RFAS)
RFAS uses multistep 3 features to aggregate news feeds and, optionally, filter
the output using a given search criterion. The main features of the service are:
- RSS sources that can be dynamically provided as input.
- Search criteria that can be provided as input.
- Feeds that are obtained in parallel.
This service illustrates the following features of the programming model
introduced by multistep 3:
- Conditional action
- For-each action
- Event-sink action
- Parallel execution using asynchronous actions
- Results action fanning out to multiple URLs asynchronously
The XML firewall service consists of four rules, as shown in Figure 1.
Figure 1. Rules in the RFAS
PipeURLsinRequestRule is the main rule. It receives input of the form shown in
Listing 1 via an HTTP POST.
Listing 1. URLs sent to the service
<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/"
    xmlns:tns= "http://developerworks.ibm.com/ /ms3demo">
  <soap:Body>
    <tns:keyword>some search text</tns:keyword>
    <tns:sheets>
      <tns:url>some URL</tns:url>
      <tns:url>some URL</tns:url>
    </tns:sheets>
  </soap:Body>
</soap:Envelope>
Both the keyword and url
elements are optional. If url elements are provided,
PipesRunDynamicURLs is executed. Otherwise, the service executes the
PipeFixedFeeds rule to get RSS feeds from two fixed sources. If
keyword is provided, the ProcessSearchParam rule is
executed to filter items retrieved from the news feed. In both cases, the choice
of which rule to execute is determined by conditional actions executing XPath
expressions to look for url elements and
keyword elements, respectively.
PipesRunDynamicURLs uses a for-each action to collect all the URLs into a context
variable. Then a results action uses this variable to fetch the feeds from those
URLs in parallel.
PipeFixedFeeds contains two results actions, each asynchronously fetching a
single RSS feed. An event-sink action waits for the two results actions to
complete.
The rest of this article details the implementation. In particular you'll learn
about some interesting aspects of multistep 3 that might not be obvious at a quick
glance.
Executing the main PipeURLsinRequestRule rule
Note - Conditional actions and call rules
A conditional action can execute any action, including another conditional action.
Executing a Call-Rule from a conditional action is a powerful divide-and-conquer
approach to programming a service. When a Call-Rule completes, execution proceeds
to the next action in the calling rule. The execution contexts in the called rule
are available in the main rule after the invocation.
We will follow the implementation in the most natural way: by tracing the flow of
input as it progresses through the service.
Look for the url element in input -- PipeURLsinRequestRule_conditional_1
This conditional action executes one of two actions based on the evaluation of
the XPath expression
/*[local-name()='Envelope']/*[local-name()='Body']/*[local-name()='sheets']/*[local-name()='url' and count(.) > 0].
If that XPath expression matches, then the SOAP body element contains at least one
URL reference, and a call action invokes the rule PipesRunDynamicURLs. Otherwise,
the call action invokes PipeFixedFeeds.
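The decision this conditional action makes can be sketched outside the appliance. The following Python is only an illustration of the logic, not DataPower configuration: the function names are hypothetical, and it compares local names the way the XPath expression does, so the namespace prefix used by the caller doesn't matter.

```python
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip the '{namespace}' prefix that ElementTree puts on tags."""
    return tag.rsplit("}", 1)[-1]


def children_named(element, name):
    """Return child elements whose local name matches, ignoring namespaces."""
    return [child for child in element if local_name(child.tag) == name]


def choose_feed_rule(request_xml):
    """Return the name of the rule the conditional action would call."""
    root = ET.fromstring(request_xml)
    if local_name(root.tag) != "Envelope":
        # The XPath expression cannot match, so the fallback rule runs.
        return "PipeFixedFeeds"
    urls = [
        url
        for body in children_named(root, "Body")
        for sheets in children_named(body, "sheets")
        for url in children_named(sheets, "url")
    ]
    return "PipesRunDynamicURLs" if urls else "PipeFixedFeeds"
```

Calling choose_feed_rule with a request that carries at least one url element under sheets returns "PipesRunDynamicURLs"; any other request falls through to "PipeFixedFeeds".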
Aggregate the results from the feeds -- PipeURLsinRequestRule_xform_1
The outputs from all the RSS feeds (either from the rule PipesRunDynamicURLs or
PipeFixedFeeds) are stored in contexts pipesout_1, pipesout_2, and so on. The
context variable var://context/loop/count contains the total number of these
contexts. The PipeURLsinRequestRule_xform_1 transform executes consol.xsl, which
loops over the pipesout_x variables and aggregates their contents into the
context mergedout.
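The stylesheet consol.xsl is not reproduced in this article, so the following Python is only a sketch of the aggregation step it is described as performing. It assumes the contexts can be modeled as a dictionary keyed by name and that each pipesout_x context holds a list of feed items; the function name is hypothetical.

```python
def merge_contexts(contexts):
    """Combine pipesout_1..pipesout_N into a single 'mergedout' context.

    N is read from var://context/loop/count, mirroring the description above.
    """
    count = int(contexts["var://context/loop/count"])
    merged = []
    for index in range(1, count + 1):
        # A missing context contributes nothing rather than failing the merge.
        merged.extend(contexts.get(f"pipesout_{index}", []))
    contexts["mergedout"] = merged
    return merged
```

Because the merge relies only on the count variable and the naming scheme, it works the same way whichever rule produced the pipesout_x contexts.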
Get the search condition -- PipeURLsinRequestRule_filter_1
If a keyword was provided in the input, this filter action sets up a variable to
hold the XPath expression with which to search over the aggregated output.
Note - Variables across rules
This is an example of where a variable can be set in one rule but used in another.
The XPath expression is used in the ProcessSearchParam rule.
To search or not to search -- PipeURLsinRequestRule_conditional_0
This conditional action evaluates the XPath expression
/*[local-name()='Envelope']/*[local-name()='Body']/*[local-name()='keyword' and . != '']
to determine if the aggregated results need to be further filtered using the
keyword provided in the input. If a keyword was provided, the rule
ProcessSearchParam is executed. Otherwise, the catch-all XPath expression
/* evaluates to true and
executes the transform PipeURLsinRequestRule_conditional_0_refaction_xform_0,
which merely adds a header to the aggregated output.
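The catch-all branch suggests that the conditional action tries its match expressions in order and runs the action for the first one that matches. The Python below is a minimal sketch under that assumption. The request is modeled as a plain dictionary, and every function name here is hypothetical rather than part of the appliance configuration.

```python
def run_conditional(branches, request):
    """Run the action of the first branch whose match test succeeds."""
    for matches, action in branches:
        if matches(request):
            return action(request)
    return None  # no branch matched


def has_keyword(request):
    """Stand-in for the XPath test that keyword exists and is not empty."""
    return request.get("keyword", "") != ""


def catch_all(request):
    """Stand-in for the /* expression, which always matches."""
    return True


def process_search_param(request):
    return "ProcessSearchParam"


def add_header_only(request):
    return "PipeURLsinRequestRule_conditional_0_refaction_xform_0"


# Order matters: the specific test comes first, the catch-all last.
SEARCH_BRANCHES = [
    (has_keyword, process_search_param),
    (catch_all, add_header_only),
]
```

Placing the catch-all last gives the conditional action an "else" branch without needing a separate construct for it.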
Executing the PipesRunDynamicURLs rule to process RSS feed URLs in the input
Note - Scope of for-each iterator variables
The for-each iterator variables are only valid in actions run from within a
for-each action; hence, geturls.xsl stores the loop count in a global variable for
later use. If a for-each action executed a call action, the iterator variables
would be visible to all actions within the called rule. If loop actions are
nested, the iterator variables return the state for the innermost loop.
This rule illustrates how multistep 3 allows dynamic asynchronous fetching of an
arbitrary number of URLs. While this example only uses HTTP as a wire protocol,
this functionality can be extended to any other protocol supported by the
appliance firmware.
Collect URLs from input -- PipesRunDynamicURLs_for-each_0
This for-each action executes the transform
PipesRunDynamicURLs_for-each_0_refaction_xform, which executes geturls.xsl to
gather the url elements from input. The action loops
over the node set selected by the XPath expression
/*[local-name()='Envelope']/*[local-name()='Body']/*[local-name()='sheets']/*[local-name()='url'].
Two loop variables are associated with each for-each iteration. The
var://service/multistep/loop-iterator variable contains the current node from that
node set; var://service/multistep/loop-count contains the current iteration count,
starting at 1. Note the use of the two iterator variables in geturls.xsl.
Figure 2. For-each action
PipesRunDynamicURLs_for-each_0_refaction_xform uses the same context (tmpout) for
input and output. When run in a loop by the for-each action, this provides an easy
means to collect all data into a single context.
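The accumulate-in-place pattern can be illustrated in a few lines of Python. This is a sketch of the idea rather than a rendering of geturls.xsl, which the article does not list. The function names are hypothetical, contexts are modeled as a dictionary, and writing the count to var://context/loop/count is an assumption based on how that variable is used later.

```python
def for_each(nodes, action, contexts):
    """Run 'action' once per node, passing the two loop variables."""
    for loop_count, loop_iterator in enumerate(nodes, start=1):
        action(loop_iterator, loop_count, contexts)


def collect_url(loop_iterator, loop_count, contexts):
    """Append the current node to 'tmpout', which is both input and output."""
    contexts.setdefault("tmpout", []).append(loop_iterator)
    # The loop variables disappear when the loop ends, so the count is
    # copied to a variable that outlives the for-each action.
    contexts["var://context/loop/count"] = loop_count
```

After the loop finishes, tmpout holds every URL in document order, and the saved count equals the number of iterations.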
Prepare URLs for execution -- PipesRunDynamicURLs_xform_12
The PipesRunDynamicURLs_xform_12 transform prepares the URLs for execution in a
results action. It wraps the url elements in a results element, producing the
structure shown in Listing 2.
Listing 2. URLs for results action
<results>
  <url>url_1</url>
  <url>url_2</url>
  ….
  …
</results>
PipesRunDynamicURLs_xform_12 completes processing by assigning the structure in
Listing 2 to var://context/urlvar/urls.
Listing 2 is an example of the general form shown in Listing 3, which can be used
to override attributes for individual URLs.
Listing 3. The general <results> form
<results mode="require-all" transactional="true" retry-interval="100"
    asynchronous="true" multiple-outputs="true" >
  <url input="'var://context/someinputcontext'" retry-count="2"
      asynchronous="true" >url_1</url>
  ….
</results>
The attributes input,
retry-count, and
asynchronous on a url
element override the common value in the results action that applies to all the
URLs. The attributes on the results element override
corresponding attributes in the results action.
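The layering of settings described above can be sketched as a simple merge. The Python below is illustrative only: the function name and the shape of the defaults dictionary are hypothetical, and it assumes that an attribute on a url element also takes precedence over the same attribute on the results element.

```python
import xml.etree.ElementTree as ET


def effective_settings(action_defaults, results_xml):
    """Resolve the settings that apply to each URL in a <results> document.

    Precedence, lowest to highest: values configured on the results action,
    attributes on <results>, attributes on the individual <url>.
    """
    results = ET.fromstring(results_xml)
    shared = {**action_defaults, **results.attrib}
    return [
        {"url": (url.text or "").strip(), **shared, **url.attrib}
        for url in results.findall("url")
    ]
```

For a document in which results sets retry-interval="100" and only the first url sets retry-count="2", the first URL gets both values, while the second URL keeps the retry count configured on the action.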
Execute the URLs -- PipesRunDynamicURLs_results_4
Note - Asynchronous results action: a mechanism for fan out over arbitrary URLs
The results action used this way is very efficient: Data is posted to the URLs in
parallel, thus maximizing the throughput of the batch execution. This approach to
collecting URLs and executing them once can be a pattern for implementing fan outs
of different kinds. Other options, such as the Multi-Way Results Mode, provide
ways to implement such flexible patterns.
The results action has been revamped with many new features. This section covers
two of them: the Multi-Way Results Mode and Use Multiple Outputs options. The
results action has always supported passing a node set containing a list of URLs
in a context variable, but prior to version 3.6.1 this could only send the same
data to multiple destinations serially.
In version 3.6.1, an XML syntax adds options to send different data to each
target, and the network transactions are executed in parallel. The Multi-Way
Results Mode option controls how failures are handled. The default, Require All,
sends data to all targets in parallel and fails if any one of those targets is
unreachable. Use Multiple Outputs is a toggle switch that, when on, creates a
separate output context for each target URL with the provided name suffixed with a
distinct number, starting at 1 for the first URL.
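As a rough analogy for this behavior, the Python sketch below fetches a list of URLs in parallel, writes each response to a numbered output, and fails the whole batch if any fetch fails. It is not how the appliance is implemented. The fan_out name is hypothetical, and the fetch callable is supplied by the caller so that the example stays self-contained.

```python
from concurrent.futures import ThreadPoolExecutor


def fan_out(urls, fetch, output_name="pipesout"):
    """Fetch every URL in parallel, producing one numbered output per URL."""
    contexts = {}
    with ThreadPoolExecutor(max_workers=max(1, len(urls))) as pool:
        # All requests are submitted before any result is read, so the
        # fetches overlap instead of running one after another.
        futures = [pool.submit(fetch, url) for url in urls]
        for index, future in enumerate(futures, start=1):
            # result() re-raises an error from fetch, so a single failure
            # fails the batch, comparable to the Require All mode.
            contexts[f"{output_name}_{index}"] = future.result()
    return contexts
```

With two URLs and an output name of pipesout, the returned dictionary has the keys pipesout_1 and pipesout_2, in the same order as the input URLs.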
Note - Use common output context names: a neat way to help with modular development
The action that aggregates results from RSS feeds (see Aggregate the results from
the feeds -- PipeURLsinRequestRule_xform_1) is unaware of whether the feeds are
obtained dynamically or from fixed feeds. It expects the total number of feeds to
be in var://context/loop/count and the pipesout_x contexts to contain the results.
In the dynamic URL execution path, the results action with Use Multiple Outputs
turned on creates multiple pipesout_x contexts. To mimic that behavior, the
results actions in the fixed feeds rule have their outputs set to pipesout_1 and
pipesout_2, respectively. As for the loop count variable, the first set variable
action sets it to 2. Also note that these variables are available in the called
rule.

Note - Complex branching logic
An asynchronous action followed by an event-sink action is a frequently used
pattern for implementing the complex processing logic found in process flow
languages such as Business Process Execution Language (BPEL). From the Web GUI,
users can't make an event-sink action wait for an action from a different rule.
This limitation can be overcome by adding the appropriate statements directly to
the configuration, for example through the command-line interface (CLI), so that
event-sink actions wait for asynchronous actions from other rules to complete.
This allows the implementation of complex branching logic. The implementer needs
to know the internal name of the action (readily viewable in the CLI) and must
make sure that the processing orchestration doesn't leave the appliance waiting
until the event-sink timeout for an action that never starts.
Figure 3. Options in the results action available in multistep 3
Executing rule PipesFixedFeeds to process preconfigured RSS feeds
The PipesFixedFeeds rule illustrates parallel execution using asynchronous actions
(that is, actions with the asynchronous flag on) and the event-sink action. Prior
to version 3.6.1, there was only one asynchronous action, results-async. Multistep
3 allows any action to be marked as asynchronous, lets the rule as a whole
complete without blocking on outstanding asynchronous actions, and provides a new
event-sink action, which waits for a list of named outstanding asynchronous
actions.
Figure 4. Event-sink on two asynchronous results actions
Each results action executes a single URL specified in the Destination field.
Unlike the Multi-Way Results Mode action examined in the previous section, these
results actions take in the URL directly instead of naming a context variable
containing URLs.
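The pairing of asynchronous actions with an event-sink resembles submitting tasks and then waiting on a named set of them. The Python below is a loose analogy and not appliance code. The function name, the timeout value, and the fetch callable are assumptions for illustration, while the output names and the loop count of 2 follow the description of this rule.

```python
from concurrent.futures import ThreadPoolExecutor, wait


def run_fixed_feeds(fetch, feed_urls, timeout=5.0):
    """Start one asynchronous fetch per feed, then wait for all of them."""
    contexts = {}
    with ThreadPoolExecutor(max_workers=max(1, len(feed_urls))) as pool:
        # Each submit() plays the role of a results action marked asynchronous.
        pending = {
            f"pipesout_{index}": pool.submit(fetch, url)
            for index, url in enumerate(feed_urls, start=1)
        }
        # wait() plays the role of the event-sink: it blocks until the named
        # actions complete or the timeout expires.
        done, not_done = wait(pending.values(), timeout=timeout)
        if not_done:
            raise TimeoutError("event-sink timed out waiting for feeds")
        for name, future in pending.items():
            contexts[name] = future.result()
    # The rule sets the loop count itself, because no for-each action runs.
    contexts["var://context/loop/count"] = len(feed_urls)
    return contexts
```

The dictionary this returns has the same shape as the output of the dynamic path, which is what lets a single aggregation step serve both rules.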
Executing the ProcessSearchParam rule to process filter-aggregated feeds
Note - Use dynamically constructed XPath expressions
Unlike the previous example of for-each action, the XPath field points to the
context variable var://context/search/var instead of a hard-coded expression. This
is because the XPath expression has to be constructed dynamically using the input
keyword element (see Get the search condition -- PipeURLsinRequestRule_filter_1).
The combination of dynamically constructing an XPath expression along with
executing complex actions under conditional or for-each actions provides a
powerful development tool to minimize XSLT programming.
This last rule is also executed conditionally from PipeURLsinRequestRule. It uses
the keyword input to filter the aggregated news feeds.
Filter using supplied keyword -- ProcessSearchParam_for-each_0
This for-each action collects all the news items that contain the supplied
keyword in the title element of the news feeds. As in the previous example of the
for-each action (see Collect URLs from input -- PipesRunDynamicURLs_for-each_0),
the same Input and Output contexts are used along with the loop variables
var://service/multistep/loop-iterator and var://service/multistep/loop-count to
collect matching news items into a single output context.
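The effect of this filtering step can be approximated in Python. The article does not show the XPath expression that is built from the keyword, so the sketch below simply assumes a case-sensitive substring match on each item's title. The function name and the shape of the merged document are hypothetical.

```python
import xml.etree.ElementTree as ET


def filter_items(merged_xml, keyword):
    """Return the <item> elements whose <title> contains the keyword."""
    root = ET.fromstring(merged_xml)
    matches = []
    for item in root.iter("item"):
        # Items without a title are treated as having an empty title.
        title = item.findtext("title", default="")
        if keyword in title:
            matches.append(item)
    return matches
```

Applied to a merged document with the titles "DataPower news" and "Other story", a keyword of "DataPower" keeps only the first item.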
Conclusion
The enhanced multistep 3 functionality in the WebSphere DataPower SOA Appliances
V3.6.1 firmware introduces new programming verbs (for-each, conditional, and
event-sink) that bring it closer to process-oriented languages like BPEL. These
make WebSphere DataPower SOA Appliances development more modular and less
XSLT-intensive. The RSS aggregator described in this article uses simple
stylesheets, each less than 30 lines, including standard declarations. In addition
to these language enhancements, running actions in parallel through asynchronous
behavior can improve performance. Most importantly, you can use all these features
seamlessly as building blocks for complex applications.
Download
| Description | Name | Size | Download method |
|---|---|---|---|
| DataPower config file (see note) | ms3dpconfig.zip | 370KB | HTTP |
Note - The file ms3dpconfig.zip contains the WebSphere DataPower SOA Appliances
configuration for the RSS feed aggregator sample. Import the file into a WebSphere
DataPower SOA Appliances device to use the example.
About the authors
Srinivasan Muralidharan is a developer at IBM WebSphere Technology Institute. He
is interested in all aspects of SOA, in particular middleware integration, ESB
technologies, and performance.
David Maze is an engineer in the IBM WebSphere DataPower XML Technology group.
His major WebSphere DataPower SOA Appliances projects have included rewriting the
XML Schema validation engine, software integration for the XG4 XML accelerator,
and work on the WebSphere DataPower SOA Appliances rule-execution engine.