Tracing, logging, and error handling in mediation modules using IBM Integration Designer, WebSphere ESB, WebSphere Process Server, and Business Process Manager Advanced Edition, Part 2

Strategies for designing error handling in web service mediations

This article describes different error handling techniques in mediation modules created with IBM® Integration Designer V7.5. The runtime capabilities apply equally to mediations running in WebSphere® Enterprise Service Bus, WebSphere Process Server, and Business Process Manager Advanced Edition V7.5.

Frank I. Toth (ftoth@us.ibm.com), Software Engineer, IBM

Frank Toth has 4 years of experience as a Software Developer at IBM and 6 years in customer support. His positions include resolving software-related problems reported by customers using WebSphere Integration Developer.



Dave Screen (dave.screen@uk.ibm.com), Software Engineer, IBM

Dave Screen is a Software Engineer and currently works as a BPM Consultant in IBM Software Services for WebSphere (ISSW) at the IBM Hursley Software Lab in the United Kingdom. He has 9 years of experience that includes areas in WebSphere, security, transactions, Java runtimes, and Eclipse.



Callum Jackson (callumj@uk.ibm.com), Software Engineer, IBM

Callum Jackson is a Software Engineer on the WebSphere ESB team at the IBM Hursley Software Lab in the United Kingdom. He has worked on the WebSphere ESB team since 2005, and before that he worked in Software Services on SOA applications for the telecommunications industry.



24 August 2011

Introduction

A mediation module is created with IBM Integration Designer V7.5 (hereafter called Integration Designer) and deployed on WebSphere Enterprise Service Bus (ESB) or Business Process Manager (BPM) V7.5 Advanced Edition. WebSphere ESB supports the transformation and routing capabilities required from enterprise services. ESB solutions often perform message and protocol transformations to adhere to the interface definitions defined by the service providers and clients.

Part 1 of this article defined the available mediation primitives (hereafter referred to as primitives and named with uppercase, such as “Trace primitive”) and runtime features, and which are best suited to logging or tracing. In some cases, there is a grey area because you can use a primitive for either purpose.

This article describes different ways to handle errors in mediation flows. With the exception of the new Error flow capability in V7.5, this article applies equally to version 7 of WebSphere Integration Developer, WebSphere ESB, and WebSphere Process Server. Mediation flow components inside modules and mediation modules can also be deployed in BPM Advanced Edition.

When a mediation flow is implementing logic or calling existing providers, there are choices about how to handle failures. These failures can be “modeled”, for example, declared faults on the WSDL interfaces, or “unmodeled”, for example, system failures. This article explores various ways to handle errors in the mediation flow, enabling the right design and configuration.

When a technique is considered a leading or common practice, it is shown in a sidebar. These recommendations do not fit every situation and most of the article describes what is possible so that you can choose appropriately for your needs.


Requestors, mediations, and providers

Error handling by mediation components can typically involve transforming the provider (commonly referred to as “backend” or “service provider”) error messages into well-defined messages defined in the context of the business domain. The mediation exposes a service to the requestor as depicted in Figure 1. These components can also handle provider system errors (for example, network unavailable) and provide more simplified error information to the requestor (or “client”). For terminology and other ESB discussion, see this article, The Enterprise Service Bus, re-examined.

Figure 1. Mediations typically sit between a client and an existing service provider

Each mediation module can contain a number of mediation flow components. Each mediation flow component can implement a number of operations defined on the WSDL interfaces. Each implementation of an operation is called a mediation flow. During execution of a mediation flow (a “mediation flow instance” or “flow” for short), different types of failures can occur.


Synchronous web services scenario

The focus of this article is synchronous scenarios, such as web services integrations (see Common patterns of usage for error handling). This is where both the Export and Import in Figure 2 use a web service binding.

Figure 2. Mediation flow component connected to the outside world with Imports and Exports

For this article, we will use the term provider when talking about what is being invoked through an Import (Figure 2). We will use the term mediation to describe the application that is created by implementing a mediation flow inside a mediation flow component. The Web Service Export is exposing a physical web service endpoint for the requestors.

The examples of messages and behavior are specific to the web service binding, for example the use of SOAP as the payload format. That said, most of the patterns for designing mediation flows described here apply regardless of the choice of binding (for example, SCA) or synchronicity. This includes error handling sub-flows, retry of invocations, primitives, and modeling faults.

The Resources section contains a number of references to material discussing asynchronous scenarios, which include messaging integrations. Although the design of error handling inside mediation flows is usually the same, the difference is often what happens when a flow fails. For an asynchronous invocation of a mediation, a Failed Event is generated (depending on configuration), instead of a failure response being sent to the requestor. Transactionality is beyond the scope of this article, apart from the section on forcing a transaction rollback.

The runtime has additional features that are not covered here, such as Store and Forward and Failed Event Manager. You can use these to halt (and then later resume) processing when a provider is unavailable and to modify and replay messages. These only apply to mediations that are invoked asynchronously, such as by consuming a message from a JMS queue.


Reasons for error handling

This section describes different reasons to perform error handling.

Implementing error processing policy

Errors experienced by mediations such as network unavailability of providers or other runtime exceptions can be transformed into simple error messages. If native components do not log these errors, then you can design mediation layers to log the root cause while returning a suitable response message to the requestor, informing about the temporary service unavailability.

Handling provider implementation specific error codes

Service providers can return errors using different approaches. This can involve anything from SOAP faults to proprietary structures inside message headers or bodies. Appropriate transformation rules can be applied to return errors in a consistent manner that is independent of the target service provider.

Handling sensitive information

When errors occur, messages often contain sensitive information, such as protocols used, server IP addresses, and so on. Appropriate rules need to be applied to filter any sensitive information before tracing or creating a response message.
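As one illustration of such a rule, the following Java sketch masks URLs and IPv4 addresses in an error string before it is traced or returned. The class name, the patterns, and the replacement tokens are assumptions made for this example and are not part of any IBM API; a real policy would follow your own security requirements, and inside a mediation flow the equivalent logic would typically live in a map or a custom mediation.

```java
import java.util.regex.Pattern;

// Hypothetical helper for illustration only.
final class SensitiveDataFilter {
    // Matches http and https URLs up to the next whitespace character.
    private static final Pattern URL = Pattern.compile("https?://\\S+");
    // Matches dotted-quad IPv4 addresses (octet ranges are not validated).
    private static final Pattern IPV4 =
            Pattern.compile("\\b\\d{1,3}(?:\\.\\d{1,3}){3}\\b");

    private SensitiveDataFilter() {}

    static String redact(String message) {
        if (message == null) {
            return null;
        }
        // Replace URLs first so that a host embedded in a URL is removed as a whole.
        String withoutUrls = URL.matcher(message).replaceAll("[url removed]");
        return IPV4.matcher(withoutUrls).replaceAll("[address removed]");
    }

    public static void main(String[] args) {
        System.out.println(
                redact("Connection refused: http://10.1.2.3:9080/Customer (10.1.2.3)"));
    }
}
```

Applying the filter in one place, just before logging or building the response, keeps the rule easy to audit and to extend with further patterns.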


Error handling features in mediation flows

This section describes modeling and handling failures inside mediations.

Note

We recommend that you declare exception conditions that relate to the service operation itself, such as an invalid customer identifier, as part of the WSDL interface.

Overview of faults

Review the Information Center topic, Error handling in the mediation flow component, for a description of the capabilities. First, we will briefly recap on the concepts.

When working with providers or defining mediations, we use WSDL interfaces. These can declare inputs, outputs, and faults. The faults declared on WSDL interfaces correspond to known error cases that are defined as part of the service interface.

Modeled and unmodeled faults

There are two types of faults, modeled and unmodeled (see Figure 3). Modeled faults correspond to errors defined on the WSDL interface. Unmodeled faults are errors that are not declared on the WSDL interface.

Figure 3. WSDL interface with two declared faults

Conversely, for most interface definitions it is not necessary, advised, or possible to model the set of system issues related to its implementation (for example, network unavailable). The requestor or the system will deal with these issues. However, when working at the ESB technical integration layer, there are cases where modeling a provider system issue as a declared fault on the interface will be the intention. An example is an aggregator that shields the caller from provider failures.

When implementing mediations, it is important to handle faults caused when invoking providers. This is done by wiring from certain nodes and primitives, which are described in the following sections (a complete list is provided in the Summary of faults section). It might also be necessary to create faults as part of the service interface being exposed. In contrast, this is done by wiring to certain nodes and primitives.

Working with modeled faults

We will look at designing modeled faults and then how to use them.

Designing modeled faults

There might be a number of different “non-happy path” situations where a fault message needs to be returned to the requestor. How these are grouped might be pre-determined by a contract for the service being implemented. If not, and where possible, group the failures into a small number of categories that make sense for the requestors to handle. The reason is that as the number of fault message types increases, so does the amount of logic required in the service and in the requestors.

There might be information common to all fault messages and this can be put in a common business object definition (XSD), as shown in Figure 4.

The fault types defined in the interface from Figure 3 both extend a common type since they share common failure related data.

Figure 4. Defining Business Object types that are used in fault messages

The runtime uses the type information to distinguish between faults when working with web services.

Notes

  • For best results and maximum compatibility between bindings and runtimes, declare a distinct type for each fault message.
  • Using a single input and output parameter for each operation is also common practice.

Also note that the interface in Figure 3 consolidates all input and output parameters into two business object types. The UpdateCustomerRequest type contains two fields for customer id and detail. Like the fault types, these input and output types can also inherit from common request or response type definitions.

Creating and sending modeled faults to the requestor

To send a modeled fault response, the flow must reach an Input Fault node, either on the request or response. For the interface in Figure 3, there are two terminals available with message types corresponding to the declared faults. The yellow pop-up in Figure 5 is displayed by clicking the small “i” symbol (circled in blue).

Figure 5. Terminals on Input Fault node to create modeled faults

The message body (/body of the SMO) must have the appropriate type for that fault message. This can be done with an XSL primitive and is demonstrated in Figure 6 with the invalidCustomerId fault. In the map, the three output fields are populated and two of these are inherited from the FaultCommon XSD type. The id field takes the value directly from a customerId input field from the request message. The reason field’s value is built using some text and the same customerId field using a Custom XPath expression of concat('Invalid customer Id : ',$customerId). The code field receives a constant value.
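The relationship between the fault types and the values set by the map can be pictured with plain Java classes. This is only a sketch: the runtime works with business objects defined in XSD rather than with Java classes like these, and the class and method names are assumptions for illustration. The field names and the INVALID_ID constant follow the example message in Listing 1.

```java
// Stand-in for the FaultCommon XSD type: data shared by all fault messages.
class FaultCommon {
    final String reason;
    final String code;

    FaultCommon(String reason, String code) {
        this.reason = reason;
        this.code = code;
    }
}

// Stand-in for the invalidCustomerId fault type, which extends the common type.
final class InvalidCustomerIdFault extends FaultCommon {
    final String id;

    private InvalidCustomerIdFault(String id, String reason, String code) {
        super(reason, code);
        this.id = id;
    }

    // Mirrors the three assignments in the map: id is copied from the request,
    // reason is built by concatenation, and code receives a constant value.
    static InvalidCustomerIdFault fromCustomerId(String customerId) {
        String reason = "Invalid customer Id : " + customerId;
        return new InvalidCustomerIdFault(customerId, reason, "INVALID_ID");
    }

    public static void main(String[] args) {
        InvalidCustomerIdFault fault = fromCustomerId("12345");
        System.out.println(fault.id + " / " + fault.reason + " / " + fault.code);
    }
}
```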

Figure 6. Wiring and map to create a modeled fault

Testing this service with the Generic Service Client (available on the Web Services context menu when selecting the Web Service Port for the Export), or using the TCP/IP Monitor view can reveal the SOAP message for the modeled fault.

The HTTP status code is 500 for any modeled fault. Returning a modeled fault does not force the transaction (if any) to roll back.

Listing 1. Example SOAP Fault message for invalidCustomerId fault (with most of the type declarations omitted)
<soapenv:Envelope>
    <soapenv:Body>
        <soapenv:Fault>
            <faultcode>m:Server</faultcode>
            <faultstring>Invalid customer Id : 12345</faultstring>
            <detail>
                <io7:operation1Fault1_invalidCustomerId
                 xmlns:io7="http://Common/ProviderInterface">

                    <reason>Invalid customer Id : 12345</reason>
                    <code>INVALID_ID</code>
                    <id>12345</id>
                </io7:operation1Fault1_invalidCustomerId>
...

Note that the fault type in the message body must be populated with a non-nil value (that is, not xsi:nil) for the correct fault message to be generated by the Web Service binding. This means assigning at least a single item of data in a complex type (as in Listing 1), or populating a value into a simple type (for example, if the fault type is xsd:string).
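On the requestor side, a modeled fault can be recognized by the element carried inside the detail element of the SOAP fault. The following Java sketch uses only the JDK's DOM parser to extract that element name and the faultstring. It assumes a SOAP 1.1 envelope; the namespace declarations in the sample string are added for the example because Listing 1 omits them, and the class and method names are hypothetical.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

final class SoapFaultInspector {
    private static final String SOAP11_NS = "http://schemas.xmlsoap.org/soap/envelope/";

    private SoapFaultInspector() {}

    // Parses the envelope and returns the SOAP Fault element, or null if absent.
    private static Element fault(String soapXml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            Document doc = factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(soapXml)));
            return (Element) doc.getElementsByTagNameNS(SOAP11_NS, "Fault").item(0);
        } catch (Exception e) {
            throw new IllegalArgumentException("Not a well-formed XML document", e);
        }
    }

    // Returns the first child element with the given local name, or null.
    private static Element child(Element parent, String localName) {
        if (parent == null) {
            return null;
        }
        for (Node n = parent.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n instanceof Element && localName.equals(n.getLocalName())) {
                return (Element) n;
            }
        }
        return null;
    }

    static String faultString(String soapXml) {
        Element fs = child(fault(soapXml), "faultstring");
        return fs == null ? null : fs.getTextContent().trim();
    }

    // Local name of the first element inside <detail>, or null if there is none.
    static String detailElementName(String soapXml) {
        Element detail = child(fault(soapXml), "detail");
        if (detail == null) {
            return null;
        }
        for (Node n = detail.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n instanceof Element) {
                return n.getLocalName();
            }
        }
        return null;
    }

    public static void main(String[] args) {
        String xml = "<soapenv:Envelope xmlns:soapenv=\"" + SOAP11_NS + "\">"
                + "<soapenv:Body><soapenv:Fault>"
                + "<faultcode>soapenv:Server</faultcode>"
                + "<faultstring>Invalid customer Id : 12345</faultstring>"
                + "<detail><io7:operation1Fault1_invalidCustomerId"
                + " xmlns:io7=\"http://Common/ProviderInterface\"><id>12345</id>"
                + "</io7:operation1Fault1_invalidCustomerId></detail>"
                + "</soapenv:Fault></soapenv:Body></soapenv:Envelope>";
        System.out.println(detailElementName(xml) + ": " + faultString(xml));
    }
}
```

A requestor could compare the returned element name against the fault elements declared in the WSDL to decide which modeled fault it has received.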

Handling modeled faults from multiple providers

The faults defined by service providers might vary in style or naming and also can be different from those that need to be returned from the service. Figure 7 shows a flow that might receive a fault from two different providers, depending on which one is chosen to call.

Figure 7. Mapping faults from different provider interfaces

Also, the fault messages must be translated to the format required by the service being exposed. The map shown in Figure 8 maps two equivalent fields and assigns a constant value to the code field.

Figure 8. Mapping fields from provider fault to service response fault message

The map above has its root set as /body because, for modeled faults, the information is contained in the SMO body. Listing 2 shows the SMO trace for the Callout Fault flow that results from receiving the SOAP message in Listing 1, and demonstrates how the body is populated from the SOAP fault detail. The faultcode in /headers/SOAPFaultInfo is also populated and matches the body message type.

Listing 2. SMO trace (simplified) for invalidCustomerId fault received
<smo>
  <context/>
  <headers>
...
    <SMOHeader>
      <MessageType>Exception</MessageType>
    </SMOHeader>
    <SOAPFaultInfo>
      <faultcode xmlns:ns1="http://Common/ProviderInterface">
          ns1:updateCustomer_invalidCustomerIdMsg</faultcode>
    </SOAPFaultInfo>
  </headers>
  <body xmlns:ns0="wsdl.http://Common/ProviderInterface" 
     xsi:type="ns0:updateCustomer_invalidCustomerIdMsg">
    <io7:operation1Fault1_invalidCustomerId xmlns:io7="http://Common/ProviderInterface" >
      <reason>Invalid customer Id : 12345</reason>
      <code>INVALID_ID</code>
      <id>12345</id>
    </io7:operation1Fault1_invalidCustomerId>
  </body>
</smo>

Observe that in the mediation flow in Figure 7, two of the Callout Fault terminals are unwired. This behaves the same as wiring to a Stop primitive.

Note

If the intention is to stop the flow after an output or fault terminal, then explicitly wiring the terminal to a Stop primitive will improve clarity.

However, a sticky note documenting the flow probably has the same effect. Also, the fail terminals of the three primitives are unwired. There is further discussion below on the Stop primitive and using Error flows to handle unwired fail terminals.

Errors inside mediation flows

Mediation primitives that process messages have a fail terminal (the rectangular shaped terminal at the bottom right of the primitive), which propagates exception information along with the input message when there is a failure in that primitive. The exception information is stored in the failInfo element in the message context. You must wire the fail terminal of a mediation primitive to another primitive to access failInfo, as shown in Figure 9. The fail terminal of a primitive provides failure information about the current primitive, most notably the failureString field providing a message.

Figure 9. The failInfo structure

The origin field gives the primitive name, and invocationPath gives the list of previous primitives flowed through leading up to the failure.

Simple error handling – Stop primitive

You can implement simple error handling by using some of the provided tracing and logging primitives. For example, if a primitive fails during the input process due to incorrect input data or some other processing failure of the primitive, simply logging the input data and stopping the flow might be sufficient.

Figure 10. Using a Stop primitive when a mapping failure occurs

In the flow shown in Figure 10, when a mapping failure occurs, the input message is logged in the database and then a Stop primitive is used. For a web services interaction, this has the effect of sending back a SOAP response with an empty body (HTTP status 200), assuming no other path continues to send a response. Using a Stop primitive does not cause a transaction rollback.

Note

Where possible, formulate and return a meaningful response rather than halting the flow with a Stop primitive.

Note that wiring an output terminal (such as from the Message Logger) to a Stop primitive has the same effect as not wiring it at all, but it is good practice to be explicit with the wiring.

Consolidating error logic with a sub flow

The “any message type” terminal type means that this sub flow can be used for any WSDL message (see Figure 11).

The sub-flow appears just like a regular primitive when instantiated in a flow.

Figure 11. Sub flow definition that can work with any message type

Note

If an equivalent logic needs to be repeated in many places, either in the same flow, same mediation component, or even in separate modules, then consider a sub flow. You can place a sub flow in a library where it can be used by multiple projects.

You can promote the properties of each primitive in the sub-flow. In the above example, you can promote the enabled property of the Trace and Message Logger primitives to allow the administrator to choose file or database logging for each sub flow instance.

By default, the level of granularity is very fine, but the flow designer can choose to reduce this by grouping related properties. This is used to reduce the complexity of administration. Promoted properties across all sub flow instances (such as the enabled property of the Trace primitive) can be given the same group and alias by the designer of the flow. This means that an administrator can disable all instances of the embedded Trace primitives by configuring a single promoted property value on the module in the Integrated Solutions Console.

Error flows (new in V7.5)

You need to wire fail terminals unless it is desired to have the default technical error message related to the failed primitive returned to the caller. This is equivalent to using a Fail primitive (discussed later). You can use the Error flow (new in V7.5) to automatically “wire” each unwired fail terminal.

The terminal is set to “any type” because the message can have any WSDL operation type. A generic logging flow (or sub-flow) is used to handle all failures.

If the desired behavior depends upon which part of the flow failed, then use a Type filter primitive to choose an appropriate action as shown in Figure 12. This might be practical if there are a small number of message types that need special handling.

Figure 12. Handling multiple failures in an Error flow

Alternatively, the data inside the SMO can be inspected to determine the action. For example, the request flow might store the results of the invocations in one of the contexts, and the behavior then needs to be chosen based on this data. A Message Filter primitive in conjunction with XPath conditions can be used to select an appropriate flow to handle that failure. See the next section and Table 2 in the Summary of faults section for further information about the differences between Stop and Fail primitives.

Note

It is common practice to handle every fail terminal in mediations, either by wiring it explicitly, by using an Error flow, or by a combination of the two approaches.

Working with unmodeled faults

This section will describe how to create and process unmodeled faults.

When invoking providers, if there is a failure that is not modeled (declared on an interface) - such as an unhandled runtime error - then this typically results in receiving an unmodeled fault. However, this depends on how the provider is implemented.

An unmodeled fault represents an error condition. If left unhandled, it propagates to the caller (for a Web Service Export) and causes a transaction rollback.

The unmodeled fault is considered a runtime error and produces a stack trace in the logs. If used for normal or expected operation, then this can have a performance impact.

Note

We recommend limiting the creation of unmodeled faults to true failure conditions.

Creating using the Fail primitive

A common way for unmodeled faults to get returned by mediations is by not wiring fail terminals on primitives.

You can use the Fail primitive to return an unmodeled fault and provide a user-supplied error message (Figure 13). See Table 2 in the Summary of faults section for a comparison with the Stop primitive.

Figure 13. Fail primitive used to return an unmodeled fault with a custom error message

A custom error message is defined and the placeholder {4} is populated with data from the SMO at the XPath location /body/updateCustomer/id. The default Root path is /context/failInfo, which contains the failure information from the previous primitive. Other placeholders are available and documented in the Information Center topic, Fail mediation primitive.

Listing 3. Example SOAP fault message representing unmodeled fault
<soapenv:Envelope>
  <soapenv:Body>
    <soapenv:Fault>
      <faultcode>m:Server</faultcode>
      <faultstring> javax.xml.ws.WebServiceException: 
       com.ibm.websphere.sca.ServiceRuntimeException: CWSXM0201E: Exception returned by 
       mediation flow for component Unmodeled in module Diagrams: CWSXM3300E: Fail 
       primitive 'Map_Fail', component 'Unmodeled', module 'Diagrams', interface 
       '{http://Common/ProviderInterface}ProviderInterface', operation 'updateCustomer', 
       raised a FailFlowException. The user-supplied error message is 'Map failed for
       customer Id 12345'...

The above SOAP in Listing 3 illustrates the user-supplied error message being populated. The HTTP status code is 500.

It is not possible to precisely control the format (such as faultstring) of the SOAP message using the Fail primitive or using modeled faults. If this is required, then use a JAX-WS handler to modify the SOAP message at the boundary. Alternatively, use a service gateway (discussed later) to manually construct the SOAP envelope.

Handling unmodeled faults and sending modeled faults for the requestor

The flow in Figure 14 shows an unmodeled fault being handled by the wiring from the Service Invoke primitive’s fail terminal. Such a fault can originate for any of the following reasons:

  • Failure to send the request or receive the response (for example, HTTP status 404 – URL not found).
  • Response that is not SOAP (for example, plain text and HTTP status 500).
  • SOAP response that does not match the expected response format.
  • Valid SOAP containing a fault that is not declared on the interface (HTTP status 500).

Figure 14. Converting an unmodeled fault into a modeled fault

The “Provider error” map creates a message suitable for one of the two Input Fault terminals, each representing a modeled fault. The particular modeled fault represents a provider failure.

The failure information for unmodeled faults and fail terminals is contained in the XPath location /context/failInfo in the SMO. In the example shown in Figure 15, the failureString is copied into a field in the output map and sent back as part of a modeled fault using the Input Fault terminal.

Figure 15. Mapping from the failInfo structure

The SOAP fault message received is stored in /context/failInfo/failureString as a string. Instead of returning the complete SOAP fault message to the caller, an alternative is to log it in a database and return a message relevant for the requestor.

Note

Although not appropriate in every situation, a typical approach is to return a combination of human readable text and an error code. The error code is pertinent in situations where users work in multiple written languages.
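To show why the error code matters, the following Java sketch lets a requestor select display text by error code and language, falling back to the text carried in the fault when no translation is known. The catalog contents and all names are sample data invented for the example; only the INVALID_ID code is taken from Listing 1.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

final class FaultTextCatalog {
    // language code -> (error code -> display text); sample data only.
    private static final Map<String, Map<String, String>> TEXTS = new HashMap<>();

    static {
        Map<String, String> en = new HashMap<>();
        en.put("INVALID_ID", "Invalid customer Id");
        Map<String, String> fr = new HashMap<>();
        fr.put("INVALID_ID", "Identifiant client invalide");
        TEXTS.put("en", en);
        TEXTS.put("fr", fr);
    }

    private FaultTextCatalog() {}

    // Falls back to the text from the fault when the language or code is unknown.
    static String textFor(String code, Locale locale, String textFromFault) {
        Map<String, String> byCode = TEXTS.get(locale.getLanguage());
        String text = (byCode == null) ? null : byCode.get(code);
        return (text != null) ? text : textFromFault;
    }

    public static void main(String[] args) {
        System.out.println(
                textFor("INVALID_ID", Locale.FRENCH, "Invalid customer Id : 12345"));
    }
}
```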

The behavior of the Callout Response node is the same as the Service Invoke primitive.

By handling the unmodeled fault, the failure by the provider does not cause a transaction rollback. The provider failure message is mediated by design. If there is a requirement to preserve the SOAP format, then it is necessary to use a service gateway (discussed later).

Summary of faults and Stop and Fail primitives

This section gives an overview of the behavior of faults and of the Stop and Fail primitives.

Table 1. Differences between modeled and unmodeled faults

| Type of fault | Modeled | Unmodeled |
| --- | --- | --- |
| Defined in WSDL | Yes | No |
| Payload | Well-defined, in Service Message Object (SMO) body | Undefined, in SMO header |
| Typical use | Business or user-defined | System-orientated |
| Java equivalent e.g. seen in logs | ServiceBusinessException | ServiceRuntimeException |
| Handle, by wiring from | Named terminals for each fault on Callout Fault node or Service Invoke primitive | Fail terminals on Callout Response node or Service Invoke primitive |
| Create, by wiring to | Named terminals on Input Fault node | Fail primitive or system failure, such as a network |
| Effect if not handled | Like a Stop primitive, flow or branch ends without error | Like a Fail primitive. Propagates to caller. Transaction rollback |
| Use for | Regular “non-happy path” exception conditions | Only when needed, for failure conditions |

Table 2. Differences between Stop and Fail primitives

| Primitive | Stop | Fail |
| --- | --- | --- |
| Use | To end a flow or branch of a flow | Only when needed, for failure conditions |
| Effect on flow when mediation driven by Web Service Export | Terminates the current branch. If it is the only branch, then the SOAP response message with empty body is created. | Terminates all branches and an unmodeled fault SOAP response is created. |
| Same effect as | An unwired fault or output terminal | An unwired fail terminal |
| Effect on transaction | None | Rollback |
| Causes an exception/fault | No | ServiceRuntimeException / unmodeled fault |

Advanced topics

This section describes topics that are helpful in more challenging integration scenarios.

Retrying a web service invocation

WebSphere ESB has built-in capability to retry failed service invocations. The Service Invoke primitive and Callout nodes have options to retry on modeled faults or unmodeled faults up to a specified number of times (see Figure 16).

Figure 16. Configuring retry on Service Invoke primitive

The properties here are all promotable. That means that they can be exposed for an administrator to modify dynamically while the application is running.

In normal operation, the total time spanned by the request and response of the mediation flow should not exceed the transaction timeout defined on the server (default 120s). This includes invocations, retry delays, and other processing in each mediation flow instance.
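A quick worst-case calculation helps to check a retry configuration against this limit. The sketch below assumes that each attempt can take up to an invocation timeout and that the retry delay is applied between attempts; the parameter names and sample values are illustrative, and only the 120-second default comes from the text above. Other processing in the flow adds to the total, so some margin is needed.

```java
final class RetryBudget {
    private RetryBudget() {}

    // Worst-case seconds for one invocation with retries:
    // (retryCount + 1) attempts, with retryCount delays in between.
    static long worstCaseSeconds(int retryCount,
                                 long invocationTimeoutSeconds,
                                 long retryDelaySeconds) {
        long attempts = retryCount + 1L;
        return attempts * invocationTimeoutSeconds + retryCount * retryDelaySeconds;
    }

    // True when the worst case stays strictly below the transaction timeout.
    static boolean fitsWithin(long transactionTimeoutSeconds,
                              int retryCount,
                              long invocationTimeoutSeconds,
                              long retryDelaySeconds) {
        return worstCaseSeconds(retryCount, invocationTimeoutSeconds, retryDelaySeconds)
                < transactionTimeoutSeconds;
    }

    public static void main(String[] args) {
        // 2 retries, 30 s per attempt, 5 s delay: 3 * 30 + 2 * 5 = 100 s.
        System.out.println(worstCaseSeconds(2, 30, 5));
        System.out.println(fitsWithin(120, 2, 30, 5));
    }
}
```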

Note

Where possible, design mediation flow instances to be short-lived. This reduces resource contention such as thread usage and database connections.

Using alternative endpoints

If you would like alternative endpoints (URLs in the case of web services) to be used upon failure, then the SMO must be populated with them before the invocation (Figure 17).

Figure 17. Setting alternative endpoints in the SMO before invocation

The runtime cycles through the endpoints, starting with the default /headers/SMOHeader/Target/address, and then the “alternate endpoints” until success or the retry count is reached.
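The cycling behavior can be simulated as follows. This is a simplified model rather than the runtime's implementation: it assumes that the first attempt uses the default target address, that each retry moves to the next alternate endpoint, and that the sequence wraps around to the default address once the alternates are exhausted. The EndpointCycler class and the use of a Predicate to represent an invocation are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

final class EndpointCycler {
    private EndpointCycler() {}

    // Returns the address that succeeded, or null if every attempt failed.
    // If tried is non-null, it receives the addresses in the order attempted.
    static String invokeWithRetry(String defaultAddress,
                                  List<String> alternates,
                                  int retryCount,
                                  Predicate<String> invoke,
                                  List<String> tried) {
        List<String> all = new ArrayList<>();
        all.add(defaultAddress);
        all.addAll(alternates);
        // One initial attempt plus retryCount retries.
        for (int attempt = 0; attempt <= retryCount; attempt++) {
            String address = all.get(attempt % all.size());
            if (tried != null) {
                tried.add(address);
            }
            if (invoke.test(address)) {
                return address;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        List<String> tried = new ArrayList<>();
        String ok = invokeWithRetry(
                "http://hostA.example/svc",
                Arrays.asList("http://hostB.example/svc", "http://hostC.example/svc"),
                3,
                address -> address.contains("hostC"),
                tried);
        System.out.println(ok + " after trying " + tried);
    }
}
```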

A standard approach is to use the Endpoint Lookup primitive in conjunction with WebSphere Service Registry and Repository to retrieve endpoints.

Retry an invocation after a specific modeled fault

Sometimes the built-in capability is not enough for your requirements. An example is when you want to retry an invocation after receiving a particular modeled fault (for example, one that describes a temporary or unknown provider failure), but not other modeled faults.

You can use the Fan Out primitive to loop until a condition is reached, or up to a specified number of times. In this example, it is used to retry invocations.

Suppose that we want to retry after any unmodeled fault and after a specific modeled fault, UnknownSystemException, which is declared in the WSDL interface for the provider and represents a possibly transient provider error. We allow up to three invocations, that is, two retries.

In Figure 18, a retry occurs for the two highlighted paths. The top path is for the UnknownSystemException modeled fault, and the lower of the top two is for unmodeled faults received.

Figure 18. Using the Fan Out primitive in iterate mode to implement a retry

The retry paths are wired to the “in” input terminal of the Fan In primitive. This is the terminal that causes the Fan Out to fire again and to loop round. If three iterations have been reached and the flow reaches the “in” terminal of the Fan In primitive, then the “out” terminal is fired. In this scenario, this means that there have been three unsuccessful invocations.

The other paths (OK response and the remaining modeled “Business” faults) are wired to the “stop” terminal of the Fan In primitive. This terminal causes the Fan Out iteration to cease (before or at the total number of iterations), and the “incomplete” output terminal is fired. In this scenario, that corresponds to a successful invocation or retry attempt.

There is a bit of “bootstrapping” needed for the above example to work. This is because the Fan Out primitive’s iteration capability is designed to only loop over an input array (for an aggregation scenario). To use it for the purpose of looping (such as up to three times), it is necessary to construct an array of length three for input to the Fan Out primitive. The steps to make the example work are:

  1. Declare a BO with a field loop that is a list or array of string (or any type).
  2. Set this BO as the type of the transient context.
  3. In Properties, set the Fan Out to fire the output terminal for each element in the XPath expression /context/transient/loop. Set the Fan In to fire the output terminal when the associated Fan Out primitive has iterated through all messages.
  4. Use a Message Element Setter to populate the loop field with as many array elements as the maximum number of invocations needed (Figure 19). This needs to be before the Fan Out primitive.

Figure 19. Populating an input array that drives Fan Out iteration

The behavior of the Fan Out loop can be made more dynamic with promoted properties (for example, total retries). The loop array just needs to be populated (Java™ code also works) with the absolute maximum number of invocations for any configuration.
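
The retry semantics described above can be illustrated outside the tooling. The following is a minimal plain-Java simulation, not the mediation flow API: the class name, the Outcome values, and the "attempt-n" array contents are all assumptions made for illustration. It mirrors the wiring in Figure 18: retryable outcomes go to the "in" terminal (so the loop continues), every other outcome goes to "stop" (so "incomplete" fires), and exhausting the array fires "out".

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.function.Supplier;

/** Plain-Java simulation of the Fan Out / Fan In retry loop; not the mediation API. */
final class RetryLoopSketch {

    /** Hypothetical outcomes of one provider invocation. */
    enum Outcome { OK, BUSINESS_FAULT, UNKNOWN_SYSTEM_EXCEPTION, UNMODELED_FAULT }

    /** Records which Fan In output terminal fired and after how many invocations. */
    static final class Result {
        final String terminal;   // "incomplete" (stopped early) or "out" (array exhausted)
        final int invocations;
        final Outcome last;

        Result(String terminal, int invocations, Outcome last) {
            this.terminal = terminal;
            this.invocations = invocations;
            this.last = last;
        }
    }

    /** Mirrors the Message Element Setter step: one element per allowed invocation. */
    static List<String> buildLoopArray(int maxInvocations) {
        List<String> loop = new ArrayList<>();
        for (int i = 1; i <= maxInvocations; i++) {
            loop.add("attempt-" + i); // the element values are never read, only counted
        }
        return loop;
    }

    /** Only the transient modeled fault and unmodeled faults are retried. */
    static boolean isRetryable(Outcome outcome) {
        return outcome == Outcome.UNKNOWN_SYSTEM_EXCEPTION || outcome == Outcome.UNMODELED_FAULT;
    }

    static Result invokeWithRetry(List<String> loop, Supplier<Outcome> provider) {
        int invocations = 0;
        Outcome last = null;
        for (String element : loop) {            // Fan Out fires once per array element
            invocations++;
            last = provider.get();               // stands in for the Service Invoke
            if (!isRetryable(last)) {
                // Wired to the Fan In "stop" terminal: iteration ceases.
                return new Result("incomplete", invocations, last);
            }
            // Retryable: wired to the Fan In "in" terminal, so the loop continues.
        }
        // All iterations used and every invocation failed.
        return new Result("out", invocations, last);
    }

    public static void main(String[] args) {
        Iterator<Outcome> outcomes = Arrays.asList(
                Outcome.UNMODELED_FAULT, Outcome.UNKNOWN_SYSTEM_EXCEPTION, Outcome.OK).iterator();
        Result result = invokeWithRetry(buildLoopArray(3), outcomes::next);
        System.out.println(result.terminal + " after " + result.invocations + " invocation(s)");
    }
}
```

In this sketch the array length is the only thing that bounds the loop, which is why the article's loop field must hold as many elements as the maximum number of invocations.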

Another option is to wire the retries directly so that there are multiple Service Invoke primitives on the flow (number of retries + 1). This gets messy after one retry.

Error handling gateways

In some cases, the particular format of messages from providers is not relevant to the ESB. For example, all failure messages (HTTP code 500) might need to be handled in the same way. The Service Gateway pattern (open the Patterns Explorer view) allows this without importing and wiring to all of the WSDLs and schemas for the various service providers (see Figure 20).

Figure 20. Dynamic Service Gateway pattern

The gateway sits between requestors and providers and typically makes routing decisions by using a well-known property of the incoming message, such as a SOAP Action HTTP header. It can act as a gateway to different service applications without dependencies on the interfaces of the services.

Propagating all messages without alteration

Since the Service Gateway pattern deals with the SOAP message as pure text, it is well-suited to passing fault messages from the provider straight back to the requestor without modification. The wiring is simple and does not require mapping. Figure 21 shows the response flow of such a gateway whose purpose is to log failed calls to a number of providers. The key benefit here is that the providers can have different interfaces and the gateway does not need to import the WSDLs.

Figure 21. Logging all fault messages using a Dynamic Service Gateway

A Dynamic Service Gateway does not make use of WSDL interface definitions for services it invokes. This means there is no concept of a modeled fault and all failures described by valid SOAP messages (HTTP 500 code) appear on the gatewayMessage terminal of the Callout Fault node.

If there is a system related failure (for example, HTTP 404) or no valid SOAP message is returned, then the Fail terminal of the Callout Response node fires. In Figure 21 above, the Fail terminal eventually connects to a Fail primitive that is used to send an unmodeled fault to the requestor.
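
The two rules above can be summarized as a small decision table. The sketch below is plain Java written for illustration, not a runtime API: the class, enum, and method names are hypothetical. Only the HTTP 500 rule and the Fail rule come from the text above; treating a valid SOAP message with a 2xx status as a normal response is an assumption.

```java
/** Illustrative decision table for a Dynamic Service Gateway response flow; not a runtime API. */
final class GatewayTerminalSketch {

    /** Hypothetical labels for the response-flow terminals discussed in the text. */
    enum Terminal { CALLOUT_RESPONSE_OUT, CALLOUT_FAULT_GATEWAY_MESSAGE, CALLOUT_RESPONSE_FAIL }

    static Terminal terminalFor(int httpStatus, boolean validSoapMessage) {
        if (!validSoapMessage) {
            // No valid SOAP message was returned: the Fail terminal fires.
            return Terminal.CALLOUT_RESPONSE_FAIL;
        }
        if (httpStatus == 500) {
            // A failure described by a valid SOAP message: there are no modeled
            // faults in a gateway, so it arrives on the gatewayMessage terminal.
            return Terminal.CALLOUT_FAULT_GATEWAY_MESSAGE;
        }
        if (httpStatus >= 200 && httpStatus < 300) {
            // Assumption: a successful response takes the normal response path.
            return Terminal.CALLOUT_RESPONSE_OUT;
        }
        // System-related failure such as HTTP 404.
        return Terminal.CALLOUT_RESPONSE_FAIL;
    }

    public static void main(String[] args) {
        System.out.println(terminalFor(500, true));
        System.out.println(terminalFor(404, false));
    }
}
```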

Catching all failure messages

Another use of the Service Gateway pattern is to act as a conditional pass-through. The pattern described here passes successful service provider responses through unchanged, but mediates invocation failures and SOAP fault messages by transforming them. This is useful to prevent provider failure information, such as network details or exception traces, from percolating back to clients. There might also be requirements to return all failures in a consistent message format.

Options for gateway responses for the failed provider invocations include:

  • SOAP response with empty body (HTTP 200)
  • Normal SOAP response with a defined message type (HTTP 200)
  • SOAP fault response

The first option above is easily achieved using a Stop primitive.

The second and third cases are more difficult to achieve in a gateway scenario because the ServiceGateway interface that is available in the flow does not have strongly defined types for the payload. This is usually the benefit of the Service Gateway. The SMO will contain the raw data of the SOAP envelope in a string/XML representation.

There are two mechanisms to handle each of the response types:

  • Manual construction
  • Automatic parsing of the Service Gateway message into a concrete message

The following steps show how to manually construct responses for the more adventurous developers.

Manual construction: Creating a happy SOAP response from a Callout Fault

To create a SOAP response that matches a defined WSDL interface, first a message body matching the operation type must be populated and then it must be serialized to a string form suitable for the Input Response node.

The example in Figure 22 creates a normal response from a SOAP fault. The output type of the XSL map matches a response operation (operation1ResponseMsg) on an interface MyGateway.

Figure 22. Creating and populating a modeled SMO body type

The Data Handler then serializes the SMO body to an XML string that forms the basis of the gateway’s response, as shown in Figure 23.

Figure 23. SMO for the ServiceGateway with a TextBody message field

The steps to create this flow are:

  1. Create an XSL primitive and set the Output terminal type to the intended response message type from a WSDL interface available in the project or its dependencies.
    1. Create a map with root=/body (or root=/ if you want to access SOAP Fault codes) to populate the target.

      Optionally, copy the SOAP Fault information from /headers, such as /headers/SOAPFaultInfo/faultstring, to the target.

  2. Create a Data Handler primitive:
    1. Set the Data Handler Configuration as UTF8XMLDataHandler.
    2. Refine the body /body/message to the actual field type of {http://com.ibm.wbiserver.gateway/schema}TextBody. This is the type that contains the string field named value.
    3. Set the Action to “Convert from a Business Object to native data format”.
    4. The Source and Target XPaths are /body and /body/message/value, respectively.

Automatic parsing of the Service Gateway message: Creating a happy SOAP response from a Callout Fault

Within a Service Gateway, you may want to automatically inflate the inbound message from its generic TextBody structure into the concrete business object (without a Data Handler primitive). This is possible when a TextBody structure with XML data is encoded within the gateway message (such as a message from a web service binding) and the integration developer selects the Automatically convert the ServiceGateway message checkbox on any of the input nodes within the mediation flow editor (Figure 24).

Figure 24. Automatically de-serializing and serializing the gateway message

A requirement for this processing is the availability of the schema information so that the message is de-serialized into a concrete business object.

This mechanism is especially useful when handling fault messages because the structure of a SOAP Fault can be complex to handle manually.

For instance, a SOAP 1.2 fault message includes the information shown in Listing 4.

Listing 4. Example SOAP fault message representing unmodeled fault
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<soapenv:Envelope xmlns:soapenv="http://www.w3.org/2003/05/soap-envelope">
  <soapenv:Body>
    <soapenv:Fault xmlns:axis2ns1="http://www.w3.org/2003/05/soap-envelope">
      <soapenv:Code>
        <soapenv:Value>soapenv:Receiver</soapenv:Value>
        <soapenv:Subcode>
          <soapenv:Value xmlns:m="http://Customer">m:CustomerIdFault</soapenv:Value>
        </soapenv:Subcode>
      </soapenv:Code>
      <soapenv:Reason>
        <soapenv:Text xml:lang="en"> Invalid customer Id : 12345</soapenv:Text>
      </soapenv:Reason>
      <soapenv:Detail>
        <io7:operation1Fault1_invalidCustomerId xmlns:io7=
            "http://Common/ProviderInterface">
           <reason>Invalid customer Id : 12345</reason>
           <code>INVALID_ID</code>
           <id>12345</id>
       </io7:operation1Fault1_invalidCustomerId>
      </soapenv:Detail>
    </soapenv:Fault>
  </soapenv:Body>
</soapenv:Envelope>

When this message arrives at the gateway, you want the detail section to be automatically de-serialized into the body of the Service Message Object, and the remaining information to be populated into the SOAPFaultInfo section. This is exactly what occurs for the above inbound message: the Service Message Object in Listing 5 is generated.

Listing 5. Service Message Object generated from the SOAP fault message
<p:smo xmlns:p="http://www.ibm.com/websphere/sibx/smo/v6.0.1">
  <context/>
  <headers>
    <SMOHeader>
      <MessageUUID>3E3BB9A4-0131-4000-E000-1F1809B4A751</MessageUUID>
      <Version>
        <Version>7</Version>
        <Release>0</Release>
        <Modification>0</Modification>
      </Version>
      <MessageType>Exception</MessageType>
      <Operation>requestResponse</Operation>
      <Action>http://Customer/customer/customer/updateCustomer</Action>
      <SourceNode>ServiceGatewayImport1</SourceNode>
      <SourceBindingType>WebService</SourceBindingType>
      <Interface>wsdl:http://www.ibm.com/websphere/sibx/ServiceGateway</Interface>
    </SMOHeader>
    <SOAPFaultInfo>
      <faultcode xmlns:ns0="http://www.w3.org/2003/05/soap-envelope">ns0:Receiver</faultcode>
      <faultstring>Invalid customer Id : 12345</faultstring>
      <extendedFaultInfo>
        <Code>
          <ns0:Value xmlns:ns0="http://www.w3.org/2003/05/soap-envelope">ns0:Receiver</ns0:Value>
          <ns0:Subcode xmlns:ns0="http://www.w3.org/2003/05/soap-envelope">
            <ns1:Value xmlns:ns0="http://Customer"
                xmlns:ns1="http://www.w3.org/2003/05/soap-envelope">ns0:CustomerIdFault</ns1:Value>
          </ns0:Subcode>
        </Code>
        <Reason>
          <ns2:Text xmlns:ns2="http://www.w3.org/2003/05/soap-envelope"
              xml:lang="en"> Invalid customer Id : 12345</ns2:Text>
        </Reason>
      </extendedFaultInfo>
    </SOAPFaultInfo>
    <HTTPHeader>
      <control>
        <ns3:URL xmlns:ns3="http://www.ibm.com/xmlns/prod/websphere/http/sca/6.1.0">http://localhost:9080/MyGateway_GatewayWeb/sca/ServiceGatewayExport1</ns3:URL>
      </control>
    </HTTPHeader>
  </headers>
  <body xmlns:ns0="wsdl.http://Customer/customer"
      xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
      xsi:type="ns0:updateCustomer_invalidCustomerIdMsg">
    <io7:operation1Fault1_invalidCustomerId xmlns:io7="http://Common/ProviderInterface">
      <reason>Invalid customer Id : 12345</reason>
      <code>INVALID_ID</code>
      <id>12345</id>
    </io7:operation1Fault1_invalidCustomerId>
  </body>
</p:smo>
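
To make the split between SOAPFaultInfo and the body concrete, the following sketch performs a comparable separation on a SOAP 1.2 fault using only the JDK's DOM parser. It is not how the WebSphere ESB runtime is implemented; the class name, the FaultParts holder, and the shortened sample message in main are assumptions made for illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

/** Splits a SOAP 1.2 fault into header-style fields and the detail payload name. */
final class SoapFaultSplitSketch {
    static final String SOAP12 = "http://www.w3.org/2003/05/soap-envelope";

    /** Hypothetical holder: the first three fields correspond to SOAPFaultInfo content. */
    static final class FaultParts {
        final String code;
        final String subcode;
        final String reason;
        final String detailName; // local name of the element that would become the SMO body

        FaultParts(String code, String subcode, String reason, String detailName) {
            this.code = code;
            this.subcode = subcode;
            this.reason = reason;
            this.detailName = detailName;
        }
    }

    static FaultParts split(String soapXml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // required to match elements by namespace
            Document doc = factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(soapXml)));
            Element fault = firstSoap(doc.getDocumentElement(), "Fault");
            if (fault == null) {
                throw new IllegalArgumentException("No SOAP 1.2 Fault element found");
            }
            Element code = firstSoap(fault, "Code");
            // Descendants are returned in document order, so Code/Value precedes Subcode/Value.
            String codeValue = text(firstSoap(code, "Value"));
            String subcodeValue = text(firstSoap(firstSoap(code, "Subcode"), "Value"));
            String reason = text(firstSoap(firstSoap(fault, "Reason"), "Text"));
            String detailName = firstChildElementName(firstSoap(fault, "Detail"));
            return new FaultParts(codeValue, subcodeValue, reason, detailName);
        } catch (IllegalArgumentException e) {
            throw e;
        } catch (Exception e) {
            throw new IllegalArgumentException("Not a parsable SOAP message", e);
        }
    }

    private static Element firstSoap(Element parent, String localName) {
        if (parent == null) {
            return null;
        }
        return (Element) parent.getElementsByTagNameNS(SOAP12, localName).item(0);
    }

    private static String text(Element element) {
        return element == null ? null : element.getTextContent().trim();
    }

    private static String firstChildElementName(Element parent) {
        if (parent == null) {
            return null;
        }
        for (Node n = parent.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n instanceof Element) {
                return n.getLocalName();
            }
        }
        return null;
    }

    public static void main(String[] args) {
        String xml = "<e:Envelope xmlns:e='" + SOAP12 + "'><e:Body><e:Fault>"
                + "<e:Code><e:Value>e:Receiver</e:Value><e:Subcode>"
                + "<e:Value xmlns:m='http://Customer'>m:CustomerIdFault</e:Value>"
                + "</e:Subcode></e:Code>"
                + "<e:Reason><e:Text xml:lang='en'> Invalid customer Id : 12345</e:Text></e:Reason>"
                + "<e:Detail><io7:operation1Fault1_invalidCustomerId"
                + " xmlns:io7='http://Common/ProviderInterface'><id>12345</id>"
                + "</io7:operation1Fault1_invalidCustomerId></e:Detail>"
                + "</e:Fault></e:Body></e:Envelope>";
        FaultParts parts = split(xml);
        System.out.println(parts.code + " / " + parts.subcode + " / " + parts.detailName);
    }
}
```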

Because the fault is de-serialized into a concrete Service Message Object, the integration developer can use standard transformation primitives to change the content of the message and generate a valid response message, as shown in Figure 25.

Figure 25. Creating a normal response message

Note that the final SetMessageType primitive is required to reset (from a tooling point of view) the message to a gateway message structure, instead of the concrete type generated. At run time, this conversion is only performed within the InputResponse node, not at the SetMessageType primitive.

Manual construction: Creating a SOAP fault response from a failure

To return a fault response, it is necessary to serialize a BO of type Fault from the http://schemas.xmlsoap.org/soap/envelope/ schema (assuming you are using SOAP 1.1) into the value field of the TextBody. You can use the example shown in Figure 26 to return a modeled fault when the provider invocation fails with an HTTP 404 code.

The flow shown in Figure 26 first populates a SMO body using a map of the desired fault type (operation1_serviceProviderFaultMsg). It then copies this body into the detail element of a SOAP Fault structure defined in the transient context. Finally, it serializes the SOAP Fault as text using a Data Handler.

Figure 26. Creating a SOAP envelope with a Fault

The steps to create this flow are:

  1. Import the http://schemas.xmlsoap.org/soap/envelope/ schema into your project or a referenced library. This schema (soap-1.1.xsd) was found in the .metadata\.plugins\com.ibm.ccl.soa.test.common.core directory of the workspace. If you cannot find it there, you can also import it over HTTP using the Import WSDL feature of Integration Designer.
  2. Create a new BO type with a field named “fault” of type Fault from the imported schema. Declare this as the transient context variable.
  3. Create an XSL primitive and set the Output terminal type to the intended fault message type from a WSDL interface. Populate the map as required.
  4. Create a Message Element Setter and use it to copy /body to /context/transient/fault/detail.
  5. Create a Data Handler primitive to serialize the fault structure into the gateway message body.
    1. Click the Browse button to create a new Data Handler Configuration. Select XML and specify Document root name as Fault and Document root namespace as http://schemas.xmlsoap.org/soap/envelope/ (the same namespace as the imported schema).
    2. Use this new configuration for the Data Handler because this is necessary to serialize the SOAP Fault correctly.
    3. Refine the body /body/message to the actual field type of {http://com.ibm.wbiserver.gateway/schema}TextBody.
    4. Set the Action to Convert from a Business Object to native data format.
    5. Source and Target XPaths are /context/transient/fault and /body/message/value, respectively.

The value field in the TextBody is populated with a string that is a serialized SOAP Fault. It starts with <en:Fault and contains the modeled fault data in an inner <detail> element.
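
For readers who want to see what such a serialized string looks like, the sketch below builds a SOAP 1.1 Fault with a detail child and serializes it using only the JDK's DOM and transformer classes. It is a stand-in for the Data Handler step, not the Data Handler API. The class name, the "f" prefix, and the detail namespace and element name used in main are hypothetical; only the envelope namespace and the en:Fault shape come from the text above.

```java
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

/** Builds a SOAP 1.1 Fault and returns it as text, as a stand-in for the Data Handler step. */
final class SoapFaultTextSketch {
    static final String SOAP11 = "http://schemas.xmlsoap.org/soap/envelope/";

    static String serializeFault(String faultCode, String faultString,
                                 String detailNamespace, String detailName, String reason) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            Document doc = factory.newDocumentBuilder().newDocument();

            // Root element: Fault in the SOAP 1.1 envelope namespace.
            Element fault = doc.createElementNS(SOAP11, "en:Fault");
            doc.appendChild(fault);

            // In SOAP 1.1 the children of Fault are not namespace-qualified.
            appendText(doc, fault, "faultcode", faultCode);
            appendText(doc, fault, "faultstring", faultString);

            // The modeled fault data goes inside the detail element.
            Element detail = doc.createElementNS(null, "detail");
            Element modeled = doc.createElementNS(detailNamespace, "f:" + detailName);
            appendText(doc, modeled, "reason", reason);
            detail.appendChild(modeled);
            fault.appendChild(detail);

            Transformer transformer = TransformerFactory.newInstance().newTransformer();
            transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            transformer.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("Could not serialize the SOAP fault", e);
        }
    }

    private static void appendText(Document doc, Element parent, String name, String text) {
        Element child = doc.createElementNS(null, name);
        child.setTextContent(text);
        parent.appendChild(child);
    }

    public static void main(String[] args) {
        // Hypothetical fault payload for a provider that returned HTTP 404.
        System.out.println(serializeFault("en:Server", "Provider unavailable",
                "http://example.org/provider", "serviceProviderFault", "HTTP 404 from provider"));
    }
}
```

The returned string plays the role of the value field content; in the mediation flow, the Data Handler primitive and its configuration perform this serialization.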

The above example has constructed a modeled fault response, but you can use this technique to create an arbitrary SOAP response.

Automatic parsing of the Service Gateway message: Creating a SOAP fault response from a failure

As in the section on creating a happy SOAP response from a Callout Fault, you can change the content of the fault and allow WebSphere ESB to handle the de-serialization and serialization automatically. To generate a SOAP Fault, wire to the corresponding InputFault, as shown in Figure 27.

Figure 27. Creating a fault response

If you change the content of the SOAPFaultInfo structure within the Service Message Object, these changes are automatically re-serialized into the SOAP Fault message. This approach greatly simplifies the setup and configuration for handling faults.

Responding gracefully but forcing a transaction rollback

Earlier we mentioned that an unhandled primitive failure or using a Fail primitive causes an unmodeled fault, and this causes a transaction rollback. Using a Stop primitive or returning a modeled fault does not cause a transaction rollback.

Sometimes, a service might want to return a “happy” HTTP 200 response (or a modeled fault) to the requestor, but also roll back the transaction (local or global, whichever applies). You can force a transaction rollback by using the following Java code:

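// Look up the WebSphere transaction manager and mark the current transaction for rollback.
// setRollbackOnly() can throw javax.transaction.SystemException; catch or declare it.
javax.transaction.TransactionManager tm =
    com.ibm.ws.Transaction.TransactionManagerFactory.getTransactionManager();
tm.setRollbackOnly();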

You can place this in a custom mediation primitive anywhere in a mediation flow before the response node.

Service Invoke primitive versus Callout nodes

The Service Invoke primitive has similar functionality to the Callout Fault nodes on response flows. It has a fault terminal for each fault declared on the partner interface.

For the sake of error handling, it does not usually matter whether a Service Invoke primitive or Callout node is used. Typically, the choice is made for other reasons. Service Invoke is used in more complex integrations such as aggregations, and callouts are used for more typical requestor-provider scenarios. One relevant benefit of the primitive is that a fail terminal is available for invocations of services with one-way (“fire and forget” or “request only”) interfaces (Figure 28). This allows handling of failures invoking a one-way service.

Figure 28. Using Service Invoke primitive to handle a one-way invocation failure

A slight drawback of the primitive is that it is necessary to re-implement the primitive on the canvas if the faults are modified during development.

One other difference is that, by default, when a Callout Response node’s Fail terminal fires, it does not contain the message from the request flow. However, there is a checkbox property on the Callout Response node to preserve the request message if it is desired at some performance cost. The Service Invoke primitive always maintains the message from the In terminal when firing the Fail terminal.


Conclusion

During development when a failure occurs, the user can debug and investigate the issue and then retry. When a solution moves towards test or production, it is important for the right information to get logged and to handle failure in the manner expected by the specification. This can help the support team make a quick resolution for technical issues or identify what data caused the problem.

For a complete error handling strategy, the whole system needs to be considered. As an example, in a scenario where messages are delivered from a queue to a mediation, there are additional considerations when the flow instance fails. Options include immediate retry of message delivery, manual retry by an administrator (using Failed Event Manager), or moving the message to a failure queue. See the Resources section for links describing asynchronous scenarios and overall strategy.

This article has described some of the building blocks available in Integration Designer and WebSphere ESB and given some examples of how you might put them together in synchronous web services interactions.

Acknowledgements

The authors would like to thank Andy Garratt, Sergiy Fastovets, Gabriel Telerman, and Kim Clark for reviewing the article.

Resources
