
Put new capabilities of business activity monitoring (BAM) to work, Part 5: Managing failed and unrecoverable events with IBM WebSphere Business Monitor V6.1

Troubleshoot failed events easily and effectively

Miriam Celi (miriamc@us.ibm.com), Staff Software Engineer, IBM, Software Group
Miriam Celi is a staff software engineer with IBM, where she has been employed since 1999. She worked on several products in Boca Raton, Florida, and has worked on the administrative console for WebSphere Business Monitor since April, 2007.
Luis Sanchez (sanchezl@us.ibm.com), Software Engineer, IBM
Luis Sanchez, a core member of the WebSphere Business Monitor development team, helped design and implement the Business Monitor server run time. He serves as the focal point for running Business Monitor in network deployment environments. Luis has worked for IBM in Research Triangle Park, N.C. since 1996, where he has been a part of various projects. He has worked with Business Monitor for the last five years.
Aimee Silva (silvaa@us.ibm.com), Advisory Software Engineer, IBM Japan
Aimee Silva is the technical lead of the administrative console for WebSphere Business Monitor. She has more than 11 years of experience as a software engineer with IBM working on several products. She joined the WebSphere Business Monitor team in April 2007.
Tom Evans (TEVANS@uk.ibm.com), Software Engineer, IBM Japan
Tom Evans is a software engineer on the IBM WebSphere ESB development team in Hursley, U.K. He has been with IBM for more than eight years, working on a number of WebSphere products, including four years developing WebSphere Messaging and the SIBus.

Summary:  In this series, learn about the dramatic changes in IBM WebSphere® Business Monitor V6.1—a major release that extends capability and simplifies how you monitor and manage the performance of your business. The WebSphere Business Monitor 6.1 enhanced, integrated administrative console lets you administer failed events. In this article, learn how to troubleshoot and manage failed and unrecoverable events for business monitor model applications.


Date:  01 Apr 2008
Level:  Intermediate

Introduction

A failed event is an incoming event that has not been successfully processed and has caused part of a monitor model’s processing to stop in order to maintain data consistency. The tools in the WebSphere Business Monitor 6.1 administrative console let you inspect a failed event and take appropriate action to resume processing. An administrator can edit, delete, and reorder existing events to resubmit them for successful processing.

When the processing of an incoming event fails before the WebSphere Business Monitor manager can identify which part of the model the event was supposed to trigger, the event is unrecoverable. Though you can inspect unrecoverable events to try to determine the cause of a failure, they cannot be resubmitted.

This article assumes you:

  • Have already installed WebSphere Business Monitor V6.1.

    WebSphere Business Monitor overview has downloads and more information.

  • Have the sample monitor model for the banking industry installed, running, and processing events.

    For information on installing this sample, see the online help.

Background

When an exception occurs during the processing of events, the WebSphere Business Monitor manager classifies the exception as either soft or hard.

Soft exceptions
Exceptions that the monitor model can simply log and ignore without interrupting the processing of subsequent events. When the WebSphere Business Monitor manager encounters a soft exception, the exception is logged, an event is emitted with details of the exception, and processing continues with the next event.
Hard exceptions
Exceptions that might cause data consistency to be compromised as the monitor model continues to process events. When the WebSphere Business Monitor manager encounters a hard exception, it will stop the parts of the monitor model that cannot continue without compromising data consistency, while continuing to process other parts of the model.

The event that was being processed when the hard exception occurred is then placed on a failed events queue, where system administrators can go to determine the problem.
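The soft/hard distinction can be sketched in a few lines of Python. This is only an illustrative model, not the WebSphere Business Monitor implementation: the error names, the `handle_exception` function, and the list used as a failed events queue are all hypothetical.

```python
import logging

log = logging.getLogger("monitor-sketch")

# Hypothetical error names. In the product, the set of hard errors is
# configured per deployed version of a monitor model.
HARD_ERRORS = {"parent-not-found"}


def handle_exception(event_id, error_kind, failed_queue, hard_errors=HARD_ERRORS):
    """Classify an exception; return True if processing may continue."""
    if error_kind in hard_errors:
        # Hard exception: keep the event for an administrator and stop.
        failed_queue.append(event_id)
        return False
    # Soft exception: log it and carry on with the next event.
    # (The product also emits an event describing the exception;
    # that step is omitted in this sketch.)
    log.warning("soft exception %s on event %s", error_kind, event_id)
    return True
```

Passing a different `hard_errors` set to the function mirrors the idea that each deployed model version can decide which exceptions stop processing.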

Hard exceptions

The list of hard exceptions is not fixed. Each version of a deployed monitor model can define which exceptions are hard exceptions. To specify hard exceptions (exceptions that stop event processing), edit the run time configuration error-handling properties of a deployed version of a monitor model, as shown in Figure 1.

To access the error-handling properties from the WebSphere administrative console, navigate to Applications -> Monitor models -> ModelID (ModelVersion) -> Run time configuration -> Error handling.


Figure 1. Error handling from run time configuration

When you indicate here that an error is to stop event processing, the resulting exception is categorized as a hard exception. The online documentation has more information about error-handling configuration options.

Event sequences

To efficiently use system resources, the WebSphere Business Monitor manager processes some events in parallel. To avoid data consistency problems, sequences of events that must be processed in order are identified.

When the WebSphere Business Monitor model editor deploys a monitor model to an EAR file for execution on the WebSphere Business Monitor server, it identifies event sequences that must be run serially. These event sequences are assigned a key that uniquely identifies an instance of the event sequence.


Event sequence example

When the WebSphere Business Monitor model editor generates code, it will assume that all monitoring context hierarchies in the monitor model can define an event sequence. The key of the monitoring context at the root of the hierarchy will become the event-sequence instance ID.

In the example model shown, all events that correlate to Mortgage Lending BAM MC or Automated Loan Setup BAM MC will make up an event sequence. The event-sequence instance ID will be the MortgageLending Key—the key to the root monitoring context of the monitoring context instances.

All emitted events must contain the MortgageLending Key, and the model must be deployed to run with the 6.1 Multi-threaded processing strategy for the failed event handling to be enabled.

At run time, if the processing strategy for the monitor model is 6.1 Multi-threaded, events corresponding to different event-sequence instance keys can be processed in parallel. If an error occurs while processing an event, events from event-sequence instances that do not correspond to the failed event can continue processing. Processing for events that correspond to the event-sequence instance of the failed event is halted, and the events are redirected to a queue for failed events.
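The following Python sketch models this routing behavior. It is a simplified illustration under stated assumptions: the `SequenceRouter` class and its attributes are hypothetical, events are submitted one at a time rather than in parallel, and any exception raised by the handler is treated as a hard exception.

```python
from collections import defaultdict


class SequenceRouter:
    """Toy model of per-key halting for event-sequence instances."""

    def __init__(self, handler):
        self.handler = handler           # callable that raises on a hard error
        self.halted = set()              # keys of halted event-sequence instances
        self.failed = defaultdict(list)  # key -> failed events, in arrival order
        self.processed = []              # events processed successfully

    def submit(self, key, event):
        """Process one event; return True if it was processed."""
        if key in self.halted:
            # The instance is halted: redirect the event to the failed queue.
            self.failed[key].append(event)
            return False
        try:
            self.handler(event)
        except Exception:
            # Halt only the instance that the failing event belongs to.
            self.halted.add(key)
            self.failed[key].append(event)
            return False
        self.processed.append(event)
        return True
```

In this model, a failure for the key of one loan leaves events for every other key unaffected, which is the property that lets unrelated event-sequence instances continue processing.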

Resubmitting events

With the administrative console tools, you can resubmit a previously failed event to a monitor model. The event will be processed, even if the event’s corresponding event-sequence instance is currently halted.

An event can be edited or reordered before you resubmit it. The results of processing the resubmitted event will be available on the administrative console.

Resuming failed event-sequence instances

You can resume a halted failed event-sequence instance from the administrative console. If there are unprocessed failed events when an attempt is made to resume an event-sequence instance, those events will be processed first, and the event-sequence instance will remain halted if any of the events causes a processing error.
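The resume behavior can be sketched as follows. The `resume` function is hypothetical, and stopping at the first event that fails again is an assumption of this sketch; the article states only that the instance remains halted if any resubmitted event causes an error.

```python
def resume(failed_events, handler):
    """Resubmit failed events in order.

    Returns (resumed, remaining). resumed is True only if every event was
    processed; remaining holds the events still waiting, starting with the
    one that failed again.
    """
    remaining = list(failed_events)
    while remaining:
        try:
            handler(remaining[0])
        except Exception:
            # The instance stays halted; nothing after this event is tried.
            return False, remaining
        remaining.pop(0)
    return True, remaining
```

Under this model, an administrator would correct or delete the event at the head of `remaining` before trying to resume again.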


Managing failed events

The WebSphere Business Monitor administrative console provides a graphical user interface for managing failed event sequences and unrecoverable events. Administrators can review the failed events for an event-sequence instance and can make changes so the event-sequence instance can be set back to a good state.

On the administrative console, you can drill down to see the individual failed event-sequence instances as well as the failed events that caused those instances to halt. Details and characteristics of the event-sequence instances and events, and other useful information for troubleshooting and managing failed event sequences and unrecoverable events, are available.

Models with failed event sequences

To view a list of models with failed event sequences in the administrative console, select Troubleshooting -> Monitor models -> Failed event sequences from the menu options on the left, as shown in Figure 2.


Figure 2. Manage failed event sequence

This view displays a list of model versions that have failed event-sequence instances.

You can select a version of a monitor model and click Resume Processing to attempt to resume all of the failed event-sequence instances for the selected models. If a failed sequence instance still has unprocessed failed events, the unprocessed events will first be resubmitted for processing. When all resubmitted events for a specific failed event sequence are successfully processed, the event-sequence instance resumes normal processing. If an error occurs while processing one of the resubmitted events, its corresponding event-sequence instance will remain halted.

Information displayed in this view includes:

  • Model ID and version time stamp for the monitor model that has at least one failed event-sequence instance
  • Number of failed event-sequence instances for the model
  • First time an event failed to process (so that you have an idea of how long this model has had problems)
  • Status of the last resubmission of an event

Failed event-sequence instances

The failed event sequences window displays a list of failed event-sequence instances for a given model version. To view the failed event-sequence instances, click the number in the failed events column.


Figure 3. Failed event sequences

You can select one or more event-sequence instances and click Resume Processing to attempt to resume normal processing for them. If a failed sequence instance still has unprocessed failed events, the unprocessed events will first be resubmitted for processing. When all resubmitted events for a specific failed event sequence are successfully processed, the event-sequence instance resumes normal processing. If an error occurs during the processing of one of the resubmitted events, its corresponding event-sequence instance will remain halted.

An event-sequence instance cannot be resumed while it has any events that have not been resubmitted. If an event cannot be resubmitted, you may have to delete the event before the instance can be resumed.

You can select an event-sequence instance and click Resubmit to resubmit all the failed events for the selected event sequences. Because the event-sequence instance has been halted, new failed events might arrive after you click Resubmit.

Information displayed in this view includes:

  • Event-sequence instance name (usually the event-sequence instance ID)
  • Number of failed events for the event-sequence instance
  • Status and time of the last event that was resubmitted for the event-sequence instance

To view the failed events of an event-sequence instance, click the number in the failed events column.

Failed events

The failed events window, shown in Figure 4, displays a list of failed events for a given event-sequence instance.



Figure 4. Failed events table view

You can view the details of an event by clicking on an event in the event column. The following event-management functions can also be performed with selected events from the table.

Resubmit selected events
This option displays a confirmation window, as shown in Figure 5. You can choose to submit the events to the same monitor model version where the failure occurred, or to a different version of the model.

Figure 5. Resubmit event view
Move up or down
To change the order in which events are submitted, select an event and click Move Up or Move Down.

Only one event can be moved at a time.

Delete selected events
Select an event and click Delete.
Import an event XML file
Import an event XML file to replace the selected event.

Only one event at a time can be imported for replacement.

Export the selected event XML file
This is useful for editing the event XML file to correct any problems and later importing it for resubmission.

Only one event can be exported at a time.

Import a new event
Import a new event to the list of failed events for later resubmission.

This view includes the event ID, the time the event was received, and the last time the event was resubmitted.

To view the details for a failed event, click on an event in the event column.

Failed event details

The failed event details window, shown in Figure 6, displays the data and details of a failed event, including the event ID, time received, last resubmission, failed message (such as any exceptions that occurred during event processing), and the event resubmission history. The event XML can be viewed by clicking the View Event XML link.


Figure 6. Failed event details view

You can also select Import/Replace to import an event XML file to replace the currently displayed event or Export to export the event XML to a file.
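Between the export and the import, the correction itself happens outside the console, in any editor or script. As a rough illustration, the Python sketch below fixes one field of an exported event by using the standard `xml.etree.ElementTree` module. The XML layout, the `loanAmount` element, and the `fix_field` helper are hypothetical; real exported events follow the event format used by your monitor model.

```python
import xml.etree.ElementTree as ET

# Hypothetical event XML, much simpler than a real exported event.
exported = '<event id="evt-42"><loanAmount>12x500</loanAmount></event>'


def fix_field(xml_text, tag, new_value):
    """Return the event XML with the text of child element `tag` replaced."""
    root = ET.fromstring(xml_text)
    node = root.find(tag)
    if node is None:
        raise KeyError(tag)
    node.text = new_value
    return ET.tostring(root, encoding="unicode")


# The corrected text would be saved to a file and imported to replace the event.
corrected = fix_field(exported, "loanAmount", "12500")
```

Parsing the file before editing it also confirms that the XML is well formed before it is imported again.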

Models with unrecoverable events

To view a list of models with unrecoverable events, select Troubleshooting -> Monitor models -> Unrecoverable events from the menu options on the left of the administrative console. As shown in Figure 7, this view displays a list of model versions that have unrecoverable events.


Figure 7. Unrecoverable events model view

An unrecoverable event fails processing before the WebSphere Business Monitor manager can determine its event-sequence instance ID. Unlike failed events, unrecoverable events have no corresponding failed event-sequence instances listed here. You can delete all unrecoverable events for selected model versions by making a selection from the table and clicking Delete.

This view includes the model ID and version time stamp for the monitor model with unrecoverable events, the number of unrecoverable events for the model, and the last time a failure occurred for that model.

To view the unrecoverable events for a model, click the number in the unrecoverable events column.

Unrecoverable events

The unrecoverable events window, shown in Figure 8, displays a list of unrecoverable events for a given model version.



Figure 8. Unrecoverable events view

The administrator can delete unrecoverable events of a model version by selecting events from the table and clicking Delete. The data for a selected event in the table can be exported to a file by clicking Export.

Information in this view includes the event name and the time the event was received.

To view details of an unrecoverable event, click the event in the event column.

Unrecoverable event details

This view, shown in Figure 9, displays the details of an unrecoverable event, including the time the event was received, the failed message, and details on the failure.



Figure 9. Unrecoverable event details view

You can also select Export to export the details of the event to a file.


Conclusion

In this article, you learned the basics of troubleshooting and managing failed and unrecoverable events in WebSphere Business Monitor model applications. The administrative console for WebSphere Business Monitor offers many tools that help you easily and quickly troubleshoot and manage failed and unrecoverable events.

