IBM WebSphere Developer Technical Journal: WebSphere Enterprise Scheduler planning and administration guide

This document is designed to give the reader an in-depth understanding of how to plan for, configure, administer and monitor the Scheduler service in WebSphere Business Integration Server Foundation Version 5.1 (and, previously, in WebSphere Application Server Enterprise Version 5.0.2), enabling high performance, high availability, persistence and transactional scheduling of J2EE operations.

Chris Johnson (c1johnso@us.ibm.com), IBM WebSphere Application Server Development

Chris D. Johnson is the technical lead for the WebSphere Application Server Scheduler, Asynchronous Beans, Startup Beans and Object Pool services and has been a developer in WebSphere since 2000.



14 April 2004

Introduction

The Scheduler service in WebSphere® Business Integration Server Foundation Version 5.1 (formerly WebSphere Application Server Enterprise) is a full-featured timer service that enables high performance, high availability, persistence and transactional scheduling of J2EE operations.

The Scheduler consists of two components:

  • Scheduler resource
  • Scheduler API

The Scheduler resource represents a Scheduler instance that is available in the WebSphere Application Server Java™ Naming and Directory Interface (JNDI). Each Scheduler resource has unique properties that govern its behavior; for example, the database in which to store the persistent schedules. The Scheduler resource is configured using the standard WebSphere Application Server administrative console (admin console) or the AdminConfig scripting object.

The Scheduler API is a Java interface for creating and administering tasks. The API is accessible from any J2EE server application (Enterprise JavaBeans components and servlets).

The Scheduler enables the execution of two types of tasks:

  • Calling stateless session Enterprise JavaBeans (EJBs).
  • Sending Java Message Service (JMS) Messages.

The Scheduler stores its data in any database that WebSphere Application Server supports and uses the WebSphere Application Server Transaction Manager. All Scheduler operations are therefore transactional and persistent; each task is guaranteed to run successfully one time. If a task fails for any reason, the entire operation is rolled back.

The Scheduler enables application developers to create their own stateless session EJBs to receive event notifications during a task's life cycle, allowing the plugging-in of custom logging utilities or workflow applications. Stateless session EJBs are also used to provide generic calendaring. Developers can either use the supplied calendar bean or create their own for their existing business calendars.

The Scheduler service is documented in the WebSphere Business Integration Server Foundation V5.1 Information Center, which describes basic installation and configuration procedures, simplified programming examples, and references the Scheduler API JavaDoc.


Planning

The Scheduler is part of the WebSphere Business Integration Server Foundation product and is required on all nodes where Scheduler activity is to run. The Scheduler service requires that a Scheduler configuration resource be configured and a J2EE application be configured to talk to the resource. Each resource is configured in much the same fashion as a DataSource or JMS Queue, and can be created at multiple configuration scopes (server, node or cell). You can create multiple Scheduler configuration resources and access each Scheduler resource from one or more J2EE applications.

User roles

Planning for, developing for, administering and operating the Scheduler service involves several user roles:

  • Administrator:
    Architects the use of the Scheduler within the organization's infrastructure. This involves creating the appropriate Scheduler configuration resources, tuning each Scheduler instance, assigning resources to applications and solving problems.
  • Developer:
    Creates J2EE applications that interact with the Scheduler service API. This includes administration applications (console applications) and applications that receive events (any application with which a Scheduler can interact).
  • Operator:
    Monitors the Scheduler for errors and uses the applications that the Developer has written to respond to error situations.

Resource Configuration

Each Scheduler configuration resource has parameters that govern how the Scheduler engine behaves for the resource and how to locate it in JNDI. When you configure a Scheduler resource using the admin console, the screen looks like Figure 1:

Figure 1. Scheduler configuration panel

If two Scheduler resources are configured with the same JNDI name at different scope levels, the resource at the most specific scope (for example, server over node) takes precedence.

Figure 2. AccountReport Scheduler configuration panel

Each Scheduler configuration resource has the following parameters (also listed in Figure 2):

Parameter | Description
Name | The name of the resource. Can be any resource-unique value.
JNDI Name | The unique JNDI name. This is the name by which a resource is publicly available through JNDI.
Description | Any description.
Category | Any category name.
DataSource JNDI Name | The JNDI name of any configured DataSource visible within this resource's scope. The DataSource defines the location where the Scheduler stores all tasks that are created. See Database configuration for more details.
DataSource Alias | The J2C authentication data entry that defines the user credentials required to access the DataSource. This can be left blank if the database does not require a user ID or password.
Table Prefix | The Scheduler requires several database tables. It may be desirable to share a single DataSource among several Scheduler resources. This value is added to the beginning of the actual table names that the Scheduler uses. In addition, if the database requires a schema name (for example, myschema.mytable), you can enter the schema name (with the dot) here. See Database configuration for more details.
Poll Interval | Each Scheduler resource has a poll daemon thread. This value (in seconds) instructs the poll daemon thread to wait this long between polls. See Poll daemons for more details.
Work Manager | The Work Manager controls how many tasks can run simultaneously, and what J2EE context information to propagate from the application that creates the task to the task thread when it runs. See Work Manager for more details.

Poll daemons

The Scheduler resource will have one poll daemon thread active for each server for which the Scheduler service is enabled. Therefore, if a Scheduler resource is configured at the node scope level, each server in that node will have a poll daemon running in it.

The poll daemon is responsible for loading tasks from the database. The daemon uses the Poll Interval setting on the Scheduler configuration resource to determine the amount of time to wait between database polls. If the value is 60, then every 60 seconds the daemon attempts to load all tasks that are set to fire during that cycle.

Figure 3. Poll daemon with 60-second interval

The Poll Interval setting determines the shortest interval at which a repeating task can fire. It also determines the number of tasks that are loaded into memory at one time. Therefore, a large poll interval of 24 hours may not be the best choice, even if your repeating tasks run only once per day, because every task due in that period would be loaded into memory at once, consuming resources. A small poll interval of 1 second may seem like the right choice, but it creates additional database contention. A good practice is to choose a value between 5 and 3600 seconds (1 hour), taking into account the smallest repeating interval that your tasks will require.
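The guidance above can be turned into a small sanity check. The following is a minimal sketch, not part of the Scheduler product: the class and method names are hypothetical, and the rule of simply clamping the smallest repeat interval into the 5 to 3600 second range is an assumption made for illustration.

```java
/**
 * Hypothetical helper (not a WebSphere API) that applies the poll interval
 * guidance: stay between 5 and 3600 seconds and do not exceed the smallest
 * repeat interval the tasks need.
 */
final class PollIntervalAdvisor {
    static final long MIN_RECOMMENDED_SECONDS = 5;
    static final long MAX_RECOMMENDED_SECONDS = 3600;

    /** True when the interval lies inside the recommended range. */
    static boolean isWithinRecommendedRange(long pollIntervalSeconds) {
        return pollIntervalSeconds >= MIN_RECOMMENDED_SECONDS
                && pollIntervalSeconds <= MAX_RECOMMENDED_SECONDS;
    }

    /**
     * Assumed heuristic: start from the smallest repeat interval and clamp
     * it into the recommended range.
     */
    static long suggest(long smallestRepeatIntervalSeconds) {
        return Math.max(MIN_RECOMMENDED_SECONDS,
                Math.min(MAX_RECOMMENDED_SECONDS, smallestRepeatIntervalSeconds));
    }

    public static void main(String[] args) {
        System.out.println(suggest(30));     // tasks repeat every 30 seconds
        System.out.println(suggest(86400));  // tasks repeat once per day
    }
}
```

A real deployment would still need to weigh database contention and memory use against task latency; the clamp only encodes the numeric range quoted in the text.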

Work Manager

The Work Manager setting for the Scheduler configuration resource provides the Scheduler with a fixed number of threads to dispatch work on and a policy that determines how to propagate J2EE context information onto a thread.

Figure 4. Default WorkManager configuration panel

The WorkManager parameters that apply to the Scheduler are as follows:

Parameter | Description
Alarm Pool | The number of alarm threads. The Scheduler uses one alarm thread internally, so the number of threads that can dispatch tasks at one time is this value minus 1. The remaining alarm threads are used exclusively for dispatching tasks. Figure 3 shows an example where there are two alarm threads dispatching tasks from a single poll daemon thread.
Services | The services that are active for the task creator's thread context that will be applied to the task thread when it runs. For example, if the "security" context is enabled and WebSphere Application Server global security is enabled, the security credentials that are active on the application that calls the Scheduler.create() method will be applied to the task's thread prior to the task executing. If "Bob" created the task, then "Bob" will also be running the task.

A WorkManager can be shared among several Schedulers and can also be used for non-Scheduler purposes. This can be useful if you want to have a single thread pool and alarm pool for multiple applications and services. Keep in mind that although a WorkManager can be configured at the Cell and Node scopes, the thread pool and alarm pool are duplicated in each active server. See Figure 5 for an illustration of how different servers have different WorkManager instances but the same number of available threads and alarms.

Figure 5. Shared Node-Level WorkManager

WorkManager Services
The WorkManager has several service contexts that can be propagated to the Scheduled task when it is fired. Each of these services is propagated only if the service is installed and enabled on the application server used to both create and fire the task. All of the WorkManager service contexts apply only to BeanTaskInfo tasks.

Service | Description
Security | The JAAS Principal is stored with the task when it is created and a Subject is restored for that Principal when the task runs. See Task security for more details.
Internationalization | The Internationalization Context (or the java.util.Locale and java.util.TimeZone) is stored with the Scheduled task when it is created and is reapplied to the thread when the task runs.
Work Area | All Work Area context information active on the thread is stored with the Scheduled task when created and reapplied to the thread when the task runs.
Application Profile | Application Profile tasks are stored with the Scheduled task when created and reapplied to the thread when the task runs.

High availability

The Scheduler service can be configured such that it is highly available by creating duplicate Scheduler resources or by creating a resource in a cluster. The Scheduler in WebSphere Application Server Enterprise Version 5.0.2 and WebSphere Business Integration Server Foundation Version 5.1 uses a lease concept to minimize collisions between the separate poll daemons. The redundant Scheduler engines will compete for leases and the Scheduler that wins the lease will run the tasks. If a Scheduler does not obtain the lease, the poll daemon will not attempt to load and run any tasks.

A lease is shared among Schedulers that use the same JNDI names and database tables. Therefore, Scheduler resources that are configured at the cluster level automatically take advantage of leases.

Leases are obtained using an independent alarm thread of each Scheduler's WorkManager. Both lease timings are derived from the Poll Interval (PI): the lease expires after 80% of the poll interval, and the lease alarm attempts to renew or obtain a lease at 80% of that expiry time, which is 64% of the poll interval. Therefore, if the poll interval is 100 seconds, the lease expires every 80 seconds and the lease alarm attempts to renew or obtain a lease every 64 seconds (80 * .8). If a Scheduler becomes unavailable, the maximum time for which the Scheduler could be unavailable is ((PI * .8) + (PI * .64)), which equates to 80 seconds (for the lease to expire) plus 64 seconds (for the backup Scheduler to acquire the lease), for a total of 144 seconds.

Versions of the Scheduler after Version 5.0.2 use a different algorithm for setting the lease time that is independent of the poll interval. This allows customers to use larger poll intervals without sacrificing availability. With versions later than 5.0.2, the lease expires every 60 seconds and is renewed or acquired by all daemons every 40 seconds. Therefore, the maximum time for which a Scheduler could be unavailable is 100 seconds, regardless of the poll interval.
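The two lease timing schemes can be compared with a few lines of arithmetic. The sketch below only restates the percentages and fixed values quoted above; the class and method names are hypothetical and are not part of the Scheduler API. Integer arithmetic is used, so poll intervals that are not multiples of 25 seconds are truncated to whole seconds.

```java
/** Hypothetical calculator for the lease timings described in the text. */
final class LeaseTiming {
    // Version 5.0.2: timings derived from the poll interval (PI).
    static long leaseExpirySeconds(long pollIntervalSeconds) {
        return pollIntervalSeconds * 80 / 100;   // PI * .8
    }

    static long leaseRenewalSeconds(long pollIntervalSeconds) {
        return pollIntervalSeconds * 64 / 100;   // PI * .64
    }

    /** Worst case: wait for the lease to expire, then for a backup to acquire it. */
    static long worstCaseUnavailableSeconds(long pollIntervalSeconds) {
        return leaseExpirySeconds(pollIntervalSeconds)
                + leaseRenewalSeconds(pollIntervalSeconds);
    }

    // Versions later than 5.0.2: fixed values, independent of the poll interval.
    static final long FIXED_LEASE_EXPIRY_SECONDS = 60;
    static final long FIXED_LEASE_RENEWAL_SECONDS = 40;

    static long fixedWorstCaseUnavailableSeconds() {
        return FIXED_LEASE_EXPIRY_SECONDS + FIXED_LEASE_RENEWAL_SECONDS;
    }

    public static void main(String[] args) {
        System.out.println(worstCaseUnavailableSeconds(100));   // prints 144
        System.out.println(fixedWorstCaseUnavailableSeconds()); // prints 100
    }
}
```

With a one-hour poll interval, the Version 5.0.2 calculation yields 2880 + 2304 = 5184 seconds, which illustrates why the later, fixed timings help when large poll intervals are used.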

About leases
Leases were not used prior to Version 5.0.2. When redundant Schedulers were added, availability was increased; however, contention also increased. You could not add more than one redundant Scheduler without sacrificing performance. Each task would be loaded and run on each server, but only one would run successfully. All other duplicate tasks would simply abort when the collision was detected.

If the database for the Scheduler you are using was created using the Data Definition Language (DDL) files supplied with Version 5.0 or 5.0.1 of the Scheduler, you will not have a Lease Manager. To activate the Lease Manager in WebSphere Application Server Enterprise Version 5.0.2 or WebSphere Business Integration Server Foundation Version 5.1, simply create the new Lease Manager tables that are present in the DDL files supplied with the Scheduler. Re-running the DDL that creates the tables creates the new tables without affecting existing data (see Resources for details on how to create the tables). Once the tables are created, the Scheduler will automatically start using leases to manage redundant Scheduler contention.

In Figure 6, a Scheduler resource is duplicated on three distinct servers in the same cell. Each Scheduler (with JNDI Name sched/Main) is referencing the same JDBC DataSource and WorkManager. The poll daemon for the Scheduler on Server1 of Node A has the lease and will load and process tasks. The other two poll daemons on Nodes B and C will remain idle until the scheduler daemon on Server1 is no longer able to renew its lease due to failure.

Figure 6. Redundant Schedulers with independent servers

Alternatively, in Figure 7, with the WebSphere Network Deployment Manager, each server could instead be a cluster member. Each Scheduler resource is created on each cluster member, and each server has a running Scheduler instance. In a complex topology, this method is preferred.

Figure 7. Redundant Schedulers in a cluster

Recovery and delays

The Scheduler uses fixed-delay calculation times. When a Scheduler becomes overloaded due to insufficient resources or long-running tasks, tasks may run late (see Task latency in the Performance section). When a recurring task runs late, its next fire time is calculated from the actual fire time rather than from the originally scheduled time. Figure 8 illustrates what happens to a task's fire time during an outage or delay when using the SIMPLE calendar, which calculates fire times from relative rather than absolute time deltas. The bottom arrows indicate how tasks would run without any delays. The top arrows indicate the actual fire times, given that tasks 2 and 3 are missed due to an outage or delay. Here, the task that would normally have fired at 2:00 instead fires at 3:15, immediately after the Scheduler became available.

Figure 8. Execution time delay with SIMPLE calendar

If a Scheduler fails, all tasks whose fire time has passed will run immediately, in expire-time order, once the Scheduler is restarted. All tasks that were in the process of running will be re-run if they did not complete successfully (that is, if their transactions did not commit). The repeat count will reflect only tasks that have run successfully (see EJB transaction considerations for more details).
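The fixed-delay rule described in this section can be illustrated with a short calculation. This is a minimal sketch, not the Scheduler's implementation: the class and method names are hypothetical, the one-hour repeat interval is an assumption (the text does not state the interval used in Figure 8), and the code uses the java.time classes from current Java rather than the JDK level this product shipped with.

```java
import java.time.Duration;
import java.time.LocalTime;

/** Hypothetical illustration of fixed-delay fire time calculation. */
final class FixedDelayExample {
    /**
     * Fixed delay: the next fire time is relative to when the task
     * actually fired, not to when it was originally scheduled.
     */
    static LocalTime nextFireTime(LocalTime actualFireTime, Duration repeatInterval) {
        return actualFireTime.plus(repeatInterval);
    }

    public static void main(String[] args) {
        Duration assumedInterval = Duration.ofHours(1); // assumption for illustration
        LocalTime actual = LocalTime.of(3, 15);         // fired late, at 3:15 instead of 2:00

        // The schedule shifts: later runs follow the late run by one interval.
        System.out.println(nextFireTime(actual, assumedInterval)); // prints 04:15
    }
}
```

Under these assumptions the task does not return to its original on-the-hour schedule after the delay; every later fire time is offset by the same 15 minutes.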

Scalability

The Scheduler will scale vertically with the addition of more processors and by increasing the number of alarms available on the WorkManager associated with the Scheduler. Using the Tivoli® Performance Analyzer (see Performance), alarm latencies can be identified, and more processors and alarms can then be added.

The Scheduler does not have the ability to natively scale horizontally. Because the Scheduler cannot be natively partitioned, it is up to the administrator and developer to partition the application over more than one Scheduler, create redundant Schedulers that talk to different databases or tables, or to partition the work that the Scheduler is executing, thereby increasing the available resources for the Scheduler daemon.

Partitioned Scheduler daemon
To illustrate how an Administrator could partition a Scheduler into two partitions, examine Figure 9. Here, there are four nodes each with one server in a single cell. An application is installed into the cell and deployed on all four servers. On each of the servers, a Scheduler resource is defined. The WorkManager and DataSource that each of the Scheduler resources reference are configured at the Cell scope.

Figure 9. Partitioned Scheduler

Normally, in this scenario, all Schedulers would communicate with the same database tables. In this case however, the servers on Nodes B and C are talking to tables with prefix MAIN1, and Nodes A and D are talking to tables MAIN2. This means that the Schedulers on Nodes B and C are redundant and separate from the redundant schedulers on Nodes A and D. With this configuration, the application remains the same, but the Scheduler work is partitioned among two different nodes. Although it would be possible to combine Nodes A and B together to form Node X, and Nodes C and D together to form Node Y, it is difficult to guarantee that one node will not get both leases for MAIN1 and MAIN2. The Scheduler service does not have the concept of a "preferred" daemon. To force one Scheduler to obtain the lease in this scenario, the administrator would need to either:

  • Delay starting of one poll daemon until the preferred poll daemon is active and has obtained the lease (performs its first poll), or
  • Stop one poll daemon until a redundant daemon obtains the lease, which could take as much time as a Poll Interval (see Forcing lease acquisition).

With this type of scenario, application developers and administrators must understand that because the Schedulers are now partitioned, applications will only see a subset of the scheduled tasks. Therefore, the application will need to look for tasks on both partitions when administering tasks that have been scheduled. For example, if a task was created on Node A and the operator needs to cancel that task, the operator would need to know that it was created on Node A or would need to try to cancel it on all nodes.

Partition Scheduler with forced leases
It may be more desirable for applications to be written such that tasks are categorized and each category uses a different Scheduler. For example, you could create one Scheduler for handling requests from employees with odd ID numbers and another for employees with even ID numbers. Again, the only way to guarantee that a lease is obtained on each Node is to force lease acquisition (see Forcing lease acquisition).
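The odd versus even split can be expressed as a small routing function that the application calls before looking up a Scheduler. This is only a sketch: the class name and both JNDI names are hypothetical and would have to match Scheduler resources that the administrator has actually configured.

```java
/** Hypothetical router that picks a Scheduler JNDI name per employee ID. */
final class SchedulerPartitionRouter {
    // Assumed JNDI names; they are not defined anywhere in the product.
    static final String EVEN_SCHEDULER = "sched/EvenEmployees";
    static final String ODD_SCHEDULER = "sched/OddEmployees";

    /** Returns the JNDI name of the Scheduler that owns tasks for this employee. */
    static String schedulerJndiNameFor(long employeeId) {
        return (employeeId % 2 == 0) ? EVEN_SCHEDULER : ODD_SCHEDULER;
    }

    public static void main(String[] args) {
        System.out.println(schedulerJndiNameFor(42)); // prints sched/EvenEmployees
        System.out.println(schedulerJndiNameFor(7));  // prints sched/OddEmployees
    }
}
```

Because the routing is deterministic, the same function tells an administration application which Scheduler to query when it later needs to find or cancel a task for a given employee.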

Figure 10. Partitioned Schedulers with forced leases

Forcing lease acquisition
If a cluster of Schedulers is created, it may be desirable to have one server own the lease for the Scheduler (to have all tasks run on that server); for example, if server big_server has a lot more capacity to run the tasks. You can force big_server to own the lease using one of these methods:

  1. Start the Scheduler cluster one server at a time where big_server is the first server to start (an actual server cluster or simply a collection of servers that use the same Scheduler configuration).
  2. If the servers are already active, the WASScheduler MBean's stopDaemon operation can be called on all Scheduler servers. This will release the lease. Once all daemons are stopped, the startDaemon operation on big_server should be called, followed by the other servers. This will allow big_server to obtain the lease.

Partitioning applications
It may be easier and more desirable to simply partition the application into two pieces. One piece would be the portion that schedules tasks and the other portion would run the tasks. Here, the application that is executing the tasks could exist on a different cluster. In Figure 11, the application used to create and administer tasks is in one set of nodes (could be a cluster) and the application that runs the tasks (BeanTasksApp.ear) is in a separate cluster. Because the tasks are EJBs, and because they are in a separate application and cluster, the Scheduler uses Workload Management (WLM) to distribute and balance the work among each cluster member.

Figure 11. Partitioned application - separate Scheduler

Service configuration

Each application server where the Scheduler is installed will have a Scheduler service configuration panel. The service configuration panel simply lets you enable or disable the Scheduler service for a given application server. When the service is disabled, all Scheduler activities for the given server become unusable. No poll daemons will run and JNDI lookups of Scheduler configuration resources will not be available. Applications that have resource references to com.ibm.websphere.Scheduler will fail to load. Disabling the service is only useful in two scenarios:

  1. The Scheduler is not being used at all.
  2. The poll daemon needs to be disabled on one server.

Database configuration

The Scheduler uses a user-defined database to persist the tasks that are created. The poll daemon then uses this database to determine what tasks to run and when to run them. The Scheduler service database tables are created by editing and executing supplied Data Definition Language (DDL) (or SQL) files in the customer's database management system. The details for creating the tables are available in the WebSphere Business Integration Server Foundation V5.1 Information Center.

This section describes some of the issues that Scheduler administrators must know when configuring Schedulers and the respective databases, and issues that Scheduler developers and architects must know when developing applications.

Scheduler interaction
The Scheduler interacts with the database using the following threads:

  • Application thread using the com.ibm.websphere.scheduler.Scheduler interface methods.
  • Poll daemon thread.
  • Each alarm thread that the task runs on.

Each Scheduler uses four tables (each prefixed with the table prefix of the Scheduler configuration):

  • TASK: Contains all of the tasks that have been created. There will be one row in this table for each Scheduled task.
  • TREG: Contains various configuration information used internally by the Scheduler.
  • LMGR: Contains Lease Manager information.
  • LMPR: Contains Lease Manager information.

All database access uses a single shared connection for each thread, read-committed transaction isolation and row-level locking.

Estimating database size
The only table that stores large amounts of data is the TASK table, which holds one row for each scheduled task. Each row occupies a minimum of 535 bytes. The table also has a single Binary Large Object (BLOB) column that stores arbitrary data, including task-specific data:

  • For BeanTaskInfo objects, this is the home handle of the TaskHandlerHome.
  • For MessageTaskInfo objects, this includes every set field, including the message data.

The BLOB column also includes J2EE context information from the WorkManager associated with the Scheduler. For example, if the internationalization service is enabled on the WorkManager, then the internationalization context information is stored in the BLOB. The BLOB typically ranges from 3000 to 5000 bytes.
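These figures allow a rough sizing of the TASK table. The sketch below is an estimate under stated assumptions, not a product formula: it assumes the 535-byte minimum excludes the BLOB, so the two are added together, and it ignores index and database page overhead. The class and method names are hypothetical.

```java
/** Hypothetical sizing helper for the Scheduler TASK table. */
final class TaskTableSizeEstimator {
    /** Minimum bytes per scheduled task, as quoted in the text. */
    static final long MIN_ROW_BYTES = 535;

    /**
     * Assumption: row size = 535-byte minimum + BLOB size.
     * Index and page overhead are not included.
     */
    static long estimateBytes(long taskCount, long blobBytesPerTask) {
        return taskCount * (MIN_ROW_BYTES + blobBytesPerTask);
    }

    public static void main(String[] args) {
        long tasks = 10_000; // example workload, chosen for illustration
        System.out.println(estimateBytes(tasks, 3000)); // low end of typical BLOB size
        System.out.println(estimateBytes(tasks, 5000)); // high end of typical BLOB size
    }
}
```

For 10,000 scheduled tasks this gives roughly 35 MB to 55 MB of row data, which is a useful lower bound when sizing the table space.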

Connections
Administrators need to make sure that the maximum number of connections on the database and DataSource has enough connections available for the poll daemon thread, alarm threads, and the J2EE application threads that access the Scheduler API. All database connections are shared by a Scheduler. The maximum number of simultaneous connections in use per Scheduler can be calculated by using this formula:

(1 Poll Daemon Thread) + (x-1 Alarm Threads) + (y API threads)

Therefore, if you have a Scheduler with a configured WorkManager that has five alarm threads (x = 5), and the applications interacting with the Scheduler API use a maximum of two concurrent threads (y = 2), then the total number of connections required on the DataSource and the database (for this Scheduler only) would be: 1 + (5-1) + 2 = 7 connections.
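The connection formula is easy to encode when several Schedulers share a DataSource and the totals need to be added up. The following sketch restates the formula above; the class and method names are hypothetical.

```java
/** Hypothetical helper that applies the connection formula from the text. */
final class SchedulerConnectionEstimator {
    /**
     * (1 poll daemon thread) + (alarmThreads - 1 dispatching alarm threads)
     * + (apiThreads concurrent application threads using the Scheduler API).
     */
    static int maxConnections(int alarmThreads, int apiThreads) {
        if (alarmThreads < 1 || apiThreads < 0) {
            throw new IllegalArgumentException(
                    "alarmThreads must be >= 1 and apiThreads must be >= 0");
        }
        return 1 + (alarmThreads - 1) + apiThreads;
    }

    public static void main(String[] args) {
        // Five alarm threads and two concurrent API threads, as in the example.
        System.out.println(maxConnections(5, 2)); // prints 7
    }
}
```

When more than one Scheduler uses the same DataSource, the per-Scheduler results would be summed to size the connection pool.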

Transactions
Each interaction with the database is done in a transaction. If a global transaction is active on the thread when a Scheduler API is called, the same transaction will be used. If a global transaction is not active on the thread, then the API will create its own global transaction. All database interaction and EJB interactions will occur in the same transaction. Therefore it is important to keep this in mind when choosing either a 1-Phase Commit (PC) or 2-PC capable database resource for the Scheduler to use and the appropriate transaction attribute on the EJBs.

Under most circumstances, 2-PC, XA-capable DataSources should be used for the Scheduler configuration. 1-PC and non-XA-capable DataSources can be used but only if one of the following is true:

  1. None of the EJBs that a Scheduled task uses are configured to use a container transactional context of Mandatory, Required, or Supports (this includes NotificationSink, UserCalendar and TaskHandler beans). See EJB transaction considerations.
  2. MessageTaskInfo task types are not used. These tasks will always attempt to join the global transaction.
  3. The application scheduling the tasks has the Accept Heuristic Hazard option enabled (also known as Last Participant support). See the WebSphere Application Server Enterprise Information Center for details on this option.

Deployment

Each J2EE application is typically mapped to one or more Scheduler instances through resource references. When tasks are created in a Scheduler instance, it is important that the lifecycle of the Scheduled tasks is taken into account when the application is started, stopped or uninstalled.

The Scheduler service is not application-aware. If an application is stopped or uninstalled, tasks for that application will continue to run. If an application is to be uninstalled, application developers and administrators must cancel all outstanding tasks or they will continue to run and errors will be displayed in the SystemOut.log:

ASYN0030E: Cannot find meta data for j2eename App.ear#Web.war#/welcome.jsp

And in the Activity Log:

SCHD0010E: An error occurred while firing a task.
SCHD0102I: javax.naming.NameNotFoundException: jms/MyQueue

Resource mapping
Although using a resource-ref or resource-env-ref in a J2EE application is optional, it is usually a good idea. When application developers use resource references (for example, java:comp/env/sched/FinanceScheduler), this allows application administrators to map the Scheduler reference to a specific Scheduler instance based on quality of service, performance and availability.

EJB transaction considerations
Each Scheduler task can make use of three different EJBs:

  • UserCalendar:
    Used to determine the date of a task's next scheduled fire time, and whether the start-by time has elapsed when the task fires.
  • NotificationSink:
    Used to receive various notification messages throughout the lifecycle of a task.
  • TaskHandler:
    The mechanism for executing work when a task fires for a BeanTaskInfo task type.

If any of these beans is configured with a container transaction attribute of Required, that EJB participates in the same global transaction that was used to update the database and to call the other Required EJBs. For example, if a UserCalendar bean is configured as Required and the TaskHandler bean throws an unexpected exception or the transaction is rolled back, then any work within the UserCalendar bean is rolled back as well.

Typically, NotificationSink and UserCalendar beans do not require participation in the global transaction. It may be more efficient to configure them with the NotSupported transaction attribute. TaskHandler beans typically always participate in the global transaction. See Transactions.


Administration

Since the Scheduler service does not have a user interface of any kind (other than resource configuration), it is generally the application developer's responsibility to build administration functions into the application. For example, if a customer service application uses the Scheduler to fire reminder messages to a customer service representative, it would be the application's responsibility to track the lifecycle of the task.

Task administration

The Scheduler API interface provides methods for retrieving tasks by task ID and by name:

  • TaskInfo getTask(String taskID)
  • TaskStatus getTaskStatus(String taskID)
  • Iterator findTasksByName(String name)
  • Iterator findTaskStatusByName(String name)

Since the Name field only contains room for 254 characters, it is difficult for applications to associate tasks with business logic by using the name field alone. It is more practical instead to store the application data using the task ID as a key.
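One way to picture this approach is a lookup structure keyed by task ID. The sketch below uses an in-memory map purely for illustration; the class is hypothetical, a real application would more likely persist this data in its own database table, and the code uses generics and java.util.Optional from current Java rather than the JDK level this product shipped with.

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical registry that associates application data with a task ID. */
final class TaskDataRegistry {
    private final Map<String, String> dataByTaskId = new ConcurrentHashMap<>();

    /** Call after creating a task, using the task ID the Scheduler returned. */
    void register(String taskId, String applicationData) {
        dataByTaskId.put(taskId, applicationData);
    }

    /** Looks up the business data for a task, if any was registered. */
    Optional<String> find(String taskId) {
        return Optional.ofNullable(dataByTaskId.get(taskId));
    }

    /** Call when the task is cancelled or has completed its last run. */
    void remove(String taskId) {
        dataByTaskId.remove(taskId);
    }

    public static void main(String[] args) {
        TaskDataRegistry registry = new TaskDataRegistry();
        registry.register("101", "Reminder for customer case 4711"); // sample values
        System.out.println(registry.find("101").orElse("(no data)"));
    }
}
```

Keeping the business data outside the task name avoids the 254-character limit and lets the application query by its own criteria.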

Service monitoring

Although it is generally the application developer's responsibility to handle task administration duties, there are still several ways to determine the health and progress of the Scheduler as a whole. Specifically, the WebSphere Application Server system logs (SystemOut.log, SystemErr.log and activity.log) each provide basic information on the state of the Scheduler service and each of the configured Scheduler instances. Each Scheduler instance also has an associated MBean that allows simple Scheduler operations that are not accessible through the API.

MBeans
The Scheduler service exposes a WASScheduler MBean that can be accessed through JMX using the WebSphere wsadmin tool or the AdminClient API. Each Scheduler can be located by its name, which is formatted as Scheduler_[JNDI_NAME], where [JNDI_NAME] is the global JNDI Name of the Scheduler with each slash replaced with a period. For example, if there is a Scheduler instance configured as sched/MainScheduler, the MBean name would be Scheduler_sched.MainScheduler.
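The naming rule can be captured in a one-line helper, which is convenient when building JMX queries for several Schedulers. This is a minimal sketch of the rule as stated above; the class and method names are hypothetical.

```java
/** Hypothetical helper that derives a Scheduler MBean name from a JNDI name. */
final class SchedulerMBeanNames {
    /** Prefixes "Scheduler_" and replaces each slash in the JNDI name with a period. */
    static String mbeanNameFor(String globalJndiName) {
        return "Scheduler_" + globalJndiName.replace('/', '.');
    }

    public static void main(String[] args) {
        // prints Scheduler_sched.MainScheduler
        System.out.println(mbeanNameFor("sched/MainScheduler"));
    }
}
```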

Each Scheduler MBean has the following operations and attributes:

Operation or attribute | Type or parameter | Description
pollInterval | java.lang.Long | Displays or alters the effective poll interval. It will take effect at the next poll cycle.
startDaemon | java.lang.Integer delay | Starts the poll daemon with a delay in milliseconds. All Scheduler instances will automatically start when the server is started.
stopDaemon | (none) | Stops the poll daemon from executing tasks until it is restarted.

By looking up the Scheduler instance's MBean, one can change the poll interval dynamically without restarting the scheduler, or stop/start the poll daemon to temporarily suspend the execution of any tasks. None of these MBean operations or attributes modifies the Scheduler resource directly. Therefore, to change the poll interval permanently, the resource will need to be changed using the admin console or the wsadmin $AdminConfig object.

Example: Changing the Poll Interval using wsadmin:

wsadmin> set s [$AdminControl queryNames WebSphere:*,type=WASScheduler,name=Scheduler_sched.MainScheduler]
wsadmin> $AdminControl invoke $s setPollInterval 10000

Monitoring tasks

The best way to monitor a task's lifecycle is to use a NotificationSink bean. The NotificationSink allows an application to capture the various stages of a task. It could be used to develop a history of a task's execution. The WebSphere Application Server log files and the Scheduler's TaskStatus and TaskInfo objects can also be used to retrieve information from a task. However, the NotificationSink is a much more effective and complete solution. See History logging for more details.

Log files

The Scheduler service utilizes the WebSphere Application Server system logs to report its state and any serious problems that are not application-specific. Beginning with WebSphere Application Server Enterprise Version 5.0.2, a normal SystemOut.log would have the following information in it to indicate that the Scheduler Service has started successfully:

SCHD0036I: The Scheduler Service is initializing.
SCHD0037I: The Scheduler Service has been initialized.
SCHD0031I: The Scheduler Service is starting.
SCHD0001I: The Scheduler Service has started.

Each configured Scheduler instance will also have the following messages:

SCHD0032I: The Scheduler Instance {JNDI Name} is starting.
SCHD0038I: The Scheduler Daemon for instance {JNDI Name} has started.
SCHD0033I: The Scheduler Instance {JNDI Name} has started.

During normal processing, the SystemOut.log will not have any further information unless a significant error has occurred. If tasks are not executing as they should, this log, as well as the SystemErr.log, activity.log and FFDC logs, should be consulted to determine the cause of the failure.
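
When many servers need to be checked, the startup check described above can be automated by scanning SystemOut.log for the relevant message IDs. The following is a minimal sketch that assumes the message text matches the samples shown above; the class and method names are hypothetical.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper that looks for the Scheduler startup messages in the
// lines of a SystemOut.log file.
class SchedulerStartupCheck {

    // True if the service-level "has started" message (SCHD0001I) is present.
    static boolean serviceStarted(List<String> logLines) {
        return logLines.stream().anyMatch(l -> l.contains("SCHD0001I"));
    }

    // True if an instance-level "has started" message (SCHD0033I) mentions
    // the given JNDI name. This is a plain substring match, so a name that is
    // a prefix of another instance's name would also match that instance.
    static boolean instanceStarted(List<String> logLines, String jndiName) {
        return logLines.stream()
                .anyMatch(l -> l.contains("SCHD0033I") && l.contains(jndiName));
    }

    public static void main(String[] args) {
        List<String> sample = Arrays.asList(
                "SCHD0001I: The Scheduler Service has started.",
                "SCHD0033I: The Scheduler Instance sched/MainScheduler has started.");
        System.out.println(serviceStarted(sample));                         // true
        System.out.println(instanceStarted(sample, "sched/MainScheduler")); // true
        System.out.println(instanceStarted(sample, "sched/Other"));         // false
    }
}
```

In practice the lines could be read with java.nio.file.Files.readAllLines from wherever the server's SystemOut.log is located.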


Security

The Scheduler addresses two security aspects:

  • Task security: Who does the task run as?
  • Administration security: Who can modify tasks?

Access to the Scheduler service configuration and runtime attributes through the admin console and MBeans is managed using the global role-based authorization settings in the admin console.

Task security

When a task is created, the Java Authentication and Authorization Service (JAAS) is used to retrieve the Subject (user ID) that is on the thread of execution. When the task is stored in the database, the Principal is extracted from the Subject and stored with the task. When a task fires, a new Subject is created with the saved Principal and is, in turn, used to run the task. To achieve this, Identity Assertion from the Common Secure Interoperability Version 2 protocol (CSIv2) must be enabled on WebSphere Application Server. (See the security section of the WebSphere Business Integration Server Foundation V5.1 Information Center for details on Identity Assertion.) This service context only applies to BeanTaskInfo tasks that do not have Run as Specified defined for the process method.

To change the Principal stored with a task, the task must be recreated with the correct Subject active on the thread of execution.

Although this information is stored with a task regardless of the task type, it is only used by the BeanTaskInfo task type. The MessageTaskInfo allows specifying a user ID and password, which are stored with the task information in the database BLOB.

Administration security

Tasks can be created by any application: if an application can look up a Scheduler instance, it can create tasks. Only the application that created a task can suspend, resume, cancel or purge it. All tasks are isolated to a single application. The find methods of the Scheduler will only return tasks scoped to the current application. All other methods will throw a java.lang.SecurityException if a task is accessed from a different application.


Performance

The Scheduler is highly optimized to load and run tasks with minimal database interaction and is capable of executing hundreds of tasks per second for each JVM. However, several configuration and environment issues can cause the Scheduler's performance to decrease. This section describes how to monitor and adjust the Scheduler and applications to achieve optimal performance.

Performance viewer

The Tivoli Performance Viewer is a tool supplied with WebSphere Application Server that allows administrators to monitor various performance-related statistics for several services. There are two instrumented services that determine Scheduler service performance:

  • Scheduler service
  • Alarm manager.

Scheduler service
Each Scheduler instance has its own performance statistics. The name of each statistic group is the same as the MBean name (see MBeans).

Counter | Description
Failed tasks | Tracks the number of tasks that fail due to application or resource configuration problems. This includes any exception thrown by the TaskHandler bean or problems sending messages with a MessageTaskInfo. When tasks fail, it is the application developer or administrator's responsibility to diagnose and resolve the problem. See Monitoring Tasks and Troubleshooting.
Executed tasks | Tracks the number of tasks that have run successfully.
Tasks per sec | Tracks the number of tasks that have run successfully each second.
Collisions per sec | Collisions occur when a task is updated in the database while it is loaded in memory. This counter shows the number of collisions that have occurred each second. This can happen in several instances:
  • Multiple Schedulers are actively processing tasks simultaneously and the lease manager is not enabled. For example, if two identical schedulers are active simultaneously, they will compete to run the same tasks. If the lease manager is not available (see Lease notes for details) each task will be loaded into memory, but only one of them will obtain the lock to the database record and update its row counter. The other task will detect the collision and abort the execution of the duplicate task.
  • An application has modified the task state. For example, the poll daemon has loaded a task into memory, and the application has suspended and resumed the task. The task that has loaded into memory will fail to run since the task has been updated, but it will be reloaded during the next poll cycle.
  Because the Scheduler guarantees that a task will only run one time successfully, if a task is modified while loaded in memory, it will automatically detect when it has been updated and will abort execution and log a collision in performance statistics.
Poll query time | Tracks the time it takes to run the query that the poll daemon uses to load the tasks during each poll cycle. If this value is excessively large, the database or connection to the database may need to be tuned, or the Scheduler will need to be partitioned.
Task execution time | Tracks the time it takes to run the task itself. For EJBs, this value is the amount of time that it takes to run the TaskHandler.process() method.
Tasks executed per poll | Tracks the number of tasks that have run (successfully or not) per poll cycle.
Tasks expiring per poll | Tracks the number of tasks that each poll daemon thread has loaded into memory. If this value is greater than the Tasks Executed per Poll counter, then the Scheduler may have exceeded its threshold for executing tasks concurrently. Also see Alarms Pending.
Task latency | Tracks the number of milliseconds that tasks are delayed. Tasks can be delayed because of a poll interval that is too large or resource constraints on the application server.
Poll time | Tracks the actual time between poll cycles. If this value is greater than the configured poll interval (see poll daemons), the number of tasks executing per second is too high or there may be a performance issue with the database.
Number of polls | Tracks the number of poll cycles that have occurred since performance monitoring was enabled.
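
The rules of thumb in the counter descriptions above can be turned into a simple triage routine. The sketch below is illustrative only: the class, its fields and the way the values are sampled are assumptions, and the comparisons encode nothing more than the guidance stated for each counter.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical triage helper for Scheduler performance counters.
class SchedulerCounterTriage {

    // Hypothetical snapshot of values read from the performance viewer.
    static final class Snapshot {
        long configuredPollIntervalMs; // poll interval set on the Scheduler resource
        long pollTimeMs;               // "Poll time" counter
        long tasksExpiringPerPoll;     // "Tasks expiring per poll" counter
        long tasksExecutedPerPoll;     // "Tasks executed per poll" counter
        long failedTasks;              // "Failed tasks" counter
    }

    // Returns one finding for each rule of thumb that the snapshot violates.
    static List<String> findings(Snapshot s) {
        List<String> out = new ArrayList<>();
        if (s.pollTimeMs > s.configuredPollIntervalMs) {
            out.add("Poll time exceeds the configured poll interval: "
                    + "task rate may be too high or the database may be slow");
        }
        if (s.tasksExpiringPerPoll > s.tasksExecutedPerPoll) {
            out.add("More tasks expiring than executing per poll: "
                    + "the concurrent execution threshold may be exceeded");
        }
        if (s.failedTasks > 0) {
            out.add("Failed tasks present: "
                    + "check application and resource configuration");
        }
        return out;
    }
}
```

An empty result means only that none of these three checks fired; it is not a guarantee that the Scheduler is healthy.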

Alarm manager
The alarm manager is a part of the WorkManager. The number of alarm threads determines the number of tasks that can run concurrently (see Work Manager). If a WorkManager is shared among more than one Scheduler or other applications, the alarm manager data counters can show if this is causing problems.

Counter | Description
Alarms pending | Tracks the number of tasks that are loaded in memory from the poll daemon that are ready to fire. If this value is excessive, the Scheduler is saturated and tasks will begin to run late. Increasing the number of alarms, decreasing the poll interval, partitioning the work or allocating more memory and CPU may be required.

Troubleshooting

When diagnosing problems with the Scheduler service, administrators will typically refer to the WebSphere Application Server log files. It is desirable in most cases to develop a history logging mechanism that is application-specific using the NotificationSink notification beans supplied with the Scheduler.

Logs

The Scheduler uses the standard WebSphere logging mechanisms to report status and log errors. See Log files for examples of messages that can be seen in the log files.

Log | Description
SystemOut | SystemOut.log contains the majority of the errors and messages, categorized into three severities:
  • Error:
    Errors include configuration and database problems as well as unattended, unexpected scheduler errors.
  • Warning:
    Warnings include non-Scheduler-specific errors and configuration problems that are application specific. Examples include: application not started, EJB not found, unchecked EJB exceptions, missing JMS Queue and security errors.
  • Informational:
    Informational messages include Scheduler status messages.
SystemErr | Exceptions that are associated with errors displayed in SystemOut.log.
Activity log | The activity log should be used to further diagnose problems that are not apparent from the SystemOut.log and SystemErr.log logs.
FFDC | The First Failure Data Capture logs should be used to further diagnose problems that are not apparent from the SystemOut.log, SystemErr.log and activity logs. FFDC errors are typically exceptions that may indicate a problem.
Trace | There are three trace specification strings that can be used to diagnose problems in the Scheduler. The resulting trace file is stored in a file named trace.log (see the WebSphere Information Center for more details on enabling trace and locating the trace log):
  • Scheduler=all=enabled
    Shows the majority of all Scheduler information.
  • ExtHelper=all=enabled
    Shows transaction boundaries and database interaction.
  • LeaseManager=all=enabled
    Shows the lease manager's activity.

History logging

The Scheduler does not maintain a history of events. The NotificationSink and TaskNotificationInfo interfaces, however, enable you to create custom EJBs that can be used to log the following events to a database:

  • CANCELLED
  • COMPLETE
  • FIRED
  • FIRE_DELAYED
  • FIRE_FAILED
  • FIRING
  • PURGED
  • RESUMED
  • SCHEDULED
  • SUSPENDED

Each of the events is documented in the JavaDoc for the com.ibm.websphere.scheduler.TaskNotificationInfo interface.

For example, to create an EJB that will perform history logging, the following steps would need to be performed by the application developer:

  1. Create an EJB that implements the com.ibm.websphere.scheduler.NotificationSink remote interface and the com.ibm.websphere.scheduler.NotificationSinkHome home interface. The EJB implementation class simply needs to implement a method named: void handleEvent(TaskNotificationInfo task)
  2. The EJB should use a container transaction type of Requires New. Otherwise, if the task fails, the NotificationSink bean's work is rolled back together with the task's transaction, including the database write that records the event.
  3. Create a CMP Entity Bean that will log the events for the task in a database with the application data. The database would simply record the state of the tasks as they fire. The application can also use the CMP bean to locate the last state for the task, when each task fired, failed, etc.
  4. The NotificationSink bean implementation would simply log the following to the database from the TaskNotificationInfo object which is passed to the handleEvent method:
    • Task ID
    • Event Type (FIRED, COMPLETE, etc.)
    • Event Time.
  5. When creating a new task, the application developer does the following:
    1. Look up the history logging NotificationSink bean.
    2. Use the setNotificationSink() method on the com.ibm.websphere.scheduler.TaskInfo interface to tell the Scheduler to associate the history logging bean to the task.
    3. Create the task with the create() method on the Scheduler interface.
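
The record-keeping described in steps 3 and 4 can be sketched independently of the EJB plumbing. In the sketch below, TaskHistoryLog and its methods are hypothetical, and an in-memory list stands in for the database table that the entity bean would manage; in a real NotificationSink bean, the handleEvent method would pass in values taken from the TaskNotificationInfo object it receives.

```java
import java.util.ArrayList;
import java.util.Date;
import java.util.List;

// Hypothetical history store for task events.
class TaskHistoryLog {

    // One logged event: the three fields listed in step 4.
    static final class Entry {
        final String taskId;
        final String eventType; // for example "FIRED" or "COMPLETE"
        final Date eventTime;

        Entry(String taskId, String eventType, Date eventTime) {
            this.taskId = taskId;
            this.eventType = eventType;
            this.eventTime = eventTime;
        }
    }

    private final List<Entry> entries = new ArrayList<>();

    // Appends one event to the history.
    synchronized void record(String taskId, String eventType, Date eventTime) {
        entries.add(new Entry(taskId, eventType, eventTime));
    }

    // Returns the most recently recorded event type for a task (by insertion
    // order), or null if nothing has been recorded for it.
    synchronized String lastState(String taskId) {
        for (int i = entries.size() - 1; i >= 0; i--) {
            if (entries.get(i).taskId.equals(taskId)) {
                return entries.get(i).eventType;
            }
        }
        return null;
    }

    // Counts how often a given event type was recorded for a task.
    synchronized int count(String taskId, String eventType) {
        int n = 0;
        for (Entry e : entries) {
            if (e.taskId.equals(taskId) && e.eventType.equals(eventType)) {
                n++;
            }
        }
        return n;
    }
}
```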

At this point, each time the Scheduler fires an event notification for the task, the history logging bean will be called with the state. If only a subset of events is desired, the optional eventMask parameter on the setNotificationSink() method can be set. For example, if only the FIRED, FIRE_FAILED and COMPLETE events should be logged, the eventMask can be set to: (FIRED | FIRE_FAILED | COMPLETE). The event constants are combined with a bitwise OR so that each selected event is included in the mask. When the event mask is set, the NotificationSink bean will only be called when the specified events fire.
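
A mask that selects several events is built by OR-ing flags together. The sketch below assumes, as the term mask suggests, that each event constant occupies its own bit; the numeric values are stand-ins for illustration, and real code would use the constants defined on com.ibm.websphere.scheduler.TaskNotificationInfo.

```java
// Illustrates how an event mask selects a subset of events.
class EventMaskExample {

    // Stand-in values for illustration only (assumption: one bit per event).
    static final int COMPLETE = 1 << 0;
    static final int FIRE_FAILED = 1 << 1;
    static final int FIRED = 1 << 2;
    static final int SUSPENDED = 1 << 3;

    // True if the mask selects the given event.
    static boolean selects(int mask, int event) {
        return (mask & event) != 0;
    }

    public static void main(String[] args) {
        int mask = FIRED | FIRE_FAILED | COMPLETE;

        System.out.println(selects(mask, FIRED));           // true
        System.out.println(selects(mask, SUSPENDED));       // false

        // AND-ing distinct single-bit flags leaves no bits set,
        // so such a mask would select no events at all.
        System.out.println(FIRED & FIRE_FAILED & COMPLETE); // 0
    }
}
```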


Conclusion

The Scheduler service can be used to effectively extend the functionality of J2EE applications by providing a reliable, high-performance, persistent and transactional timer service. Because the Scheduler is available as an integrated WebSphere service, administrators can centralize administration and reduce the overhead of running separate processes. When the Scheduler API is configured in a clustered environment and is fully utilized, the Scheduler service provides a highly flexible, full-function job scheduler that can be used by any J2EE application.
