Tivoli Workload Scheduler LoadLeveler

Known Issues

 

Date Added: October 30, 2013

Coexistence issues with LoadLeveler

Warning:

In a mixed cluster that contains nodes at LoadLeveler service level 5.1.0.15 together with nodes at service level 5.1.0.16 or 5.1.0.17, jobs that have actually completed may remain in R (running) state and not release their resources.

A coexistence issue was introduced in LoadLeveler 5.1.0.16.

Users Affected:

Installations running mixed levels of LoadLeveler 5.1.0.15 with LoadLeveler 5.1.0.16 or 5.1.0.17.

Workaround:

The coexistence problem introduced in TWS LoadLeveler 5.1.0.16 cannot be corrected.

The entire LoadLeveler 5.1.0.15 cluster will need to be migrated to LoadLeveler service level 5.1.0.16 or 5.1.0.17 at the same time.

 

 

Date Added: September 18, 2013

Coexistence issues with LoadLeveler

Warning:

Jobs will not be able to run in a mixed cluster that contains nodes at LoadLeveler service level 5.1.0.14 together with nodes at any higher LoadLeveler service level.

A coexistence issue was introduced in LoadLeveler 5.1.0.15.

Users Affected:

Installations running mixed levels of LoadLeveler 5.1.0.14 with higher service levels of LoadLeveler.

Workaround:

The coexistence problem introduced in TWS LoadLeveler 5.1.0.15 cannot be corrected.

The entire LoadLeveler 5.1.0.14 cluster will need to be migrated to a higher service level at the same time.

 

 

Date Added: March 27, 2013

Update for LoadLeveler execute directory

Effective with LL 5.1.0.6 for AIX and LL 5.1.0.7 for Linux:

LoadLeveler now permits its execute directory to be located in volatile storage, such as a RAM disk, whose contents may not survive a reboot of the execute node.

To avoid the potential performance impact of a shared file system, configure the execute directory on either local disk or RAM disk.
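For example, the execute directory is set with the EXECUTE keyword in the LoadLeveler configuration; the paths below are illustrative assumptions, not defaults:

# Execute directory on local disk (illustrative path)
EXECUTE = /var/loadl/execute

# Or on a RAM disk mount (illustrative path)
EXECUTE = /dev/shm/loadl/execute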

 

 

Date Added: December 18, 2012

LoadLeveler 5.1.0.13 will not support PE Runtime Environment 1.1

Warning:

Do not install the LL 5.1.0.13 service update if you have PE Runtime Environment 1.1 installed.

Users Affected:

Users with PE Runtime Environment 1.1.

Fix:

Install APAR IV33552 (available in LoadLeveler 5.1.0.14).

 

Date Added: August 1, 2012

LoadLeveler 5.1.0.10 may cause DB2 database to hang

Warning:

Do not install the LL 5.1.0.10 service update if you are using or planning to use a DB2 database for the LoadLeveler configuration on RHEL 6 on POWER systems.

Users Affected:

Any POWER system running RHEL 6 that has a LoadLeveler DB2 database configuration.

Fix:

Install LL 5.1.0.11 when available. If you have already installed LL 5.1.0.10, contact IBM Service for an emergency fix package for IV25574.

 

Date Added: June 15, 2012

LoadLeveler 5.1.0.7 configuration service upgrade

Warning:

Do not install LL 5.1.0.7 service update if you are using or planning to use a database for the LoadLeveler configuration.

Users Affected:

Any system that has or will have a LoadLeveler database configuration.

Fix:

If you are running LL 5.1.0.7 with a database configuration, the database will need to be reinitialized when updating to the next service level.

 

Date Added: November 4, 2011

LoadLeveler startd daemon might abort during startup

Warning:

The LoadLeveler startd daemon might abort during startup if there are files for terminated jobs left in the execute directory.

Users Affected:

Any system installed with IV08342 running the startd daemon.

Fix:

Apply the APAR IV10161 emergency fix package from IBM service.

 

Date Added: October 18, 2011

LoadLeveler 3.5.1.12 schedd daemon might core dump

Warning:

The LoadLeveler schedd daemon might core dump on systems at the LoadLeveler 3.5.1.12 service level. LoadLeveler may not restart on this node after the schedd crashes.

Users Affected:

Any system installed with LoadLeveler 3.5.1.12 service level running the schedd daemon. Systems with checkpoint enabled may have a higher occurrence rate.

Fix:

Install LoadLeveler 3.5.1.13 or later service level.

If this is not possible, apply the APAR IV03346 emergency fix package from IBM service.

 

Date Updated: February 22, 2010

Update: TWS LoadLeveler 4.1.0.3 added to this entry.

Date Added: February 17, 2010

TWS LoadLeveler 3.5.1.4, 4.1.0.2 and 4.1.0.3 - Jobs will not be started in a login shell

Warning:

In TWS LoadLeveler 3.5.1.4, 4.1.0.2 and 4.1.0.3, jobs will not be started in a login shell.

The environment in which the job runs may not be set as expected and the job may fail to run.

Users Affected:

All TWS LoadLeveler 3.5.1.4, 4.1.0.2 and 4.1.0.3 installations that need the proper environment set by a login shell for their jobs to run correctly.

Workaround:

Set the environment keyword in the job command file to COPY_ALL

e.g.

#@ environment = COPY_ALL
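A minimal job command file using this workaround might look like the following; the executable and file names are illustrative assumptions:

#@ environment = COPY_ALL
#@ executable = /bin/hostname
#@ output = myjob.out
#@ error = myjob.err
#@ queue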

EFIX:

For TWS LoadLeveler 3.5.1.4 - Apply the APAR IZ70280 emergency fix package available from IBM service.

For TWS LoadLeveler 4.1.0.2 and 4.1.0.3 - Apply the APAR IZ70442 emergency fix package available from IBM service.

 

Date Added: November 20, 2009

TWS LoadLeveler 3.5.1.3 Startd daemon core dumps when starting up in drain mode

Warning:

The LoadL_startd daemon will core dump if started via drain mode under TWS LoadLeveler version 3.5.1.3.

Users Affected:

All TWS LoadLeveler 3.5.1.3 installations that use the command option "llctl start drained" to start LoadLeveler.

Workaround:

Start TWS LoadLeveler in normal mode (do not specify the drained option).
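For example, start LoadLeveler with:

llctl start

rather than:

llctl start drained

(Add the -g option if you normally start all machines in the cluster with a single command.)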

EFIX:

Apply the APAR IZ64435 emergency fix package available from IBM service.

 

 

Date Added: May 18, 2009

Coexistence issues with TWS LoadLeveler

Warning:

Jobs will not be able to run in a mixed cluster that contains TWS LoadLeveler 3.5.0.1 - 3.5.0.4 service levels together with either TWS LoadLeveler 3.5.0.5 or TWS LoadLeveler 3.5.1.1.

A coexistence issue was introduced in TWS LoadLeveler 3.5.0.5 which also affected TWS LoadLeveler 3.5.1.1.

Users Affected:

Installations running mixed levels of TWS LoadLeveler 3.5.0.1 - 3.5.0.4 with either TWS LoadLeveler 3.5.0.5 or TWS LoadLeveler 3.5.1.1.

Workaround:

The coexistence problem introduced in TWS LoadLeveler 3.5.0.5 cannot be corrected.

The entire cluster will need to be migrated to either TWS LoadLeveler 3.5.0.5 or TWS LoadLeveler 3.5.1.1 at the same time.

There is no coexistence issue between TWS LoadLeveler 3.5.0.5 and TWS LoadLeveler 3.5.1.1.

 

 

Date Added: March 16, 2009

TWS LoadLeveler Service Update 3.5.0.4 for LINUX is available

Warning:

On Linux platforms with multiple CPUs, it is possible for the seteuid function to malfunction.

When the LoadLeveler startd daemon encounters this failure, its effective user ID may be set incorrectly, in which case jobs can become stuck in ST state.

Users Affected:

All multiprocessor (or multicore) systems on Linux.

Workaround:

To clear jobs that are stuck in ST state, recycle the node on which the job is pending, using the command "llctl recycle".

Include the keyword "#@ restart = yes" in the job command file so that pending jobs that are terminated as a result of the LoadLeveler recycle will be restarted.
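For example, where node01 is an illustrative host name for the node on which the job is pending:

llctl -h node01 recycle

and in the job command file:

#@ restart = yes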

EFIX:

APAR IZ46123, which works around the glibc issue, is available from IBM service.

 

 

Date Added: January 26, 2009

TWS LoadLeveler 3.5.0.2 causes migration and coexistence failures

An error was introduced in TWS LoadLeveler 3.5.0.2 where job objects used by TWS LoadLeveler 3.5.0.2 are incompatible with job objects used by all prior LoadLeveler maintenance levels or releases. LoadLeveler job objects are stored in the job spool and are transmitted among LoadLeveler processes. The incompatibility causes migration and coexistence failures such as the inability to read job objects, produced by earlier TWS LoadLeveler maintenance levels or releases, from the job spool.

Users Affected:

Systems with TWS LoadLeveler 3.5.0.2 installed

FIX:

Install TWS LoadLeveler 3.5.0.3.

TWS LoadLeveler 3.5.0.3 restores compatibility with maintenance levels and releases prior to LoadLeveler 3.5.0.2. LoadLeveler 3.5.0.2 will remain incompatible with prior maintenance levels and releases, and will also be incompatible with subsequent maintenance levels and releases.

Systems with TWS LoadLeveler 3.5.0.2 already installed must ensure that the job queue is empty before moving to any other LoadLeveler maintenance level or release; otherwise, the jobs in the job queue will be removed after the upgrade.
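To confirm that the job queue is empty before upgrading, list the queued jobs; an empty queue shows no job steps:

llq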

 

 

Date Added: November 21, 2008

Multiple problems in the negotiator in TWS LoadLeveler 3.4.3.5

The negotiator frequently core dumps with signal 6 when user_priority is used in the job command file.

llq shows an incorrect job state after the central manager restarts.

Interactive jobs fail to run.

Users Affected:

Systems with TWS LoadLeveler 3.4.3.5 installed

EFIX:

APAR: IZ37213

DESCRIPTION:

The negotiator abort was due to the way the user_priority job command file keyword is implemented internally to honor the user's assignment of job priority. Some cases were found where inconsistencies in internal data structures could occur.
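For reference, the keyword is set in the job command file as shown below; the value is an illustrative assumption:

#@ user_priority = 50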

APAR: IZ38238

DESCRIPTION:

After a central manager restarts, running jobs are displayed as IDLE even though they are actually running.

APAR: IZ38253

DESCRIPTION:

The central manager skips over the newly arrived interactive step, so it will not run.

Efixes are available from IBM service for AIX and Linux platforms.

 

Date Added: October 29, 2007

TWS LL - TWS LoadLeveler Service Update 3.4.2.1 for AIX 5L and Linux is available.

Notes:

- TWS LoadLeveler 3.4.2.1 is a mandatory service update to be installed with TWS LoadLeveler 3.4.2.0.

- The TWS LoadLeveler scheduling affinity support has been enhanced to utilize the performance benefits of the SMT processor core topology available on SMT-capable IBM POWER5 or POWER6 processor-based systems. Jobs can request TWS LoadLeveler to schedule and attach CPUs for their tasks to processor cores in addition to MCMs. Tasks of jobs requesting MCM task affinity share the processors in an MCM with other tasks in the same job. In some instances, this may cause two tasks to run on the same core (in separate SMT threads). If the application intends to have each task running on a distinct core, then the "task_affinity=core" keyword should be added to the job's JCF (see the example after these notes).

- Additional information relating to this update can be found at http://www14.software.ibm.com/webapp/set2/sas/f/loadleveler
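For example, to request that each task run on a distinct processor core, add the keyword to the job command file; the exact value syntax (core versus core(n)) may vary by service level:

#@ task_affinity = core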

 

Date Added: August 20, 2007

TWS LL - An error was introduced in TWS LoadLeveler 3.4.1.2 which causes TWS LoadLeveler to lose reservation and fair share information following a recycle of TWS LoadLeveler nodes running the schedd daemon.

Users Affected:

All installations that use the reservation or fair share features of TWS LoadLeveler.

Issue:

The error causes reservation and fair share data to be unretrievable from the corresponding spool file when the TWS LoadLeveler schedd daemon is re-started.

Solution:

Apply the APAR IZ03334 efix.

Efix RPMs for all Linux platforms supported by TWS LoadLeveler, and an emergency fix package for AIX, may be obtained by calling IBM service.

 

 

Date Added: June 20, 2007

TWS LL - An error was introduced in all 32-bit Linux ports of TWS LoadLeveler 3.4.1.1 which can cause TWS LoadLeveler jobs to fail.

Users Affected:

All installations that use 32-bit Linux ports of TWS LoadLeveler 3.4.1.1.

Issue:

The error causes user process limits to be set to uninitialized values, which can cause application failures.

When the uninitialized value is very small, the user application may terminate abnormally because it exceeds that limit.

Solution:

Apply the APAR IZ00385 efix RPMs, which are available from IBM service for the following platforms:

x86_redhat_4.0.0

x86_redhat_3.0.0

x86_sles_10.0.0

x86_sles_9.0.0

 

Date Added: April 04, 2007

The "smt" keyword is defaulted to "no" under LoadLeveler 3.4.0.0 and 3.4.0.1.

The "smt" keyword is defaulted to "no" under LoadLeveler 3.4.0.0 and 3.4.0.1.

This may cause jobs to have degraded performance if SMT is enabled on machines in

the LoadLeveler cluster.
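If SMT is enabled on the machines and jobs should use it, the keyword can be set explicitly in the job command file; this is a hedged sketch, so verify the smt keyword and its accepted values for your service level:

#@ smt = yes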

For more information go to:

LL 3.4.0.1 defaults to "SMT=no"