
Readme and release notes for Tivoli Workload Scheduler LoadLeveler (LL) 4.1.1.6: LL_resmgr-4.1.1.6-power-AIX

Fix Readme


Abstract

Readme and release notes for the Tivoli Workload Scheduler LoadLeveler 4.1.1.6 resource manager fix pack (LL_resmgr-4.1.1.6-power-AIX) for AIX.

Content

Readme file for: LL_resmgr-4.1.1.6-power-AIX
Product/Component Release: 4.1.1.6
Update Name: LL_resmgr-4.1.1.6-power-AIX
Fix ID: LL_resmgr-4.1.1.6-power-AIX
Publication Date: 1 September 2011
Last modified date: 1 September 2011

Installation information

Download location

Below is a list of components, platforms, and file names that apply to this Readme file.

Fix Download for AIX

Product/Component Name                       Platform          Fix
Tivoli Workload Scheduler LoadLeveler (LL)   AIX 5.3, AIX 6.1  LL_resmgr-4.1.1.6-power-AIX

Prerequisites and co-requisites

None

Known issues

Known limitations

    For LL 4.1.1:

    • On any machine where you plan to install the scheduler rpm, you must also install the resource manager rpm, and the resource manager rpm must be installed before the scheduler rpm.
    • If the scheduler and resource manager filesets are installed on the same machine, they must be at the same LoadLeveler level for transactions to be processed correctly.

Installation information

  Installation procedure

    Install the LoadLeveler updates on your system using the standard smit update_all procedure.

    For further information, consult the LoadLeveler Library for the appropriate version of the LoadLeveler AIX Installation Guide.
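
    As a sketch, an update installation of the downloaded fileset images from the command line might look like the following. The directory path is an assumption for this example; smit update_all performs the equivalent steps interactively.

    ```shell
    # Preview the update from the directory holding the downloaded .bff images
    # (the path /tmp/ll416 is illustrative)
    installp -apgXd /tmp/ll416 all

    # Apply the updates once the preview looks correct
    installp -agXd /tmp/ll416 all

    # Verify the installed level of the resource manager fileset
    lslpp -l LoadL.resmgr.full
    ```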

Additional information

  Package contents

    Fileset                      Level
    LoadL.resmgr.full.bff        4.1.1.6
    LoadL.resmgr.msg.en_US.bff   4.1.1.3

  Changelog

    Notes

    Unless specifically noted otherwise, this history of problems fixed for LoadLeveler 4.1.1.x applies to:

    • LoadLeveler 4.1.1.x for AIX 6
    • LoadLeveler 4.1.1.x for AIX 5
    • LoadLeveler 4.1.1.x for SUSE LINUX Enterprise Server 11 (SLES11) on POWER servers
    • LoadLeveler 4.1.1.x for SUSE LINUX Enterprise Server 11 (SLES11) on servers with 64-bit Opteron or EM64T processors
    • LoadLeveler 4.1.1.x for Red Hat Enterprise Linux 6 (RHEL6) on POWER servers
    • LoadLeveler 4.1.1.x for Red Hat Enterprise Linux 6 (RHEL6) on servers with 64-bit Opteron or EM64T processors
    • LoadLeveler 4.1.1.x for Red Hat Enterprise Linux 5 (RHEL5) on servers with 64-bit Opteron or EM64T processors
    • LoadLeveler 4.1.1.x for Red Hat Enterprise Linux 5 (RHEL5) on Intel based servers
    Restriction section

    For LL 4.1.1:

    • If the scheduler and resource manager components on the same machine are not at the same level, the daemons will not start up.

    General LoadLeveler problems

    Problems fixed in LoadLeveler 4.1.1.6 [September 1, 2011]

    • LoadLeveler can now handle jobs from users who belong to more than 64 system groups.
    • LoadLeveler now supports ETHoIB using a bond0 interface mapped to an IB user space device on Linux systems, provided that APAR IV06393 for the rsct.lapi.rte fileset is also applied.

    Problems fixed in LoadLeveler 4.1.1.5 [July 28, 2011]

    • Multiple configuration editor and form-based GUI issues are resolved.
    • LoadLeveler will not submit a job if no class in the default class list can satisfy the job requirements.
    • LoadLeveler now creates cpuset files with permissions that are searchable by non-root users under the /dev/cpuset directory.
    • The unthread_open() error is no longer printed in the Schedd log when querying a remote cluster job, because LoadLeveler no longer tries to route a nonexistent remote job command file in a multicluster environment.
    • LoadLeveler has been enhanced so it now displays the job eligibility time.
    • Intel MPI and Open MPI are now supported under LoadLeveler.
    • Resource Manager only:
      • The llctl command is now able to support "start drained" option on the remote node.
    • Scheduler only:
      • The LoadLeveler LoadL_negotiator daemon will no longer core dump when processing a multi-step job that contains a dependency statement longer than 2048 characters.
      • LoadLeveler "llq -s" command will provide information about why a step is in Deferred state.
      • The llsummary command and API will no longer core dump if the number of history files is greater than or equal to the PTHREAD_DATAKEYS_MAX constant value.
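
    As an illustrative sketch of the two command-level changes above (the host name and job step identifier are hypothetical):

    ```shell
    # Start LoadLeveler on a remote node with the startd in drained state
    llctl -h node01 start drained

    # Ask why a step is in Deferred state; the output now includes the reason
    llq -s node01.example.com.14.0
    ```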

    Problems fixed in LoadLeveler 4.1.1.4 [May 27, 2011]

    • The llctl command will now check to make sure the Schedd daemon's port is available to be used before starting up LoadLeveler.
    • A new keyword, restart, is implemented for the class stanza in the admin configuration.
    • If a value is not set for the keyword max_starters in database configuration mode, the default value used for max_starters will be adjusted when the count of classes specified in the keyword class is changed.
    • Absolute paths containing http/https are changed to relative paths for the configuration editor to run.
    • Resource Manager only:
      • LoadLeveler will now set the correct environment variables when executing the user epilog script.
      • The llmkres command can now create reservations consistently without hitting timing error message 2512-856.
      • In a multicluster environment, the llq -s command will now invoke the correct query command on the remote cluster.
    • Scheduler only:
      • Modifications to a recurring reservation's attributes are now reflected in the first occurrence's attribute values in llqres -l output.
      • A unique security issue has been identified for TWS LoadLeveler Web User Interface that could potentially compromise your system. It is recommended that you apply this update to protect your system.
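
    A minimal sketch of the new restart keyword in a class stanza of the LoadL_admin file (the class name and value shown are assumptions; consult the Administration Guide for valid values and semantics):

    ```
    # LoadL_admin fragment (illustrative)
    batch: type = class
        restart = no
    ```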

    Problems fixed in LoadLeveler 4.1.1.3 [March 25, 2011]

    • The ability to set the name_server in LoadLeveler is now disabled. The setting under LoadLeveler will now always be set to DNS.
    • Previously, adding or updating class limits with the configuration editor failed when more than one class limit was configured. The configuration editor can now be used to update class limits or add new hard and soft limits.
    • If the class-user sub-stanzas in the "default" class stanza are not defined in alphabetical order, the class-user sub-stanzas might incorrectly inherit the wrong values from the default class. LoadLeveler will now inherit the default values for the class-user sub-stanzas from the "default" class stanza correctly.
    • On Linux on POWER nodes, for jobs requesting memory affinity with MCM_MEM_NONE, the job would always consume memory from the local MCM and would start paging once memory on the local MCM was overconsumed, even though memory was available on other MCMs on the node. Now, if a job is submitted with the memory affinity option MCM_MEM_NONE, the task is bound to all the MCMs on the node and memory is consumed from all of them.
    • An incorrect spelling prevented the class stanza keyword striping_with_minimum_networks from being set when DB configuration was used. The spelling of the column name in the database is now corrected.
    • LoadLeveler schedd may ignore jobs if the job queue contains invalid job keys. Now, LoadLeveler schedd will collect the correct job data when scanning the job queue files.
    • The llrstatus -a reports "No adapters are available" after issuing the llrctl reconfig command. When a machine running a Resource Manager or Region Manager daemon is reconfigured, information about adapters on other machines was being wiped out. The configuration processing code in Resource Manager and Region Manager has been fixed so that existing adapter information will remain intact.
    • When configuring the resources=keyword(all) in the machine group stanza in database mode, the llstatus -R command will show no resources being set. Resources will now become effective when setting the resources=keyword(all) in the machine group stanza in database mode.
    • The schedd can core dump when a scale-across multicluster environment is configured incorrectly, which can happen when the same cluster stanza is specified as local for more than one cluster. LoadLeveler now protects the schedd from core dumping when the same cluster stanza is configured as local for more than one cluster in a scale-across multicluster environment.
    • Resource Manager only:
      • The LoadL_startd daemon leaks memory due to a failure to release memory allocated for data structures to hold switch table data for a job step. The LoadL_startd daemon is corrected to release all memory allocated for data structures to hold switch table data for a job step, when the job step data structure is de-allocated.
      • A crash may occur in either the resource manager daemon or the negotiator daemon if those daemons received incorrect routing data during an update from startd. This could have happened when the feature keyword was used in the machine_group stanza under the administration file or database setup. The correct bits are now set by the startd daemon so that routing of the data will not cause the resource manager or the central manager to core dump.
    • Scheduler only:
      • LoadLeveler will occasionally show the wrong number of class resource slots or even miss some classes from the llclass output if too many class query requests come in simultaneously. LoadLeveler is now fixed to show the correct class resources in the llclass output.
      • When maxidle is used for a given user within a class, dependent steps can be queued at a higher priority than non-dependent steps. Dependent steps are not given a new qdate when they are put onto the idle queue, while steps at the maxidle limit for a given user within a class are given a new qdate and a new sysprio based on that qdate. A change was made so that dependent steps are also counted as "queued" steps for the purposes of enforcing maxqueued and maxidle limits, and so a dependent step which is at the maxidle limit will get a new qdate.

    Problems fixed in LoadLeveler 4.1.1.2 [January 28, 2011]

    • The LoadLeveler startd drain status would be lost if the negotiator daemon restarted. The drain status is now stored on each individual startd; when the negotiator daemon restarts, the drain information is restored from all the startds.
    • The llsummary command might crash if the default class requirement value doesn't match the job requirement value. Fixed the llsummary command to select the correct requirement value from the default class list if there is no job class specified in job command file.
    • The llsummary command will fail when it tries to access invalid data memory in the job history file. Fixed the llsummary command to be able to ignore the bad data areas and just report the valid data in the job history file.
    • LoadLeveler schedd may ignore jobs if the job queue contains invalid job keys. The schedd will now collect the correct job data when scanning the job queue files.
    • The LoadLeveler command llmodify had a limitation where the startdate and wall_clock_limit job attributes could not be modified for idle jobs. llmodify is now enhanced to modify the startdate and wall_clock_limit job attributes for idle jobs.
        New documentation:
      • In the LoadLeveler Command and API Reference, SC23-6701-00, under Chapter 1. Commands, llmodify - Change attributes of a submitted job step,
        • New keyword wall_clock_limit for the -k option: Changes the wall clock limit of a job step. The value of the specified wall clock limit must be longer than the value of the current wall clock limit. This is a LoadLeveler administrator only option.
        • New keyword startdate for the -k option: Changes the start time of an idle-like job step. This is a LoadLeveler administrator only option.
    • Resource Manager only:
      • User jobs would not be launched on AIX if the group name did not match the one from the job submission. The job launch program is fixed so that it does not need to verify the group name; jobs are executed using the submitting GID number.
      • A crash may occur in either the LoadL_resource_mgr daemon or the LoadL_negotiator daemon when the feature keyword is used in the machine_group stanza under the administration file or database setup. A fix has been made in supporting the specifying of the feature keyword in the machine_group stanza in the administration file or in the database.
    • Scheduler only:
      • LoadLeveler machines and jobs could have the wrong state if some startds were down and the region manager was enabled. LoadLeveler now handles machine and job status correctly when the region manager detects that a machine is down.
      • LoadLeveler was trying to load the network table for jobs with job_type=MPICH, and the job would fail to run if the network table could not be loaded. Because jobs with job_type=MPICH do not require the network table, LoadLeveler no longer loads it when this job type is specified in the job command file.
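
    The documented llmodify additions might be used as follows. The job step identifier, time value, and date format are illustrative assumptions; both keywords are administrator-only options.

    ```shell
    # Raise an idle step's wall clock limit
    # (the new value must exceed the current limit)
    llmodify -k wall_clock_limit=02:00:00 node01.example.com.27.0

    # Change the start time of an idle job step
    llmodify -k startdate="09/15/2011 18:00" node01.example.com.27.0
    ```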

    Problems fixed in LoadLeveler 4.1.1.1 [December 10, 2010]

    • Fixed the schedd daemon so it will not crash if the job's output file path contains the "%" character.
    • Fixed the -s and -e options in the llsummary command to report all the jobs that match the filter requirement. In the TWS LoadLeveler documentation, Command and API Reference and the llsummary.l manual page, the -s and -e options will state the accounting data report will contain information about every job that contains at least one step that falls within the specified range.
    • Resource Manager only:
      • Fixed the processor affinity environment to be set up correctly for jobs when the job prolog is configured in LoadLeveler.
    • Scheduler only:
      • Fixed the llclass command to show the correct value for the "Free Slots" field when LoadLeveler is configured to use the LL_DEFAULT scheduler.
      • Fixed the llchres command to check requested node additions to make sure that those nodes have no jobs running on them or already assigned to another reservation. If no idle nodes can be found, the llchres command will fail.
      • Fixed LoadLeveler to correctly reserve the reservation's resources after the central manager daemon restarts so that jobs with overlapping resources with the reservations will not be allowed to start.
      • Fixed the central manager to make sure pending status changes to the machines are properly locked so that jobs being scheduled to the down machines will no longer crash the central manager daemon.

    TWS LoadLeveler Corrective Fix listing

    Fix level   Platform  Component         APAR numbers
    LL 4.1.1.6  AIX       resource manager  IV06462 IV06510 IV06512
                Linux     resource manager
    LL 4.1.1.6  AIX       scheduler         IV06463 IV06511 IV06513
                Linux     scheduler
    LL 4.1.1.5  AIX       resource manager  IV00833 IV01116 IV01321 IV01332 IV02937 IV03232 IV03277 IV03299 IV03304
                Linux     resource manager
    LL 4.1.1.5  AIX       scheduler         IV00813 IV00834 IV01325 IV01333 IV01390 IV02945 IV03233 IV03278 IV03300 IV03303 IV03305 IV03309
                Linux     scheduler         IZ99118 IV01036
    LL 4.1.1.4  AIX       resource manager  IV00027 IV00031 IV00037 IV00277 IV00304 IV00462 IZ93228 IZ99666
                Linux     resource manager
    LL 4.1.1.4  AIX       scheduler         IV00028 IV00029 IV00032 IV00280 IV00299 IV00463
                Linux     scheduler
    LL 4.1.1.3  AIX       resource manager  IZ93225 IZ93259 IZ93267 IZ94345 IZ94801 IZ96421 IZ96428 IZ96430
                Linux     resource manager
    LL 4.1.1.3  AIX       scheduler         IZ89344 IZ93154 IZ93226 IZ94800 IZ96422 IZ96423 IZ96425 IZ96431 IZ96433
                Linux     scheduler
    LL 4.1.1.2  AIX       resource manager  IZ89829 IZ90705 IZ90707 IZ91597 IZ91599 IZ92052 IZ92374
                Linux     resource manager  IZ91715
    LL 4.1.1.2  AIX       scheduler         IZ90487 IZ90706 IZ90708 IZ90875 IZ91596 IZ91598 IZ91600 IZ92375
                Linux     scheduler
    LL 4.1.1.1  AIX       resource manager  IZ88502 IZ88504 IZ89257 IZ89260
                Linux     resource manager
    LL 4.1.1.1  AIX       scheduler         IZ88503 IZ88506 IZ88509 IZ88511 IZ88513 IZ88681 IZ89259
                Linux     scheduler


Document Information

UID

isg400000739