What's new and changed in IBM Spectrum LSF RTM Version 10.2 Fix Pack 11

The following topics summarize the new and changed behavior in IBM Spectrum LSF RTM ("RTM") 10.2 Fix Pack 11.

Terminology

RTM now uses the following updated terminology for LSF components:
Table 1. Terminology changes
Old terminology                  New terminology
master host                      management host
master batch daemon (mbatchd)    management batch daemon
master LIM                       management host LIM

Some of these terminology changes are not reflected in the display fields.

Updates

  • Cacti to 1.2.15, plus several patches (for details, refer to $RTM_TOP/cacti/CHANGELOG)
  • Syslog plug-in to 3.1
  • Thold plug-in to 1.5.2
  • FusionCharts to 3.15.2

Updated Poller for LSF 10.1.0.7

RTM 10.2 Fix Pack 11 includes the following updates to the built-in Poller for LSF 10.1.0.7 to support the security enhancements in LSF 10.1.0 Fix Pack 11:

  • Kerberos authentication in the LSF cluster
  • New security communication mode in the LSF cluster (APAR#P103960)

Guarantee SLA and resource pool

RTM 10.2 Fix Pack 11 includes the following enhancements to support more guarantee SLA and resource pool information:

  • Job SLA loaning information

Updated LSF performance metric data poller

RTM 10.2 Fix Pack 11 includes an updated LSF performance metric data poller to support the following:

  • Non-root operation for LSF 10.1.0.10 and later
  • New performance metric: Scheduler Efficiency

Enhanced Heuristics plug-in

RTM 10.2 Fix Pack 11 includes enhancements to the Heuristics plug-in to support the following features:

  • Export view data as a .csv file.
  • Minimize/maximize/refresh per view.
  • Cache and aggregate more job data.

Process changes for new installations

For new installations of RTM 10.2 and Fix Pack 10.2.0.11, complete the following steps:
  1. Download and decompress the RTM 10.2.0.0 release from IBM Passport Advantage.
  2. In the same directory, decompress RTM 10.2.0.11, which replaces some key files.
  3. Install by running the following command:
    ./rtm_install.sh -f install.config
  4. Patch by running the following command:
    ./rtm_patch.sh
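
The four steps above can be sketched as a small wrapper. This is an illustration only: the archive names, the tar.gz format, and the assumption that the installer scripts sit directly in the extraction directory are placeholders, not details taken from the RTM packages. By default the function only prints the commands; set DRY_RUN=0 to execute them.

```shell
#!/usr/bin/env bash
# Sketch of the fresh-install sequence. Archive names and the tar.gz
# format are assumptions; substitute the files you actually downloaded.

# Print each command; execute it only when DRY_RUN=0.
run() {
  echo "+ $*"
  if [ "${DRY_RUN:-1}" = "0" ]; then
    "$@"
  fi
}

# rtm_fresh_install BASE_ARCHIVE FIXPACK_ARCHIVE WORK_DIR
# The body runs in a subshell so that cd does not affect the caller.
rtm_fresh_install() (
  base=$1 fixpack=$2 dir=$3
  run mkdir -p "$dir" || exit 1
  run tar -xzf "$base" -C "$dir" || exit 1        # step 1: base release
  run tar -xzf "$fixpack" -C "$dir" || exit 1     # step 2: fix pack on top
  if [ "${DRY_RUN:-1}" = "0" ]; then
    cd "$dir" || exit 1
  fi
  run ./rtm_install.sh -f install.config || exit 1  # step 3: install
  run ./rtm_patch.sh                                 # step 4: patch
)
```

For example, `rtm_fresh_install rtm-base.tar.gz rtm-fixpack.tar.gz /opt/rtm_stage` (all three arguments are hypothetical) prints the five commands in order, so the sequence can be reviewed before it is run with DRY_RUN=0.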

Fix Pack 11 deployment path from previous releases

RTM 10.2 Fix Pack 11 deployment on top of 10.2 and 10.2.0.1 is supported for the following operating systems:
  • RHEL 7 x64 and ppc64le
  • RHEL 8 x64 and ppc64le
  • SLES 15/15SP1 for x64
  • Ubuntu 18.04 for x64

Performance enhancements

RTM 10.2 Fix Pack 11 includes the following performance enhancements:

  • New Max Insert SQL string length option for gridload, gridjobs, or gridpend to improve WAN communication performance.
  • Enhanced host group, queue, guarantee SLA, and resource pool graph data poller and web performance with ETL.
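
To illustrate why a maximum insert string length can help over a WAN, the sketch below packs rows into multi-row INSERT statements that stay under a length cap, so fewer statements cross the network. It is a generic illustration of the idea, not the gridload, gridjobs, or gridpend implementation; the function name, table name, and input format are hypothetical.

```shell
#!/usr/bin/env bash
# batch_inserts MAX_LEN TABLE
# Reads one "(v1, v2, ...)" tuple per line from stdin and prints multi-row
# INSERT statements, each at most MAX_LEN characters including the ";".
# A single tuple that exceeds the cap is still emitted in its own statement.
batch_inserts() {
  local max_len=$1 table=$2
  local prefix="INSERT INTO ${table} VALUES "
  local stmt="" row
  while IFS= read -r row; do
    [ -n "$row" ] || continue
    if [ -z "$stmt" ]; then
      stmt="${prefix}${row}"
    # +2 accounts for the "," separator and the trailing ";"
    elif [ $(( ${#stmt} + ${#row} + 2 )) -le "$max_len" ]; then
      stmt="${stmt},${row}"
    else
      printf '%s;\n' "$stmt"
      stmt="${prefix}${row}"
    fi
  done
  if [ -n "$stmt" ]; then
    printf '%s;\n' "$stmt"
  fi
}
```

A larger cap means fewer, longer statements and fewer round trips; a smaller cap keeps each statement short at the cost of more round trips.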

Additional enhancements

RTM 10.2 Fix Pack 11 includes the following additional enhancements:

  • Changed the syslog initialization parameters to enable table partitioning by default.
  • New License Service Dashboard.
  • The Host Dashboard supports the Extra Small icon.
  • New IBM Spectrum LSF RTM icon.
  • Two new hooks to support custom actions in addition to job actions.
  • The database export action uses the mysql/mariadb option file to avoid security issues.
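
As background on the option file approach: passing credentials through a file keeps the password off the command line, where it would otherwise be visible in process listings. The sketch below is not RTM's export code; the function name, file path, user name, and database name are hypothetical. It writes a `[client]` option file readable only by its owner, which the standard mysql and mysqldump clients can read through their `--defaults-extra-file` option.

```shell
#!/usr/bin/env bash
# write_mysql_option_file FILE USER PASSWORD
# Creates a client option file that only the owner can read or write.
write_mysql_option_file() {
  local file=$1 user=$2 password=$3
  (
    umask 077   # new files are created without group/other permissions
    printf '[client]\nuser=%s\npassword="%s"\n' "$user" "$password" > "$file"
  ) || return 1
  chmod 600 "$file"   # also tighten a file that already existed
}

# Hypothetical usage (database name and paths are assumptions):
#   write_mysql_option_file "$HOME/.rtm_export.cnf" rtm_user 'secret'
#   mysqldump --defaults-extra-file="$HOME/.rtm_export.cnf" cacti > export.sql
```

Note that `--defaults-extra-file` must be the first option on the client command line.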

Fixed issues

The following issues have been fixed in RTM 10.2 Fix Pack 11:

Table 2. Issues fixed in RTM 10.2 Fix Pack 11
APAR or ID Description
P103989 Assigning a value to the IP column in the grid_clusters table causes the cluster configuration save to fail.
P103713 This fix modifies how the ss_grid_hgroup_stats.php script calculates the number of slots, and retains host information when the host status is abnormal.
P103699 The Submit Command field shows the LSF job command in a single line.
P103716 There are insufficient time window options for idle jobs.
P103718 The exit code column aligns left, which brings it too close to columns on the left.
P103741 The maximum GPU memory and GPU memory usage values are calculated incorrectly.
P103763 Recognize exclusive resources, and separate exclusive resources from the resources column to the excl_resources column in the grid_hostinfo table.
P103765 Provide a utility script to remove duplicate job records across partition tables.
P103775 Users do not know what the Default number of rows is when selecting it from drop-down lists.
P103777 Filters are triggered immediately when Dynamic search is disabled in the job details page.
P103778 / RFE#146045 A job name that is too long is truncated and cannot be seen in its entirety in the job details page. This fix introduces a tooltip that displays the longer job name (up to 119 characters) when hovering over the job name.
P103791 The RTM fix installer uses the user-installed version of php instead of /usr/bin/php.
P103817 The total number of GPUs reported is incorrect.
P103818 The GPU wall time in the RTM daily statistics report is not calculated properly.
P103829 The Job Info > By Host page shows no graphs for hosts.
P103843 Add a new max-query-time parameter to the database_kill.php script.
P103851 The database_replay_daily_stats.php script might create duplicate records.
P103855 The lmstat processes need to be properly cleaned up.
P103858 The CPU time variation for job graphs is incomplete.
P103863 The gridpend binary core dumps if there are some pending jobs that include a suspension reason.
P103884 Trying to access the Host Graphs page fails with the error "YOU DO NOT HAVE RIGHTS FOR PREVIEW VIEW".
P103885 The grid_add_license_graphhs.php page uses the incorrect graph template ID and data query ID.
P103888 The Daily Statistics and queue distrib page loads very slowly.
P103907 There are sorting issues for the following columns in the Host Info - Servers page: Max Mem, Max Swap, Max Temp.
P103908 The queue slot numbers are automatically converted to K units if there are more than 1000 slots.
P103909 The grid_job_daily_stats partition table does not require optimization.
P103911 Cannot open the second page of the Current license usage report.
P103923 Database backup utility grid_backup_restore_rtm.php uses the local /tmp directory, which can cause the host to run out of disk space.
P103925 The host's string resource value is always displayed as a dash (-) in the Host page dashboard.
P103926 Enlarge some string fields for gridpend and gridjobs binaries.
P103928 RTM removes only one backup copy even if there are multiple backups that are out of date.
P103943 When you navigate to a job ID, then go back to the Job Details page, the JobID appears in the filter.
P103945 / RFE#145659 The gridacct command does not load the mem_reserved or mem_requested columns.
P103953 In the alert email, the attached URL is missing the server name.
P103956 SQL error 1064 occurs when inserting into the grid_jobs_rusage table.
P103976 The Cacti Host Mapping page is unusable when there are a large number of hosts.
P103977 The grid heuristics tables must be excluded from the main table backup.
P104002 Improve the performance of the grid_purge_device.php script.