What's new and changed in IBM Spectrum LSF RTM Version 10.2 Fix Pack 11
The following topics summarize the new and changed behavior in IBM Spectrum LSF RTM ("RTM") 10.2 Fix Pack 11.
Terminology
Old terminology | New terminology |
---|---|
master host | management host |
master batch daemon (mbatchd) | management batch daemon |
master LIM | management host LIM |
Some of these terminology changes are not reflected in the display fields.
Updates
- Cacti to 1.2.15, plus several patches. For details, refer to $RTM_TOP/cacti/CHANGELOG.
- Syslog plugin to 3.1
- Thold plugin to 1.5.2
- FusionCharts to 3.15.2
Updated Poller for LSF 10.1.0.7
RTM 10.2 Fix Pack 11 includes the following updates to the built-in Poller for LSF 10.1.0.7 to support the security enhancements in LSF 10.1.0 Fix Pack 11:
- Kerberos authentication in the LSF cluster
- New security communication mode in the LSF cluster (APAR#P103960)
Guarantee SLA and resource pool
RTM 10.2 Fix Pack 11 includes the following enhancements to support more guarantee SLA and resource pool information:
- Job SLA loaning information
Updated LSF performance metric data poller
RTM 10.2 Fix Pack 11 includes an updated LSF performance metric data poller to support the following:
- Non-root operation for LSF 10.1.0.10 and newer.
- New performance metric: Scheduler Efficiency
Enhanced Heuristics plugin
RTM 10.2 Fix Pack 11 includes enhancements to the Heuristics plugin to support the following features:
- Export view data as a .csv file.
- Minimize/maximize/refresh per view.
- Cache and aggregate more job data.
Process changes for new installations
- Download and untar the RTM 10.2.0.0 release from Passport Advantage.
- In the same directory, untar RTM 10.2.0.11, which replaces some key files.
- Install by running the following command:
./rtm_install.sh -f install.config
- Patch by running the following command:
./rtm_patch.sh
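The sequence above can be sketched as a small shell function. This is only an illustration under stated assumptions: the archive names are hypothetical placeholders (use the file names you actually downloaded), and the sketch assumes that both archives unpack into the current directory so that rtm_install.sh and rtm_patch.sh can be run from there. The `run` helper and the `DRY_RUN` variable are not part of RTM; they exist here only so the steps can be previewed before anything is executed.

```shell
#!/bin/sh
# Hypothetical archive names; replace with the files from Passport Advantage.
BASE_TARBALL="${BASE_TARBALL:-rtm-10.2.0.0.tar}"
FP_TARBALL="${FP_TARBALL:-rtm-10.2.0.11.tar}"

# run: print the command when DRY_RUN=1, otherwise execute it.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# rtm_fresh_install: the four steps for a new installation, in order.
# Stops at the first step that fails.
rtm_fresh_install() {
    run tar -xf "$BASE_TARBALL" || return 1            # 1. unpack the 10.2.0.0 release
    run tar -xf "$FP_TARBALL" || return 1              # 2. unpack 10.2.0.11 over it
    run ./rtm_install.sh -f install.config || return 1 # 3. install
    run ./rtm_patch.sh                                 # 4. patch
}
```

Calling `DRY_RUN=1 rtm_fresh_install` prints the four commands without running them, which is a cheap way to confirm the order and the archive names first.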
Fix Pack 11 deployment path from previous releases
- RHEL 7 x64 and ppc64le
- RHEL 8 x64 and ppc64le
- SLES 15/15SP1 for x64
- Ubuntu 18.04 for x64
Performance enhancements
RTM 10.2 Fix Pack 11 includes the following performance enhancements:
- New Max Insert SQL string length option for gridload/gridjobs/gridpend to improve WAN communication performance.
- Enhanced host group, queue, and guarantee SLA/resource pool graph data poller and web performance with ETL.
Additional enhancements
RTM 10.2 Fix Pack 11 includes the following additional enhancements:
- Change syslog initialization parameters to enable table partition by default.
- New License Service Dashboard.
- Host Dashboard supports the Extra Small icon.
- New IBM Spectrum LSF RTM icon.
- Two new hooks to support customization actions in addition to job actions.
- Use the mysql/mariadb option file for the database export action to avoid security issues.
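To illustrate why an option file helps with the database export, the sketch below writes client credentials to a file that only its owner can read, so the password does not have to appear on the command line where other users could see it in the process list. This shows the general MySQL/MariaDB mechanism, not the exact way RTM implements it. The function name, user, password, file path, and database name are all hypothetical.

```shell
# write_client_option_file PATH USER PASSWORD
# Writes a [client] option file readable only by its owner.
# (Hypothetical helper, not part of RTM.)
write_client_option_file() {
    (
        umask 077   # new files are created without group/other access
        printf '[client]\nuser=%s\npassword=%s\n' "$2" "$3" > "$1"
    )
    chmod 600 "$1"  # also tighten permissions if the file already existed
}

# Example usage (not executed here; names are placeholders):
#   write_client_option_file "$HOME/.rtm_export.cnf" rtmuser 'secret'
#   mysqldump --defaults-extra-file="$HOME/.rtm_export.cnf" cacti > cacti.sql
```

Note that `--defaults-extra-file` must be the first option on the mysqldump command line for it to take effect.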
Fixed issues
The following RFEs have been fixed in RTM 10.2 Fix Pack 11:
APAR / ID | Description |
---|---|
P103989 | Assigning a value to the IP column in the grid_clusters table causes the cluster configuration save to fail. |
P103713 | This fix modifies how the ss_grid_hgroup_stats.php script calculates the number of slots, and retains host information when the host status is abnormal. |
P103699 | The "Submit Command" field shows the LSF job command in a single line. |
P103716 | There are insufficient time window options for idle jobs. |
P103718 | The "exit code" column aligns left, which brings it too close to columns on the left. |
P103741 | The maximum GPU memory and GPU memory usage values are calculated incorrectly. |
P103763 | Recognize exclusive resources. |
P103765 | Provide a utility script to remove duplicate job records across partition tables. |
P103775 | Users do not know what the "Default" number of rows is when selecting it from drop-down lists. |
P103777 | Filters are triggered immediately when 'Dynamic' search is disabled in the job details page. |
P103778 / RFE#146045 | A job name that is too long is truncated and cannot be seen in its entirety in the job details page. This fix introduces a tooltip that displays the longer job name (up to 119 characters) when hovering over the job name. |
P103791 | The RTM patch installer uses the user-installed version of php instead of /usr/bin/php. |
P103817 | The total number of GPUs reported is incorrect. |
P103818 | The GPU wall time in the RTM daily statistics report is not calculated properly. |
P103829 | The Job Info -> By Host page shows no graphs for hosts. |
P103843 | Add a new parameter 'max-query-time' to the database_kill.php script. |
P103851 | The database_replay_daily_stats.php script might create duplicate records. |
P103855 | The lmstat processes need to be properly cleaned up. |
P103858 | The 'CPU time variation for job' graphs are incomplete. |
P103863 | The gridpend binary core dumps if any pending job includes a suspend reason. |
P103884 | Trying to access the Host Graphs page fails with the error "YOU DO NOT HAVE RIGHTS FOR PREVIEW VIEW". |
P103885 | The grid_add_license_graphhs.php page uses the incorrect graph template ID and data query ID. |
P103888 | The Daily Statistics and queue distrib page loads very slowly. |
P103907 | There are sorting issues for the following columns in the Host Info - Servers page: Max Mem, Max Swap, Max Temp. |
P103908 | The queue slot numbers are automatically converted to "K" units if there are more than 1000 slots. |
P103909 | The grid_job_daily_stats partition table does not require optimization. |
P103911 | Cannot open the second page of the "Current license usage" report. |
P103923 | Database backup utility grid_backup_restore_rtm.php uses the local /tmp directory, which can cause the host to run out of disk space. |
P103925 | The host's string resource value is always displayed as "-" in the Host page dashboard. |
P103926 | Enlarge some string fields for the gridpend and gridjobs binaries. |
P103928 | RTM only removes one backup copy even when there are multiple backups that are out of date. |
P103943 | When you navigate to a job ID, then go back to the Job Details page, the "JobID" appears in the filter. |
P103945 / RFE#145659 | The gridacct command does not load the mem_reserved or mem_requested columns. |
P103953 | In the alert email, the attached URL is missing the server name. |
P103956 | SQL error 1064 occurs when inserting into grid_jobs_rusage. |
P103976 | The Cacti Host Mapping page is unusable when there are a large number of hosts. |
P103977 | The grid heuristics tables must be excluded from the main table backup. |
P104002 | Improve the performance of the grid_purge_device.php script. |