Known issues and limitations
Details of the RTM known issues and limitations.
Issue | Description |
---|---|
New for 10.2.0 Fix Pack 1 | |
Duplicate pending job records may appear on the Job Detail page when filtered by Pend Level. | This occurs because DISTINCT (used when selecting records across tables) was removed from RTM to improve performance. |
The RTM Benchmark plugin does not work for LSF versions 10.1.0.10 or higher if the RTM server host is not included in LSF_ADDON_HOSTS on the LSF management host. | Since LSF 10.1.0 Fix Pack 10, operations as 'root' are rejected if LSF_ROOT_USER is 'N' or is not configured, unless the RTM server host is included in LSF_ADDON_HOSTS on the LSF management host. However, the RTM Benchmark plugin works only if it is run as 'root'. |
New for 10.2.0 | |
The License daily status result is empty when changing the User/Host/Cluster filter combination if “Disable license feature peak utilization calculation” is enabled. | If the “Disable license feature peak utilization calculation” option is selected, the page will not be accurate. When you change filter combinations, some combinations will not have data. |
The Cacti Aggregate graph function does not work. | See Cacti issue https://github.com/Cacti/cacti/issues/2869 on GitHub. |
Cannot create a threshold with a data query item required. | See plugin_thold issue https://github.com/Cacti/plugin_thold/issues/317 on GitHub. |
Cannot replace the thold name with the correct one, and the Warning/Alert HRULE description is displayed incorrectly. | See plugin_thold issue https://github.com/Cacti/plugin_thold/issues/357 on GitHub. |
Issues from 10.1.0 and older | |
The Rtmssh plugin (which controls the ssh action icon on some RTM pages) does not work on some Firefox (64-bit) browsers. | For example, this feature does not work correctly on Firefox (64-bit) version 52.6.0 because Firefox changed its support. See https://support.mozilla.org/en-US/kb/use-java-plugin-to-view-interactive-content for more information. |
The new Map feature can map a license feature name to a new name. However, the new name does not take effect on one page, which continues to show the old feature name. | For example, if the Map feature has changed myfeature1 to my1, the affected page continues to show myfeature1, while other RTM pages show the new name my1. |
If the host name contains a period, RTM cannot identify the domain name and produces an incorrectly shortened name when you run database_shorten_hostname.sh | For example, if the host name is abc.dd (the FQDN is abc.dd.domain.com), the shorten hostname script takes the domain name as dd.domain.com and the host name as abc, not abc.dd. |
For two LSF single pending reasons, RTM is not able to fetch their customized descriptions | The affected reasons, shown as reason ID and default description, are: PEND_NO_CANDIDATE_HOST (62): There are no suitable hosts for the job. PEND_JOB_SPREAD_TASK (312): Not enough hosts to meet the job's spanning requirement. |
ALL cluster option in metadata settings does not work | |
Incorrect examples given for DiskU client poller installation | In the section titled "Setting up remote DiskU client pollers", incorrect examples are given in step 3 of the procedure. It should appear as follows: Install the rtm-client-10.2.0.<ARCH>.rpm.
Note: If you are installing the RTM client on RHEL or SLES, use the following to install:
If you are installing the RTM client on Ubuntu, use the following to install:
|
DiskU Plugin Limitation. An incorrect value for total users is shown in the By TagName page. | Because there is no tagname-level aggregation (the value is a SUM based on the user level, or disku_users_totals), the value may be larger than the actual value when the same users appear in different scan paths with the same tagname. |
Errors occur when upgrading RTM from a previous version to 10.1 if an LSF 7.x cluster has been used. | Scenarios:
|
LS Plugin Limitation - The RTM License Scheduler plugin does not support multiple licenses in one token | If License Scheduler is configured to support multiple licenses in one token, the RTM License Scheduler plugin does not support this configuration. |
Benchmark job Limitation - No records found if the number of days selected is greater than the retention period. | If the configured number of benchmark job history days is greater than the Job Data Retention Period, then after the Job Data Retention Period has passed, the job information is cleaned from the grid_jobs and grid_jobs_finished tables. Clicking the job ID on the Benchmark job result page then shows that no record was found. |
The daily_replay and other exit jobs aggregation reports will be broken as a result of changes to the lsb.acct:JOB_FINISH line format in LSF 10.1 | LSF 10.1 changed the lsb.acct:JOB_FINISH line format, merging "killed pending array job" entries into one line per kill action. It also changed the original jobFinishLog->idx to '0/-1'. Therefore, "gridacct" inserts only one job record, with an incorrect 'indexid', for a series of "killed pending array job" entries, and skips all other "killed pending array job" entries in that line. If the LSF administrator enables the new JOB_FINISH format (lsb.params:JOB_ARRAY_EVENTS_COMBINE=Y), the daily_replay and other exit jobs aggregation reports will be broken. Note: The JOB_ARRAY_EVENTS_COMBINE parameter is set to Y by default for fresh installations of LSF. |
When hosts are filtered by Resreq in the dashboard, an error "Hosts not found" is displayed. | As a workaround, after selecting a cluster and filtering hosts by Resreq, add "Type=any" along with your search keyword in the Resreq field. For example, if you want to filter "1s=1", then enter "Type=any && 1s=1". |
When queues are filtered by User, the Active Slots number is not equal to the sum of Run Slots, Pend Slots, and Suspend Slots. | The slot numbers do not match when All is selected in the User field. As a workaround, filter by any user name instead of All to match the slot numbers. |
If the HPC Allocation feature is enabled, the starting and running tasks values are not consistent. | This is a limitation of LSF 9.1.3, because fairshare displays the number of slots instead of the number of tasks. |
The State Changes column shows incorrect values. | This is a limitation of RTM. The State Changes column may show an incorrect value for jobs that have been requeued. |
LS Plugin Limitation - Inaccurate reserve token shown to users. | If a user over-reserves a token for a job and is consuming fewer tokens, RTM accurately shows the reserved amount, but the user may not be able to search for the reserving jobs. This may cause several problems. For example, the Demand column shows an incorrect number. |
The job count displayed on the JobIQ details page does not match the counts on the RTM Summary Job History page. | This limitation is due to the difference in polling times. The heuristics poller runs every 5 minutes, but the job poller runs more frequently, so the counts may be out of sync until the heuristics poller runs again. |
Job efficiency is over 100% in the Job Details tab. | This is a limitation of RTM. When a job is requeued and submitted to one of the specified queues, job efficiency shows over 100% in the Job Details tab. |
Issues with MySQL performance. | Change the innodb_flush_log_at_trx_commit setting in my.cnf to 2. This flushes the MySQL log to disk once per second instead of at every transaction commit. This change reduces the amount of random disk I/O and thus increases RTM's scalability. |
An error ' |
This is a limitation of LSF on Power Systems because LSF does not collect MBD file descriptor usage metrics. |
The machine runs out of memory when running the Cluster Dashboard with auto refresh. | The machine runs out of memory because the session files overload the memory of the worker threads.
As a workaround, change the settings in /etc/http/conf/httpd.conf as:
If you want to tune the web server or recycle Apache processes, refer to http://www.hostinginside.com/billing/knowledgebase.php?action=displayarticle&id=4 |
All license-related alerts are no longer valid after upgrading RTM to 9.1.3. | If you are upgrading to 9.1.3, all license-related alerts must be re-created. |
SQL syntax error occurs when running RTM on RHEL 6.4 and 6.5 on IBM Power Systems. | The SQL syntax error occurs when the fix for Red Hat Bugzilla number 1054953 is not applied. Contact Red Hat OS Support to get the fix. |
No records are displayed under Syslog when a remote database is configured for RTM installed on SLES 11. | This issue is due to an incorrect syslog path. As a workaround, follow these steps:
|
If you have removed an ELIM value from LSF, it continues to be displayed in the various host tables in RTM. | When an ELIM is removed from LSF, it continues to show up on the corresponding page. Go to that page and click Save to refresh and remove the ELIM. |
No data is displayed when Previous Day is selected in License Daily Statistics Report. | Data is not displayed if you filter as Previous Day. As a workaround, select Yesterday instead of Previous Day to view data. |
Sometimes the audible alert is not activated for unavailable hosts. | When you mute a triggered Unavailable Hosts alert, the mute is sustained for all other triggered alerts. Resume the dashboard alerts to trigger the audible alerts again. |
When a SQL query is modified, the layout items occasionally contain the old SQL query. | When a SQL query in the data source is modified, the existing layout items change according to the new query. However, sometimes the existing layout items do not change even after the SQL query is modified. As a workaround, delete all existing layout items. The new layout items are then generated again according to the modified SQL query. |
If both Xz_string and xz_string are defined as shared resources, only the first one is taken into account. | RTM is not case-sensitive for non-binary string searches. If you search with Xz_string, you get resources that start with either "X" or "x". |
Browser hangs on the host dashboard after the resource requirement value is entered. | When LSF® LIM is down for a specific cluster, the Resource Requirement string filter does not work, and the page locks up until the LSF API times out. Restart your browser to correct this behavior, and avoid using the resource requirements filter when the cluster is offline. |
The error "Error:'1060', Message:'Duplicate column name 'jobid'" is displayed after using a cross join SQL query in a grid alert. | When you define a grid alert by using a cross join SQL query, do not use "select *" for the list of qualified column names. As a workaround, list the fields that you want to query in the SQL statement after the word select. |
When a job triggers an alert, a notification is sent to only one of the defined email addresses. | If you submit jobs and assign multiple email addresses for alert notification, then the alert notification is sent only to the last email address. |
ssusp time differs between LSF accounting and RTM accounting | It is difficult to get ssusp time in IBM Spectrum LSF RTM if the total ususp time is less than the poller interval. A workaround is to decrease the poller interval, but this may not be practical for all installations because of system size. |
stime, utime, and mem rusage reports for finished jobs are not the same in RTM and LSF | For IBM® Platform LSF 7.0.2 and earlier versions, the stime, utime, and mem rusage reports for finished jobs in RTM and LSF do not match. |
License data filtering does not work when fields contain commas or quotation marks | Filtering does not work if the license server has commas in any of the filter fields. If the vendor name has a comma, it displays correctly on the detail page. However, if you try to filter by the vendor name, it removes all data and sets the vendor filter name back to "All". |
Rsyslog cannot start due to a missing module | The following error is seen in /var/log/messages when starting rsyslogd:
Workaround:
|
Running job records are shown with Exited status, and the job record is not found on the Job Graph/Job Detail page, because the time zone of the lsfpoller on the RTM server is not adjusted to the remote lsfpoller. | All remote pollers must be in the time zone of the cluster. The time zone of the cluster is set in | .
Job graph is not drawn if the RRD file’s last update time is later than the rusage update time | The RRD file update times are based on the RTM host, whereas the rusage update times are based on the cluster. This inconsistency happens when the actual time is out of sync with the LSF cluster. Follow these workaround steps:
|
Cannot forward syslog messages to the RTM host. | This issue occurs when the RTM host is using rsyslog and the other host, which is sending the messages, is using syslog. To resolve this issue, edit /etc/rsyslog.conf by adding this line: :hostname, contains, "syslogd" |
When embedding graphs in Lotus Notes® email, an icon shows as a red X. | When a graph is embedded in an email, the icon shows as a red X. The graph is attached to the email so you can view it, but it is not inline as expected. As a workaround:
|
Job time is not reported correctly for jobs with pre-execution scripts | If a job has a pre-execution script, LSF includes the time in the running value and RTM also includes this time in the pending value. |
Issues with requeued jobs |
|
Internet Explorer cannot handle a URL with an underscore ("_") | If you use the Internet Explorer (IE) browser to log in to an IBM Spectrum LSF RTM system that has an underscore in the host name, you can enter the login and password, but it does not proceed past the Login page. This problem applies to both IE7 and IE8. As a workaround, use a different alias for the host or its IP address in the URL. |
An existing host's graphs are not updated after the host template's Associated Graph Templates or Associated Data Queries are changed | After the host template’s Associated Graph Templates or Associated Data Queries are changed, the Data Queries are not added automatically; they are only reindexed. Currently, Graph Templates are updated only after more than 10 minutes. |
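
The host name shortening issue listed in the table can be illustrated with a minimal Python sketch. It is not the logic of database_shorten_hostname.sh, which is not shown here. The function names and the idea of supplying a known domain are assumptions made for illustration. The sketch shows why splitting an FQDN at the first period cannot distinguish a host name that contains a period from the start of the domain.

```python
def naive_split(fqdn):
    """Split at the first period: the first label is the host, the rest is the domain."""
    host, _, domain = fqdn.partition(".")
    return host, domain


def split_with_known_domain(fqdn, domain):
    """Strip a known domain suffix so that periods inside the host name survive.

    Hypothetical helper: assumes the caller already knows the real domain.
    """
    suffix = "." + domain
    if fqdn.endswith(suffix):
        return fqdn[: -len(suffix)], domain
    # The domain does not match: leave the name untouched rather than guess.
    return fqdn, ""


# Host abc.dd in domain domain.com, as in the example in the table.
print(naive_split("abc.dd.domain.com"))
print(split_with_known_domain("abc.dd.domain.com", "domain.com"))
```

The first call returns abc as the host and dd.domain.com as the domain, which matches the incorrect result described in the table. The second call keeps abc.dd intact, but only because the domain was supplied explicitly.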
Fixed bugs
Bugs fixed in each release of IBM Spectrum LSF RTM are listed in the Readme document available with the product download on IBM Fix Central (www.ibm.com/support/fixcentral).