Grid Management section

The Grid Management section is in the Console tab.

Pollers page

Go to the Pollers page by clicking Pollers under the Grid Management section of the Console tab. Information about RTM pollers is shown. These pollers collect information from the LSF cluster; RTM uses this data to build various reports for RTM users and administrators.

  • Add. Add a poller. Click to open the RTM Poller Edit [new] page and specify the properties of the new poller.

  • Poller Name. The defined name for the poller. Click a name to open the RTM Poller Edit page and edit poller properties (for example, the poller name, LSF® version, bin directory location, poller location, and support information).

  • Poller ID. The ID assigned to the poller.

  • Status. The status of the poller. Statuses can be Up, Down, or N/A.

  • LSF Version. The LSF version that is running on the associated cluster.

  • Physical Location. The physical directory location of the local RTM poller (for example, /opt/rtm/lsfversion/bin). If the directory is found and verified, the message [OK: DIR FOUND] is displayed below this field.

  • Support Information. Enter a text string description for the location of the data poller (for example, “Data Center”).

  • Choose an action. Select one or more check boxes to choose the pollers to manage, then select an action (for example, Delete) and click go.

Clusters page

Go to the Clusters page by clicking Clusters under the Grid Management section of the Console tab. The page shows information about LSF clusters (including configured timeout thresholds and job efficiency information) and the pollers that collect data from them.

For more information about configuring a cluster with RTM, see Add or edit LSF clusters for RTM to monitor.

  • Add. Add a cluster for RTM to monitor. Click to open the Cluster Edit [new] page and specify the properties of the cluster.

  • Cluster Name. The defined name for the cluster. Click a name to open the Cluster Edit page and edit cluster properties, defaults, and various collection settings.

  • Cluster ID. The ID assigned to the cluster.

  • Poller Name. The name of the poller that is associated with this cluster.

  • Collect Status. The current data collection status for this cluster. Status can be Disabled, Up, Jobs Down, Down, Diminished, Admin Down, and Maintenance.

    • Maintenance. Indicates that RTM is conducting database maintenance.

    • Admin Down. This status is displayed if you cleared the Should Daemons be Enabled option in the General Settings tab for the Poller.

    • Up. Indicates that queue, host, and load information was collected within 3*non-job-interval and job information was collected within 3*job-interval. These options are set in the Max Allowed Runtime for Queue/Host/Load Collection Settings and Job Collection Settings for the Poller.

    • Down. Indicates that no queue, host, or load information was collected within 3*non-job-interval and no job information was collected within 3*job-interval.

    • Jobs Down. Indicates that only the job information was not collected within 3*job-interval.

    • Diminished. Indicates that some, but not all, of the queue, job, host, and load information was collected within the related intervals. Job collection is related to job-interval, while queue, host, and load collection is related to non-job-interval.

      This status can also indicate that the collection of jobs, queues, hosts, and loads information never started.

    If No data is shown above a status, no data is available for that status.

  • Efic Status. An indicator of job efficiency within this cluster, based on configured thresholds. Status can be OK, Recovering, Warn, Alarm, and N/A. Thresholds are set from Console > Configuration > Grid Settings > Status/Events.

  • Efic Percent. An indicator of the average efficiency of running jobs within the cluster, reported as a percentage. The minimum runtime setting can be set from Console > Configuration > Grid Settings > Status/Events.

  • Total Hosts. The total number of hosts in this cluster.

  • Total CPUs. The total number of processors in this cluster.

  • Total Clients. The total number of clients in this cluster.

  • Collect Freq. The configured data collection frequency.

  • Collect Timeout. The configured data collection timeout.

  • Job Minor Freq. The configured job minor frequency.

  • Job Major Freq. The configured job major frequency.

  • Job Timeout. The configured job timeout.

  • LIM Timeout. The configured LIM timeout.

  • Choose an action. Select one or more check boxes to choose the clusters to manage, then select an action (for example, Enable or Disable) and click go.
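
The Collect Status rules described above can be read as a small decision procedure. The following Python sketch is illustrative only and is not RTM code. The function name collect_status, its parameters, the precedence given to Maintenance and Admin Down, and the treatment of "within" as less than or equal to the limit are all assumptions made for this example. The Disabled status is left out because this page does not define it.

```python
from typing import Dict, Optional

# Data kinds governed by non-job-interval (per the Collect Status text).
NON_JOB_KINDS = ("queues", "hosts", "loads")


def collect_status(
    ages: Dict[str, Optional[float]],
    job_interval: float,
    non_job_interval: float,
    maintenance: bool = False,
    daemons_enabled: bool = True,
) -> str:
    """Return a Collect Status label for one cluster (hypothetical helper).

    ages maps "jobs", "queues", "hosts", and "loads" to the number of
    seconds since that data was last collected, or None if it was never
    collected.
    """
    # Assumption: these two states take precedence over freshness checks.
    if maintenance:
        return "Maintenance"
    if not daemons_enabled:
        return "Admin Down"

    kinds = ("jobs",) + NON_JOB_KINDS
    # Collection never started for any kind of data.
    if all(ages.get(kind) is None for kind in kinds):
        return "Diminished"

    def fresh(kind: str) -> bool:
        # Jobs use 3*job-interval; everything else uses 3*non-job-interval.
        age = ages.get(kind)
        interval = job_interval if kind == "jobs" else non_job_interval
        return age is not None and age <= 3 * interval

    jobs_ok = fresh("jobs")
    non_job_ok = [fresh(kind) for kind in NON_JOB_KINDS]

    if jobs_ok and all(non_job_ok):
        return "Up"
    if not jobs_ok and not any(non_job_ok):
        return "Down"
    if not jobs_ok and all(non_job_ok):
        return "Jobs Down"
    # Some, but not all, data was collected within its interval.
    return "Diminished"
```

For example, with a hypothetical job interval of 60 seconds and a non-job interval of 300 seconds, job data that is 1000 seconds old combined with queue, host, and load data that is 10 seconds old gives Jobs Down, because only the job information is older than 3*job-interval.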

Utilities page

Go to the Utilities page by clicking Grid Utilities under the Utilities section of the Console tab. The page shows RTM utilities for database administration (such as data backup, purging, and record removal), along with status information about cluster pollers.

  • View Grid Process Status. Click to open the Grid Process Status page and show status information that is associated with cluster polling processes (for example, statistics for the cluster poller, runtime, database maintenance, and license collection).

  • Force Cacti Backup. Click to back up key Cacti and RTM database tables. For more information about backup and restore, see Maintaining Database.

  • Manage Grid Hosts. Click to open the Manage Hosts page and selectively remove client records from the host database.

  • Backup Files. Click a file name to download the backup file. The section is not displayed until RTM creates at least one backup file.