Configuring cloud host monitoring for hours used

Configure cloud hosts that join your cluster to track the core-hours that are used by each host's cores. This is known as core-hour usage (also known as variable use). You can then query the total core-hours that are used in your clusters between two dates.

About this task

How core-hours are charged:
  1. Core-hours are charged for a host as long as this host is joined to the cluster, even if the host is not active. For example, for a host that is shut down and unavailable, core-hours will still be charged. There is one exception: the time that the host is unavailable, just prior to being removed from the cluster, is not charged. If a host is in OK state, then unavailable, then OK again, the unavailable time will be charged.

    If a host never joins a cluster (that is, it is always in an unavailable state), its usage will not be charged.

  2. Core-hours are rounded up to the next hour. If the host is joined to the cluster for 65 minutes, its cores will be charged for two hours.
Here are some examples of core-hour charging scenarios:
  • A one-core virtual server instance is in OK state for 60 minutes, and then in an unavailable state for five minutes before removal (that is, 65 minutes total). The calculation will correct for the unavailable time, so the usage will be 60 minutes. In this case, there will be no round-up, since the instance was up for exactly one hour, so only one core-hour will be charged.
  • A 32-core host is in OK state for five minutes and in an unavailable state for two minutes before removal. The charge will be for five minutes, and this will be rounded up to one hour. The total will be 32 core-hours (32 cores * 1 hour).
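
The two charging rules and the examples above can be sketched as a small calculation. This is only an illustration of the arithmetic described in this topic, not the product's accounting code; the function name and its parameters are hypothetical, and treating a host with no billable time as zero core-hours is an assumption based on the "never joins a cluster" case.

```python
import math

def charged_core_hours(cores, joined_minutes, trailing_unavailable_minutes=0):
    """Estimate core-hours charged for one host (hypothetical helper).

    joined_minutes: total minutes between joining and removal.
    trailing_unavailable_minutes: unavailable time just before removal,
    which is the only unavailable time that is not charged.
    """
    billable_minutes = joined_minutes - trailing_unavailable_minutes
    if billable_minutes <= 0:
        # Assumption: no billable time means nothing is charged.
        return 0
    # Rule 2: round up to the next whole hour, then multiply by cores.
    return cores * math.ceil(billable_minutes / 60)

print(charged_core_hours(1, 65, 5))    # 1  (60 billable minutes, no round-up)
print(charged_core_hours(32, 7, 2))    # 32 (5 billable minutes round up to 1 hour)
print(charged_core_hours(1, 65))       # 2  (65 billable minutes round up to 2 hours)
```

Unavailable time in the middle of a host's lifetime (OK, then unavailable, then OK again) stays inside joined_minutes in this sketch, which matches the rule that such time is charged.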

Configuring your cloud hosts for core-hour usage involves tagging your cloud hosts with the corehoursaudit resource attribute. The corehoursaudit attribute marks cloud hosts as temporary and tracks the time that each host's cores are used. The corresponding metrics are then logged to the ego.cluster_name.entitlement.acct file.

If a cluster has spare perpetual license capacity, you can allow hosts to use a mix of perpetual and variable use licenses. The EGO_LICENSE_CORE_ENTITLEMENT parameter causes hosts that are tagged with the corehoursaudit resource attribute to count against the perpetual core entitlement until the entitlement limit defined by EGO_LICENSE_CORE_ENTITLEMENT is reached. If the EGO_LICENSE_CORE_ENTITLEMENT value is not configured, hosts configured with the corehoursaudit attribute do not add cores to the perpetual core entitlement; these hosts log core-hours only. For more information about the EGO_LICENSE_CORE_ENTITLEMENT parameter, see the ego.cluster_name.entitlement.acct file topic.
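
To make the mixed-licensing behavior concrete, the following sketch models how tagged hosts might be divided between the perpetual entitlement and core-hour licensing. It is a simplified model under stated assumptions, not the product's algorithm: the function name is hypothetical, hosts are assumed to be evaluated in join order, and a host is assumed to count against the perpetual entitlement only if all of its cores fit (this topic does not say how a partial fit is handled).

```python
def split_hosts_by_license(hosts, core_entitlement=0):
    """Split tagged hosts into (perpetual, variable_use) name lists.

    hosts: list of (host_name, cores) tuples in join order.
    core_entitlement: models EGO_LICENSE_CORE_ENTITLEMENT; 0 means every
    tagged host is counted for core-hour usage only.
    """
    perpetual, variable_use = [], []
    used_cores = 0
    for name, cores in hosts:
        # Assumption: a host uses perpetual licenses only if it fits entirely.
        if used_cores + cores <= core_entitlement:
            used_cores += cores
            perpetual.append(name)
        else:
            variable_use.append(name)
    return perpetual, variable_use

hosts = [("host1", 16), ("host2", 16), ("host3", 8)]
print(split_hosts_by_license(hosts, core_entitlement=32))
# (['host1', 'host2'], ['host3'])
print(split_hosts_by_license(hosts))
# ([], ['host1', 'host2', 'host3'])
```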

When you set the GPU entitlement file, the GPU socket hours information is logged. For more information about setting the GPU entitlement file, see Entitling IBM Spectrum Symphony GPU Harvesting.

Procedure

  1. Add the corehoursaudit resource attribute with the egoconfig addresourceattr command (see egoconfig). Optionally, add the cloudprovider resource attribute.
    egoconfig addresourceattr "[resource corehoursaudit]"

    Ensure that you use the egoconfig addresourceattr command to tag cloud hosts before the cloud host joins a cluster for the first time, either during image creation or during post-provisioning (recommended). If you are running the command from a post-provisioning script on Windows, include the -f option to suppress confirmation.

    Fast path: The egoconfig addresourceattr configuration is included in the sample host factory post-provisioning scripts that are bundled with IBM Spectrum Symphony. If you use the sample scripts, you don't need to use this command separately.
    Note: If a cloud host previously joined the cluster without this attribute assigned, or if you want to update this attribute on a host, you must first remove the host from the cluster:
    egosh resource close -reclaim host_name
    egosh resource remove host_name
    Tip: After you enable the corehoursaudit resource attribute, you can run the following commands:
    • To find all hosts that are currently enabled to log core-hours, run the following example command:
      egosh resource list -l -R corehoursaudit
      Sample output:
      NAME                   status        mem    swp    tmp   ut    it    pg   r1m  r15s  r15m  ls
      host1                OK            45G     0M   889G   4%   258   0.0   0.9   1.2   0.9   2
      For more information, see egosh resource.
    • To query the core-hour usage between two dates, run the following example command:
      egosh entitlement corehours -s "1/1/20" -e "1/5/20"
      Sample output:
      Core hours: 20
      GPU hours:   1
      For more information, see egosh entitlement.

      For more usage data, view the ego.cluster_name.entitlement.acct file for a summary of core-hour usage, and the cluster.hostusage file for detailed core-hour usage.

    In addition, if you removed the tagged hosts from the cluster, you can find the total active time and core-hour usage for each removed host in the lim.log.hostname file. By default, this lim log file is available in the $EGO_LOCAL_CONFDIR/../log directory.
    Sample output:
    removeVULHost: Removed the variable use licensing host: host1. This host was active for 3.36 hours and logged 64 core hours.
  2. Optional: Change the defaults for core-hour usage parameters.
    1. Log in to the primary host.
    2. Edit the ego.conf file at $EGO_CONFDIR on Linux® or %EGO_CONFDIR% on Windows, and update the following parameters:
      EGO_LICENSE_COREHOURS_MONITOR_INTERVAL_MINUTES
      Interval at which core-hour usage is monitored and logged to the cluster.hostusage file at %EGO_CONFDIR%\..\work\data\ on Windows and $EGO_CONFDIR/../work/data/ on Linux; default is 5 minutes. The cluster.hostusage file logs detailed core-hour metrics at a more frequent interval than that logged to the ego.cluster_name.entitlement.acct file and is also used for recovery purposes.
      EGO_LICENSE_CORE_ENTITLEMENT
      Maximum number of cores that can be entitled in a cluster. Default is 0, which indicates that all cloud hosts with the corehoursaudit resource attribute are counted only for core-hour usage. Define this parameter if you want to use a combination of perpetual licenses and variable use licenses for your cloud hosts. When defined, cloud hosts that join the cluster are licensed by cores, up to the specified entitlement. After all perpetual licenses are used, cloud hosts that join the cluster are licensed by core hours.
      [7.3.2 Fix] EGO_LICENSE_COREHOURS_CONSOLIDATION
      Enables the collection of core-hour license usage information that can later be consolidated with information from other clusters to show the usage of multiple clusters in one view. An administrator can then use this auditing information to calculate usage and determine whether more licenses are required. Default is N, which disables data collection and consolidation with other clusters.
      To collect and consolidate usage information from all clusters, follow this flow:
      1. Set EGO_LICENSE_COREHOURS_CONSOLIDATION=Y on each IBM Spectrum Symphony cluster for which you want to consolidate usage information.
      2. Regularly (for example, monthly) collect the cluster.hostusage file from each cluster, and place the files into a single work directory. Note that the maximum number of days to retain core-hour usage records in the cluster.hostusage file is 35 days (that is, the data is purged every 35 days); therefore, collect the cluster.hostusage file from each cluster at least every 35 days. Alternatively, to retain records for more than 35 days, set the EGO_LICENSE_WORK_FILE_PURGE_DAYS parameter to a value greater than 35 (the default is 2 days).
      3. Run egolic commands to manage this information:
        • Run the egolic merge command to collect usage data for a specific cluster into a cluster.hostusage.cluster_name file. If the cluster.hostusage.cluster_name file already exists, the command merges the new usage data with the existing information.
        • Run the egolic report command to generate usage reports for all clusters. IBM Spectrum Symphony processes the data for the reports by using the values set in the $EGO_CONFDIR/egolic.json configuration file. (If $EGO_CONFDIR is not defined in the environment, the egolic.json file is in the current working directory.) This example egolic.json file shows the format of the file:
          {
              "logLevel": "LOG_INFO",
              "maxLogSizeMB": 100,
              "maxLogRotate":  10,
              "logDir": "${EGO_TOP}/kernel/log",
              "workDir": "${EGO_CONFDIR}/../work/data",
              "perpetualMaxCores": 1
          }
          
      EGO_LICENSE_GPUCORE_ENTITLEMENT
      Maximum number of GPU cores that can be entitled when a cluster is licensed for IBM Spectrum Symphony GPU Harvesting. Default is 0, which indicates that all cloud hosts with the corehoursaudit resource attribute are counted only for GPU core-hour usage. Define this parameter if you want to use a combination of GPU perpetual licenses and GPU variable use licenses for your cloud hosts. When defined, cloud hosts that join the cluster are licensed by GPU cores, up to the specified entitlement. After all GPU perpetual licenses are used, cloud hosts that join the cluster are licensed by GPU core hours.
      EGO_LICENSE_WORK_FILE_PURGE_DAYS
      Maximum number of days to retain core-hour usage records in the cluster.hostusage file; default is 2 days. Update this parameter when you want to retain the records for more than 2 days.
    3. Save your changes.
    4. Restart EGO:
      egosh ego restart
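
If you want to feed the usage figures from step 1 into your own reporting, the sample outputs shown there are simple enough to parse. The following sketch assumes the output has exactly the shape of the samples in this topic (the "Core hours:" and "GPU hours:" lines from egosh entitlement corehours, and the removeVULHost message in the lim log); the real format might differ between versions, and both function names are hypothetical.

```python
import re

def parse_entitlement_output(text):
    """Parse 'Core hours: 20' / 'GPU hours: 1' lines into a dict of ints."""
    usage = {}
    for line in text.splitlines():
        match = re.match(r"\s*(Core|GPU) hours:\s*(\d+)\s*$", line)
        if match:
            usage[match.group(1).lower() + "_hours"] = int(match.group(2))
    return usage

def parse_removal_message(line):
    """Extract host name, active hours, and core-hours from a lim log line.

    Returns None if the line does not look like the sample message.
    """
    match = re.search(
        r"variable use licensing host: (\S+)\. "
        r"This host was active for ([\d.]+) hours and logged (\d+) core hours",
        line,
    )
    if match is None:
        return None
    return {
        "host": match.group(1),
        "active_hours": float(match.group(2)),
        "core_hours": int(match.group(3)),
    }

sample_query = "Core hours: 20\nGPU hours:   1"
sample_log = ("removeVULHost: Removed the variable use licensing host: host1. "
              "This host was active for 3.36 hours and logged 64 core hours.")
print(parse_entitlement_output(sample_query))
print(parse_removal_message(sample_log))
```

Returning None for unrecognized lines lets you scan a whole log file and keep only the removal messages.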

What to do next

Enable the HostFactory service; then, submit workload for cloud-enabled applications. See Starting host factory for cloud bursting. After workload is complete, core-hour usage metrics are logged to the ego.cluster_name.entitlement.acct file. See ego.cluster_name.entitlement.acct file.