
Dynamic Resource Management with KVM for IBM z Systems and LinuxONE

  

Delivered with LinuxONE™ and KVM for IBM z Systems is a new management tool that protects capacity for your most critical work. You no longer need to run your hypervisor at low utilization, dedicating valuable resources that 'might' be needed one day, just to protect your most important work. Resource management can now be automated with KVM for IBM z Systems, where your performance goals and business importance levels drive resource allocation.

 

Overview

Hypervisor Performance Manager on KVM for IBM z Systems (zHPM) manages virtual server resources across a collection of virtual servers while the system runs at a high utilization rate. This real-time resource rebalancing works to ensure that the most important work isn't delayed by resource contention.

zHPM provides the following base functionality:

  • Monitoring capabilities, which allow an administrator to understand if specified performance goals are being achieved.
  • Dynamic management of physical CPU resources allocated to active virtual servers. 

 

Dynamic resource management is performed by grouping together virtual servers that support an overall business function (e.g. a multi-tier application).

  • This grouping of virtual servers is known as a workload resource group. Each workload resource group defines a performance policy. 
  • A performance policy assigns a business importance to its workload resource group, and contains a set of service classes.
  • Service classes define performance objectives for virtual servers.  These are based on how much CPU delay (caused by resource contention) the virtual server can tolerate.
  • Business importance defines the priority for achieving performance objectives.
  • Virtual servers within a workload resource group are assigned to service classes using classification filters.

Resources will be transferred between workload resource groups to aid virtual servers in reaching their performance objectives. Groups with lower importance will have their resources moved to ones with higher importance.

 

Example:

 

Three workload resource groups have been defined, and are named 'Gold', 'Silver', and 'Bronze'. The most important business functions run on virtual servers defined to the 'Gold' workload resource group, with a business importance value of "Highest". Less important business functions run on virtual servers defined to the 'Silver' workload resource group, with a business importance value of "Medium". The least important business functions run on virtual servers defined to the 'Bronze' workload resource group, with a business importance value of "Low".

 

Each virtual server is additionally classified by a service class, where each tier of the business function has a performance goal. If virtual server VM1 in the 'Gold' workload resource group experiences delay due to resource contention, it will acquire additional CPU resources from virtual servers in the 'Bronze' workload resource group. If these resources are insufficient, zHPM will then take resources from the 'Silver' workload resource group. This continues until either the 'Gold' workload resource group reaches its performance objective, or the virtual servers in the lower business importance workloads reach a floor of CPU resources. zHPM helps workloads reach their performance objectives by starting with the most important work first.
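
The donation order in this example can be sketched in a few lines of shell. This is only an illustration of the ordering described above, not zHPM's actual algorithm: the function name donate, the notion of CPU "shares", and all numbers are hypothetical.

```shell
# Illustration only: hypothetical CPU shares, not zHPM's real algorithm.
# donate NEED BRONZE SILVER FLOOR
#   NEED    extra shares 'Gold' needs to reach its objective
#   BRONZE  shares currently held by 'Bronze'
#   SILVER  shares currently held by 'Silver'
#   FLOOR   minimum shares every donor group keeps
donate() {
    need=$1
    bronze=$2
    silver=$3
    floor=$4

    # 'Bronze' (lowest importance) donates first, down to its floor.
    avail=$((bronze - floor))
    if [ "$avail" -lt 0 ]; then avail=0; fi
    if [ "$avail" -gt "$need" ]; then from_bronze=$need; else from_bronze=$avail; fi
    need=$((need - from_bronze))

    # 'Silver' donates only what 'Gold' still lacks, also down to its floor.
    avail=$((silver - floor))
    if [ "$avail" -lt 0 ]; then avail=0; fi
    if [ "$avail" -gt "$need" ]; then from_silver=$need; else from_silver=$avail; fi
    need=$((need - from_silver))

    echo "bronze=$from_bronze silver=$from_silver unmet=$need"
}
```

For instance, donate 30 40 50 20 prints bronze=20 silver=10 unmet=0: 'Bronze' gives up everything above its floor, and 'Silver' covers the remainder.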

 
 
 

Configuration of zHPM on the Hypervisor

Add users to the right group

Before diving in and using the zHPM APIs or command-line interface, a zHPM user must be set up. There are two zHPM user groups, zhpmuser and zhpmadm. A user who wants to interact with zHPM must be a member of one of these groups.

 

The zhpmuser group provides read privileges, allowing a user to perform GET requests with the API or issue display commands from the command-line client. The zhpmadm group provides both read and write privileges with the zHPM API and command-line client. The user should be added to the appropriate group based on the desired role. This can be accomplished via the usermod command. In the following example, the user martian is added to the zhpmadm group:

sudo usermod -a -G zhpmadm martian
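
To confirm that the change took effect, the user's groups can be inspected with the standard id command. The helper names member_of and zhpm_role below are made up for this sketch; only id and the two group names come from the text above.

```shell
# member_of GROUP_LIST GROUP
# Succeeds if GROUP appears as a whole word in the space-separated GROUP_LIST.
member_of() {
    case " $1 " in
        *" $2 "*) return 0 ;;
        *) return 1 ;;
    esac
}

# zhpm_role USER
# Prints the zHPM privilege level implied by USER's group membership.
zhpm_role() {
    # With a user name, id reads the group database rather than the
    # group set of the current login session.
    user_groups=$(id -nG "$1") || return 2
    if member_of "$user_groups" zhpmadm; then
        echo "read-write"
    elif member_of "$user_groups" zhpmuser; then
        echo "read-only"
    else
        echo "none"
        return 1
    fi
}
```

Running zhpm_role martian after the usermod command above should report read-write. Sessions that martian already has open keep their old group set until the next login.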

 

Managing the zhpmd service 

Once a user has been configured, the status of the zHPM service (zhpmd) can be checked via the following command:

sudo systemctl status zhpmd

 

If the zHPM service is not running, it can be started with the following command:

sudo systemctl start zhpmd

 

To have the zHPM service start automatically during system boot, run the following command:

sudo systemctl enable zhpmd
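
The three steps above can be folded into one idempotent helper. This is a minimal sketch: the function name ensure_zhpmd is made up here, and it relies only on the standard systemctl is-active and is-enabled subcommands.

```shell
# Start zhpmd if it is not active, and enable it at boot if it is not enabled.
# Safe to run repeatedly: it does nothing when both conditions already hold.
ensure_zhpmd() {
    if ! systemctl is-active --quiet zhpmd; then
        sudo systemctl start zhpmd || return 1
    fi
    if ! systemctl is-enabled --quiet zhpmd; then
        sudo systemctl enable zhpmd || return 1
    fi
}
```

Calling ensure_zhpmd from a provisioning script leaves the service running and enabled without restarting it if it is already up.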

 

Using zHPM

Create a new group of virtual server resources

Before adding virtual servers to a workload resource group, one has to be created. This can be done with either the API or command-line interface.

 

The following command creates a new workload resource group:

zhpm wrg-create --wrg-name "Gold" 

 

Update the performance policy of a workload resource group

Defining the business importance of the workload resource group is done within the performance policy. Workload resource groups with higher importance and higher goals will experience lower levels of CPU delay, allowing them to achieve their expected performance. For the newly created "Gold" workload resource group, let's configure a performance policy so that it expects low CPU delay, or "highest" performance.

 

Performance policies are defined as a JSON object.

{
    "performance-policy": {
        "perf-policy-info": {
            "name": "Gold",
            "description": "Most important business applications",
            "business-importance": "highest"
        },
        "service-classes": [{
            "name": "HTTP",
            "description": "HTTP Service Class",
            "business-importance": "highest",
            "velocity-goal": "fast",
            "cpu-critical": false,
            "virtual-server-name-filters": [
                "VM1", "VM2"
            ]
        }, {
            "name": "AppServer",
            "description": "Most Important Application Server",
            "business-importance": "highest",
            "velocity-goal": "fast",
            "cpu-critical": false,
            "virtual-server-name-filters": [
                "VM3"
            ]
        }, {
            "name": "Database",
            "description": "Most Important Database",
            "business-importance": "highest",
            "velocity-goal": "fastest",
            "cpu-critical": false,
            "virtual-server-name-filters": [
                "VM4"
            ]
        }]
    }
}

 

Save this JSON object in a file named gold_policy.json. The following command updates the "Gold" workload resource group with the specified performance policy, setting a business importance of "highest".

zhpm wrg-update --wrg-name "Gold" --perf-policy-file gold_policy.json
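
A hand-edited policy file is easy to break with a stray comma or brace, so it can be worth checking that the file parses before handing it to zhpm. The sketch below assumes python3 is installed on the hypervisor and uses its bundled json.tool module as a syntax check; the wrapper name apply_policy is made up for illustration. It validates JSON syntax only, not whether zHPM accepts the field names or values.

```shell
# apply_policy WRG_NAME POLICY_FILE
# Refuses to call zhpm if POLICY_FILE is not well-formed JSON.
apply_policy() {
    # Assumption: python3 is available. json.tool exits non-zero and
    # reports the parse error when the file is malformed.
    python3 -m json.tool "$2" > /dev/null || {
        echo "refusing to apply $2: not valid JSON" >&2
        return 1
    }
    zhpm wrg-update --wrg-name "$1" --perf-policy-file "$2"
}
```

With this in place, apply_policy Gold gold_policy.json behaves like the command above but stops early on a syntax error.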

 

Add virtual servers to a workload resource group

Virtual servers are automatically discovered by the zHPM service, and defined within a default workload resource group. To list all the known virtual servers, issue the following command:

zhpm vs-display

 

For each virtual server you want to add to a workload resource group, use the following command:

zhpm vs-wrg-add --wrg-name "Gold" --vs-name "VM1"
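
When several virtual servers belong to the same group, the command can be wrapped in a loop. The function name add_to_wrg is made up for this sketch; it uses only the vs-wrg-add options shown above, and the names VM1 to VM4 are the ones from the policy's name filters.

```shell
# add_to_wrg WRG_NAME VS_NAME...
# Adds each named virtual server to the workload resource group,
# stopping at the first failure.
add_to_wrg() {
    wrg=$1
    shift
    for vs in "$@"; do
        zhpm vs-wrg-add --wrg-name "$wrg" --vs-name "$vs" || {
            echo "failed to add $vs to $wrg" >&2
            return 1
        }
    done
}
```

For the policy defined earlier, this would be called as add_to_wrg Gold VM1 VM2 VM3 VM4.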

 

 

Enable CPU Management

With a performance policy in place, and virtual servers added to a workload resource group, the system is ready to actively manage resource allocation. The next step is to enable CPU management based on the defined policy with the following command:

zhpm config --cpu-mgmt on

 

This will enable active resource management, monitoring the velocity of virtual servers. 

 

Monitor performance of workload resource groups

There are numerous metrics available through the zHPM APIs. For a workload resource group, the performance index (PI) is the metric used to determine whether a collection of virtual servers is meeting the defined performance goal. The PI is a measure of the average performance calculated over the last 15-second interval. Actual performance and goal performance are each displayed as "fastest", "fast", "moderate", "slow", or "slowest". Comparing the two is the easiest way to see whether resource adjustments are protecting the highest priority work, so that it achieves its performance goals even on a system that is 100% utilized. To view these metrics, use the following command:

zhpm metrics --wrg-name "Gold"

Wrg-Id                               Sc-Name   Pi Act-Perf Cpu-UT Cpu-DT Goal    
------------------------------------ -------- --- -------- ------ ------ --------
dfb2cba1-8750-4285-a401-f2b6e58649bf Database 0.6 fastest      86     0  fastest
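
For scripted checks, the table can be filtered with awk to list only the service classes that are running slower than their goal. This is a rough sketch under a few assumptions: the output has exactly the seven columns shown above, service class names contain no spaces, and the five labels rank from "slowest" to "fastest" as their names suggest. The function name below_goal is made up for illustration.

```shell
# below_goal
# Reads `zhpm metrics` table output on stdin and prints one line for each
# service class whose actual performance ranks below its goal.
below_goal() {
    awk '
        BEGIN {
            # Assumed ordering of the labels, slowest to fastest.
            rank["slowest"] = 1; rank["slow"] = 2; rank["moderate"] = 3
            rank["fast"] = 4; rank["fastest"] = 5
        }
        # Skip the header row and the dashed separator row.
        $1 != "Wrg-Id" && $1 !~ /^-+$/ && NF == 7 {
            if (($4 in rank) && ($7 in rank) && rank[$4] < rank[$7])
                print $2 ": actual=" $4 ", goal=" $7
        }'
}
```

Piping the sample output above through it, as in zhpm metrics --wrg-name "Gold" | below_goal, prints nothing, because the Database service class is meeting its "fastest" goal.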

 

 


Additional Info

For information on other zHPM commands, consult the man pages (man zhpm) or see the official documentation.

 

The steps outlined above are enough to get started with automated resource management on KVM for IBM z Systems. For the curious, there are many other zHPM features, metrics, and APIs to better control and understand your infrastructure.