Querying performance data by using the /perfmon/data request

The GET /perfmon/data request retrieves performance details from the cluster by means of queries. A query is written in the query language of the performance monitoring tool.

The types of queries that can be used to get the performance details are as follows:

Query by metric specification
Query by key specification
Query by group specification

Query by metric specification

A metric specification is a list of metrics and operations on metrics that are separated by commas. For example, metrics cpu_idle,cpu_system and sum(cpu_user).

Synopsis: metrics metric1,... [key_filter_specs] time_filter [grouping_spec] bucket_spec

Metric names are based on the types of sensors that are available in the performance monitoring tool. For more information on the available performance metrics, see List of performance metrics. A query can contain metrics from different sensors. You can perform the following operations on the performance data that is retrieved by the query:

sum
avg
max
min
rate

For example, sum(netdev_bytes_r) returns the sum of all netdev_bytes_r values covered by the query. The rate operation returns the rate of change of the data that is retrieved by the query. For example, rate(netdev_bytes_r) returns bytes received per second. You can vary the size of the bucket.
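The synopsis for a query by metric specification can be sketched as a small string builder. This is a minimal illustration and not part of the performance monitoring tool: the function name build_metrics_query and its parameters are hypothetical, and the time filter is passed in as an already formatted string.

```python
def build_metrics_query(metrics, time_filter, bucket_size,
                        key_filter=None, group_by=None):
    """Assemble a query string in the order given by the synopsis:
    metrics metric1,... [key_filter_specs] time_filter [grouping_spec] bucket_spec

    Hypothetical helper for illustration only.
    """
    parts = ["metrics", ",".join(metrics)]
    if key_filter:
        # Assignments are comma-separated; the commas imply a logical AND.
        assignments = ",".join(f"{name}={value}"
                               for name, value in key_filter.items())
        parts.append(f"from {assignments}")
    parts.append(time_filter)
    if group_by:
        parts.append("group_by " + ",".join(group_by))
    parts.append(f"bucket_size {bucket_size}")
    return " ".join(parts)


query = build_metrics_query(["sum(netdev_bytes_r)"], "last 50", 1,
                            group_by=["netdev_name"])
print(query)
```

The optional parts are emitted only when they are supplied, which mirrors the square brackets in the synopsis.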

Query by key specification

You can use various keys that identify the requested metrics. A key is a set of metric names that are separated by the “|” character. Keys are based on available sensor instances in the performance monitoring tool setup. For more information on how to find out the key names by using the mmperfmon query, see mmperfmon command.

Synopsis: key key1,... [key_filter_specs] time_filter [grouping_spec] bucket_spec

Regular expressions can be used in a key to increase the range of data returned. For example, key device[1-4]|CPU|cpu_idle.
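To illustrate how a regular expression widens the set of matching keys, the following sketch compares a key pattern with candidate keys element by element. It assumes that each element between the "|" separators is matched as a whole against the corresponding pattern element; the actual matching rules of the performance monitoring tool may differ, and the candidate keys are made up for the example.

```python
import re


def key_matches(pattern, key):
    """Return True if every "|"-separated element of key fully matches
    the corresponding element of pattern (assumed matching rule)."""
    pattern_parts = pattern.split("|")
    key_parts = key.split("|")
    if len(pattern_parts) != len(key_parts):
        return False
    return all(re.fullmatch(p, k) is not None
               for p, k in zip(pattern_parts, key_parts))


# Hypothetical keys; only device1 to device4 fall inside the range [1-4].
keys = [
    "device1|CPU|cpu_idle",
    "device3|CPU|cpu_idle",
    "device7|CPU|cpu_idle",
]
matched = [k for k in keys if key_matches("device[1-4]|CPU|cpu_idle", k)]
print(matched)
```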

Query by group specification

The group query retrieves metric data by using buckets of given time length, for a given time period, and filtered keys. Optionally, grouping can be done according to the similarities in the keys. For example, you can specify a sensor name to return all metrics for that group of sensors.

Synopsis: group sensor [key_filter_specs] time_filter [grouping_spec] bucket_spec

The following arguments are mandatory in a query by group:

get_list: Specification of a list of requested metrics, keys, or a sensor.
time_filter: Time span of the request.
bucket_spec: Time granularity of the returned data.

Using filters and grouping in a query

You can use various filters and grouping techniques to get the exact details that you are looking for.

Key filter specification

Each metric can contain a key filter specification that can be used to narrow down the results that are returned for the metrics specified in the query. If the query is specified by a key specification, the key filter specification is ignored.

The key filter specification consists of the keyword “from”, followed by a sequence of assignments of the form metric_name = string_value, separated by commas. The metric_name must be an identifier of metric type SEMTYPE_IDENTIFIER. The commas imply a logical AND operation of the assignments, and the assignments apply to all keys in which the metric_name is used. The string_value can be a regular expression, which permits additional filtering of the values returned by the query. Such a key filter specification is given as follows:

... from netdev_name=eth[01] ...

For example, the metric cpu_user has the following key structure: node|sensor|metric_name. In this case, an instance of cpu_user can have the key server1|CPU|cpu_user. If the key_filter_specs is omitted, then the values for the metric cpu_user are searched by using a key that contains wildcards, for example, *|*|cpu_user. This returns values from all reporting CPU sensors. If the key_filter_specs is present, for example, from node=server1, then only the value from the sensor on server1 is returned (the key is server1|*|cpu_user).
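The way a key filter narrows the search key can be sketched as follows. This is only an illustration of the behavior described above; the function search_key and the list form of the key structure are hypothetical and do not correspond to an interface of the performance monitoring tool.

```python
def search_key(key_structure, metric, key_filter=None):
    """Build the key used for the search: filtered elements take the
    filter value, all other elements become the wildcard "*".

    Hypothetical helper for illustration only.
    """
    key_filter = key_filter or {}
    elements = []
    for element in key_structure:
        if element == "metric_name":
            elements.append(metric)
        else:
            elements.append(key_filter.get(element, "*"))
    return "|".join(elements)


structure = ["node", "sensor", "metric_name"]
unfiltered = search_key(structure, "cpu_user")
filtered = search_key(structure, "cpu_user", {"node": "server1"})
print(unfiltered)
print(filtered)
```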

Time filter specification

Each query must contain a time period for which data is to be returned. The time period for the query can be specified as a time span such as last n buckets, last t seconds, and the current value. The last ‘n’ filter returns the last ‘n’ buckets of metric data. The duration ‘t’ filter returns a list of buckets of metric data that covers last ‘t’ seconds of time. The ‘now’ filter returns the current value (last bucket) for given metrics.

Only one of these types can be used at a time.

A time span consists of a starting time (tstart) and an end time (tend), both expressed either as “unix time” in seconds or as a combination of date and time specification. For example, “2012-11-10 13:00:00”.

If only tstart is specified, then the query covers all data from that time until now. If only tend is specified, a default number of buckets up to the specified time is returned. The default number of buckets is 12.

The time values specified as part of the filter specification can be rounded to fit the bucket sizes that are used internally to store the metrics.
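The rule that only one type of time filter can be used at a time can be sketched as a small validating builder. The function name and parameters are hypothetical, and the sketch assumes that an explicit tstart/tend span counts as one filter type that cannot be combined with last, duration, or now.

```python
def build_time_filter(last=None, duration=None, now=False,
                      tstart=None, tend=None):
    """Return the time filter part of a query, allowing exactly one type.

    Hypothetical helper; treating tstart/tend as a single type that
    excludes the others is an assumption.
    """
    span = tstart is not None or tend is not None
    chosen = [last is not None, duration is not None, bool(now), span]
    if sum(chosen) != 1:
        raise ValueError("specify exactly one type of time filter")
    if last is not None:
        return f"last {last}"
    if duration is not None:
        return f"duration {duration}"
    if now:
        return "now"
    parts = []
    if tstart is not None:
        parts.append(f"tstart {tstart}")
    if tend is not None:
        parts.append(f"tend {tend}")
    return " ".join(parts)


print(build_time_filter(last=10))
print(build_time_filter(tstart="2012-11-10 08:00:00",
                        tend="2012-11-10 18:00:00"))
```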

Grouping specification

A grouping is used to split results of metric operations based on specified key metric names. It consists of the keyword group_by followed by a comma-separated list of metric names. Grouping is done based on the key value that is associated with the metric name specified in the grouping specification. A metric name in the key_filter_spec must be of type SEMTYPE_IDENTIFIER.

An example for grouping: If the metric operation is sum(netdev_bytes_r) and the grouping key is netdev_name, then the result is a list of sums of the netdev_bytes_r values, one for each network device name. For example, with the devices eth0, eth1, and lo0, the result contains three sums.
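The effect of grouping on a sum operation can be sketched with a few made-up samples. The sample records, node names, and values below are illustrative assumptions, not output of the performance monitoring tool.

```python
from collections import defaultdict


def sum_grouped(samples, metric, group_key):
    """Sum the metric values separately for each value of group_key."""
    totals = defaultdict(int)
    for sample in samples:
        totals[sample[group_key]] += sample[metric]
    return dict(totals)


# Hypothetical samples reported by two nodes.
samples = [
    {"node": "server1", "netdev_name": "eth0", "netdev_bytes_r": 100},
    {"node": "server2", "netdev_name": "eth0", "netdev_bytes_r": 50},
    {"node": "server1", "netdev_name": "eth1", "netdev_bytes_r": 200},
    {"node": "server1", "netdev_name": "lo0", "netdev_bytes_r": 5},
    {"node": "server2", "netdev_name": "lo0", "netdev_bytes_r": 7},
]
grouped = sum_grouped(samples, "netdev_bytes_r", "netdev_name")
print(grouped)
```

Without the grouping, the same samples would collapse into a single sum over all devices.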

Bucket specification

A query must contain a bucket specification. It indicates the time interval, in seconds, over which the data is accumulated. For example, specifying bucket_size 30 returns the data accumulated in 30-second intervals. Depending on the type of metric, metrics are accumulated in different ways. For example, averages are averaged and counters are summed up.
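The accumulation of raw samples into buckets can be sketched as follows. The function, the sample timestamps, and the alignment of buckets to multiples of the bucket size are assumptions made for illustration; the internal accumulation of the performance monitoring tool is not described here.

```python
def accumulate(samples, bucket_size, mode):
    """Accumulate (timestamp, value) samples into buckets of bucket_size
    seconds; mode "sum" suits counters, any other mode averages."""
    buckets = {}
    for timestamp, value in samples:
        start = timestamp - timestamp % bucket_size  # bucket start time
        buckets.setdefault(start, []).append(value)
    result = {}
    for start in sorted(buckets):
        values = buckets[start]
        if mode == "sum":
            result[start] = sum(values)
        else:
            result[start] = sum(values) / len(values)
    return result


# Hypothetical samples: (seconds, value)
samples = [(0, 10), (10, 20), (20, 30), (30, 40), (45, 60)]
summed = accumulate(samples, 30, "sum")
averaged = accumulate(samples, 30, "avg")
print(summed)
print(averaged)
```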

Query samples

metrics cpu_user from node=anaphera-dev2 last 10 bucket_size 1

Gets metric cpu_user for the instance where the key element node equals anaphera-dev2 and returns the last 10 buckets of 1-second size.

metrics cpu_user, cpu_system, cpu_idle from node=anaphera-dev2 tstart 2012-11-10 08:00:00 tend 2012-11-10 18:00:00 bucket_size 30

Gets metrics cpu_user, cpu_system, and cpu_idle for node anaphera-dev2 from the given start time to the given end time, using a bucket size of 30 seconds.

metrics sum(cpu_user), sum(netdev_bytes_r) last 30 bucket_size 10

Gets the sum of cpu_user and netdev_bytes_r metrics for all instances and the last 30 buckets by using a bucket size of 10 seconds.

metrics sum(netdev_bytes_r) last 50 group_by netdev_name bucket_size 1

Gets sum of netdev_bytes_r (last 50 buckets of size 1 second), grouped by key attribute netdev_name.

key anaphera-dev2|CPU|cpu_user last 10 bucket_size 1

Gets metric cpu_user for the instance specified by the explicit key, using a bucket size of 1 second for the last 10 seconds (the result contains 10 buckets).

metrics cpu_user duration 210 bucket_size 100

Gets metric cpu_user for all nodes for a duration that covers the last 210 seconds, using a bucket size of 100 seconds. Because 210 seconds do not fit into two 100-second buckets, the result consists of three buckets.
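The bucket count for a duration filter follows from rounding up the ratio of duration to bucket size. The helper below is a hypothetical illustration of that arithmetic.

```python
def buckets_for_duration(duration, bucket_size):
    """Number of buckets needed to cover duration seconds (rounded up)."""
    return -(-duration // bucket_size)  # ceiling division on integers


count = buckets_for_duration(210, 100)
print(count)
```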

group CPU last 30 bucket_size 1

Gets all metrics of the CPU sensor for all nodes, using a bucket size of 1 second for the last 30 buckets (equal to the last 30 seconds).
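Because a query contains spaces and other special characters, it must be URL-encoded before it is sent with the GET request. The following sketch only builds the request URL and does not contact a server. The base URL is a hypothetical placeholder, and the name of the URL parameter (query) is an assumption; check the API reference of your installation for the exact endpoint path and parameter name.

```python
from urllib.parse import quote, urlencode


def perfmon_url(base_url, query):
    """Build a GET URL for the /perfmon/data request.

    base_url is a hypothetical placeholder and the parameter name
    "query" is an assumption.
    """
    # quote_via=quote encodes spaces as %20 instead of "+".
    encoded = urlencode({"query": query}, quote_via=quote)
    return f"{base_url}/perfmon/data?{encoded}"


url = perfmon_url("https://gui.example.com", "group CPU last 30 bucket_size 1")
print(url)
```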