# summarize operator

## Syntax

*T* `| summarize`

[*SummarizeParameters*] [[
*Column* `=`

] *Aggregation* [`,`

...]] [
`by`

[
*Column* `=`

] *GroupExpression* [`,`

...]]

## Arguments

*Column:*Optional name for a result column. Defaults to a name derived from the expression.*Aggregation:*A call to an aggregation function such as`count()`

or`avg()`

, with column names as arguments. See the list of aggregation functions.*GroupExpression:*A scalar expression that can reference the input data. The output will have as many records as there are distinct values of all the group expressions.*SummarizeParameters*: Zero or more (space-separated) parameters in the form of*Name*`=`

*Value*that control the behavior. The following parameters are supported:

#### Note

When the input table is empty, the output depends on whether *GroupExpression* is used:

- If
*GroupExpression*is not provided, the output will be a single (empty) row. - If
*GroupExpression*is provided, the output will have no rows.

## Returns

The input rows are arranged into groups having the same values of the `by`

expressions. Then the specified aggregation functions are computed over each group, producing a row for each group. The result contains the `by`

columns and also at least one column for each computed aggregate. (Some aggregation functions return multiple columns.)

The result has as many rows as there are distinct combinations of `by`

values (which may be zero). If there are no group keys provided, the result has a single record.

To summarize over ranges of numeric values, use `bin()`

to reduce ranges to discrete values.

#### Note

- Although you can provide arbitrary expressions for both the aggregation and grouping expressions, it's more efficient to use simple column names, or apply
`bin()`

to a numeric column.

## List of aggregation functions

Function | Description |
---|---|

arg_max() | Returns one or more expressions when the argument is maximized |

arg_min() | Returns one or more expressions when the argument is minimized |

avg() | Returns an average value across the group |

avgif() | Returns an average value across the group (with predicate) |

binary_all_and | Returns aggregated value using the binary `AND` of the group |

binary_all_or | Returns aggregated value using the binary `OR` of the group |

binary_all_xor | Returns aggregated value using the binary `XOR` of the group |

count() | Returns a count of the group |

countif() | Returns a count with the predicate of the group |

dcount() | Returns an approximate distinct count of the group elements |

dcountif() | Returns an approximate distinct count of the group elements (with predicate) |

make_list() | Returns a list of all the values within the group |

make_list_if() | Returns a list of all the values within the group (with predicate) |

make_set() | Returns a set of distinct values within the group |

make_set_if() | Returns a set of distinct values within the group (with predicate) |

max() | Returns the maximum value across the group |

maxif() | Returns the maximum value across the group (with predicate) |

min() | Returns the minimum value across the group |

minif() | Returns the minimum value across the group (with predicate) |

sum() | Returns the sum of the elements within the group |

sumif() | Returns the sum of the elements within the group (with predicate) |

take_any() | Returns a random non-empty value for the group |

take_anyif() | Returns a random non-empty value for the group (with predicate) |

## Aggregates default values

The following table summarizes the default values of aggregations:

Operator | Default value |
---|---|

`count()` , `countif()` , `dcount()` , `dcountif()` |
0 |

`make_bag()` , `make_bag_if()` , `make_list()` , `make_list_if()` , `make_set()` , `make_set_if()` |
empty dynamic array ([]) |

All others | null |

When using these aggregates over entities which includes null values, the null values will be ignored and won't participate in the calculation (see examples below).

## Examples

This example simply counts all the events in the `events`

table.

```
events
| project original_time
| where original_time > ago(24h)
| summarize NumEvents=count()
```

### Results

`NumEvents` |
---|

`4163038` |

### Next Example

This example returns the total count of events for the last 24 hours, binned into 24hr periods.

```
events
| project original_time
| where original_time > ago(24h)
| summarize NumEvents=count() by HourBucket=bin(original_time, 60m)
| sort by HourBucket desc
| take 5
```

### Results

`HourBucket` |
`NumEvents` |
---|---|

`2022-06-12 23:00:00.000` |
`8374746` |

`2022-06-12 22:00:00.000` |
`8214746` |

`2022-06-12 21:00:00.000` |
`7374746` |

`2022-06-12 20:00:00.000` |
`9938733` |

`2022-06-12 19:00:00.000` |
`0` |

`2022-06-12 18:00:00.000` |
`1` |

A table that shows how many events were ingested in each one hour interval for the last 24 hours. The table will contain two columns, `EventCount`

which is the count of the events for the hour, and `HourBucket`

is binned
time window of aggregation for the events.

### Unique combination

Determine what unique combinations of event
`name`

and `severity`

there are in a table. There are no aggregation functions, just group-by keys. The output will just show the columns for those results:

```
events
| project severity, name, original_time
| where original_time > ago(24h)
| where severity > 6
| summarize by name, severity
| sort by name asc, severity desc
| take 5
```

### Results

`name` |
`severity` |
---|---|

`(Primary) Failover cable OK` |
`10` |

`(Primary) Failover cable Not OK` |
`6` |

`(Primary) Failover message block alloc failed` |
`7` |

`30419 Internet Explorer 8 XSS Attack` |
`8` |

`A fatal alert was generated and sent to the remote endpoint` |
`7` |

### Minimum and maximum timestamp

Finds the minimum and maximum timestamp of all events in the `events`

table. There is no group-by clause, so there is just one row in the output:

```
events
| project severity, original_time
| where original_time > ago(24h)
| where severity > 6
| summarize MinTime=min(original_time), MaxTime=max(original_time)
```

### Results

`Min` |
`Max` |
---|---|

`1975-06-09 09:21:45` |
`2015-12-24 23:45:00` |

### Distinct Count

Create a row for each `name`

, showing a count of distinct `src_ip`

the events ingested.

```
events
| project name, severity, original_time, src_ip
| where original_time > ago(24h)
| where severity > 6 and isnotempty(name) and isnotempty(src_ip)
| summarize SrcIpCount=count_distinct(src_ip) by Name=name
| order by SrcIpCount desc
| take 5
```

### Results

Name | SrcIpCount |
---|---|

The processing of Group Policy failed | 385 |

Content Protection Violation | 367 |

Web Server Enforcement Violation | 312 |

Openfire Jabber server authentication bypass | 305 |

Deny protocol reverse path check | 290 |