
hll() (aggregation function)

The hll() function estimates the number of unique values in a set of values. Used within the summarize operator, it calculates the intermediate results of a dcount aggregation for each group of data.

Read about the underlying algorithm (HyperLogLog) and the estimation accuracy.

Note: Use this function with the summarize operator.

Use the hll_merge function to merge the results of multiple hll() functions. Use the dcount_hll function to calculate the number of distinct values from the output of the hll() or hll_merge functions.
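The following sketch shows one way these three functions might be chained. It assumes the events table and the data_source_name and original_time columns from the example later on this page; the column names hll_sources and merged are hypothetical and chosen only for illustration.

```kql
// Build one intermediate HLL state per 10-minute bin.
events
| summarize hll_sources = hll(data_source_name) by bin(original_time, 10m)
// Merge the per-bin states into a single state covering all bins.
| summarize merged = hll_merge(hll_sources)
// Convert the merged state into an estimated distinct count.
| project distinct_sources = dcount_hll(merged)
```

Merging the stored states avoids rescanning the raw rows when the same data needs to be rolled up over a wider time range.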

Syntax

hll(expr [, accuracy])

Parameters

Name | Type | Required | Description
expr | string | Yes | The expression used for the aggregation calculation.
accuracy | int | No | The value that controls the balance between speed and accuracy. If unspecified, the default value is 1. For supported values, see Estimation accuracy.
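As an illustration of the optional accuracy parameter, the sketch below passes an explicit value as the second argument. It assumes that 2 is among the supported values listed under Estimation accuracy, and it reuses the events table and columns from the example on this page.

```kql
// Same aggregation as the default call, but with an explicit accuracy setting.
// The value 2 is an assumption; check Estimation accuracy for supported values.
events
| summarize hll(data_source_name, 2) by bin(original_time, 10m)
```

Omitting the second argument is equivalent to passing the default value of 1.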

Returns

Returns the intermediate results of distinct count of expr across the group.
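Because the returned value is an intermediate state rather than a number, it usually needs to be passed to dcount_hll before it can be read. A minimal sketch, assuming the events table and columns used in the example on this page (the names hll_state and estimated_sources are hypothetical):

```kql
// hll() returns an intermediate state for each group, not a count.
events
| summarize hll_state = hll(data_source_name) by bin(original_time, 10m)
// dcount_hll() turns each state into an estimated distinct count.
| project original_time, estimated_sources = dcount_hll(hll_state)
```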

Example

The following example uses the hll() function to calculate the intermediate results for estimating the number of unique values in the data_source_name column within each 10-minute bin of the original_time column.

events
| summarize hll(data_source_name) by bin(original_time, 10m)
| take 1

The results table shows only the first row.

Results

original_time data_source_name data_source_type_name name user_id low_level_categories src_ip src_port dst_ip dst_port severity event_uuid payload
1682461679682 microsoftWindowsSource6 Microsoft Windows Security Event Log Process Create [8110] 0.0.0.0 0.0.0.0 2 2b02dd50-241e-41cf-9257-1febd36c0140 <13>Feb 10 13:53:35 microsoftWindowsSource6 AgentDevice=WindowsLog AgentLogFile=Microsoft-Windows-Sysmon/Operational...