OVER clause for OLAP window expressions
The OVER clause defines the result set on which an OLAP window expression is performed.
Syntax
OVER clause |--+-| OVER clause for numbering functions |---+----------------| +-| OVER clause for ranking functions |-----+ '-| OVER clause for aggregation functions |-' OVER clause for numbering functions |--OVER--(--+-----------------------------+--+-------------------------+--)--| '-| Window PARTITION clause |-' '-| Window ORDER clause |-' OVER clause for ranking functions |--OVER--(--+-----------------------------+--| Window ORDER clause |--)--| '-| Window PARTITION clause |-' OVER clause for aggregation functions |--OVER--(--+-----------------------------+---------------------> '-| Window PARTITION clause |-' >--+------------------------------------------------------+--)--| '-| Window ORDER clause |--+-------------------------+-' '-| Window Frame clause |-' Window PARTITION clause .-,-------------. V | |--PARTITION BY----partition_key-+------------------------------| Window ORDER clause .-,-------------------------------------------. V .-ASC--. | |--ORDER BY----sorting_key--+------+--+------------------+-+----| '-DESC-' '-NULLS--+-FIRST-+-' '-LAST--' Window Frame clause |--+-RANGE-+--+-+-UNBOUNDED PRECEDING-+----------------------------------------+--| '-ROWS--' | +-offset--PRECEDING---+ | | '-CURRENT ROW---------' | '-BETWEEN--+-UNBOUNDED PRECEDING-+--AND--+-UNBOUNDED FOLLOWING-+-' +-offset--PRECEDING---+ +-offset--PRECEDING---+ +-offset--FOLLOWING---+ +-offset--FOLLOWING---+ '-CURRENT ROW---------' '-CURRENT ROW---------'
Element | Description | Restrictions | Syntax |
---|---|---|---|
offset | Unsigned integer that represents the offset from the position of the current row | Cannot be negative. If zero, specifies the current row | Literal integer. |
partition_key | Column name, alias, or constant expression by which to partition rows | Must be in the select list of the Projection clause | Column Expressions |
sorting_key | Column name, alias, or constant expression by which to sort rows | Same restrictions as for partition_key. For RANGE window frames, only a single sorting key is allowed, and the data type must be numeric, DATE, or DATETIME. | Column Expressions |
If the OVER clause is empty, you must still include the empty parentheses.
Window PARTITION clause
An OLAP window partition is a subset of the rows that are returned by a query. Each partition is defined by one or more column expressions in the PARTITION BY specification of the OVER clause that defines the window. The database server applies the specified OLAP window function to all of the rows in each window partition. If no partitions are defined in the OVER clause, the window function is applied to every row in the result set of the query.
Window ORDER clause
The database server sorts the rows in each window partition according to the sort key (or multiple sort keys) in the window ORDER clause. If you specify no ascending (ASC) or descending (DESC) order, ASC is the default. If no ORDER clause is specified, the order of the qualifying rows is the order in which the rows were retrieved.
Window Frame clause
The window Frame clause can return subsets, called aggregation groups, of the rows in each window partition. A window frame is defined by a specific number of rows or by a range of values.
- Row-based window frame
- The
ROWS keyword creates a row-based window frame that consists of a specific
number of rows that precede or follow the current row, or both. The
offset represents the number of rows to return. The following example
returns seven rows that include the six rows that precede the current
row:
AVG(price) OVER (ORDER BY year, day ROWS BETWEEN 6 PRECEDING AND CURRENT ROW)
- In a row-based window frame clause, offsets are expressed as unsigned integers because the keyword FOLLOWING specifies a positive offset from the current rows, and the keyword PRECEDING specifies a negative offset from the current row. The keyword UNBOUNDED refers to all of the rows from the current row to the limit of the window partition. As the first term after the ROWS keyword in a window Frame specification, UNBOUNDED PRECEDING means that the starting boundary is the first row in the partition, and UNBOUNDED FOLLOWING means that the ending boundary is the last row in the partition.
- Value-based window frame
- The RANGE keyword creates a value-based frame clause that consists
of the current row plus the rows that meet the criteria that is set
by the sorting key in the ORDER clause and fit into the specified
offset. The offset represents the number of units of the data type
of the sorting key. The sorting key must be a numeric, DATE, or DATETIME
data type. For example, if the sorting key is a DATE data type, the
offset represents a specific number of days. The following example
returns the count of the number of rows that have a ship date within
2 days of the current row plus the current row:
COUNT(*) OVER (ORDER BY ship_date RANGE BETWEEN 2 PRECEDING AND 2 FOLLOWING)
- Value-based window frames define rows within a window partition that contain a specified range of numeric values. The window ORDER clause of the OVER function defines the numeric, DATE, or DATETIME column to which the RANGE specification is applied, relative to the current row value for that column. Only a single sorting key is allowed in the ORDER clause of value-based window frames.
In both the row-based and value-based cases, the OLAP function is calculated on the contents of this window frame, rather than the fixed contents of the whole partition. The window frame does not need to contain the current row. For example, the following specification defines a window frame that contains only the row before the current row:
ROWS BETWEEN 1 PRECEDING AND 1 PRECEDING
ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
Example: SUM function without a window frame
The following query returns the sales by quarter for one year and the cumulative sum of the sales by quarter.
SELECT sales, SUM(sales) OVER (ORDER BY quarter)
FROM sales WHERE year = 2013
sales (sum)
120 120
135 255
127 382
153 535
The sum of the sales for the fourth quarter is equal to the sales in all four quarters.
Because the query does not include a window frame clause, the SUM function operates on the entire result set, as specified by the FROM clause.
Example: Row-based window frame
The following query returns players that are partitioned by teams and ordered by the number of points. Within each partition, the number of points for the player and the previous player are averaged:
SELECT team, player, points,
AVG(points) OVER(PARTITION BY team ORDER BY points
ROWS 1 PRECEDING AND CURRENT ROW) AS olap_avg
FROM points;
TEAM PLAYER POINTS OLAP_AVG
A Singh 7 7.00000000000
A Smith 14 10.50000000000
B Osaka 8 8.00000000000
B Ricci 12 10.00000000000
B Baxter 18 15.00000000000
C Chun 13 13.00000000000
D Kwan 9 9.00000000000
D Tran 16 12.50000000000
Example: Range-based window frame
The following query returns players that are partitioned by teams and ordered by age. Within each partition, the number of points for each player and any player who is up to 9 years older is averaged:
SELECT player, age, team, points,
AVG(points) OVER(PARTITION BY team ORDER BY age
RANGE BETWEEN CURRENT ROW AND 9 FOLLOWING) AS olap_avg
FROM points_age;
PLAYER AGE TEAM POINTS OLAP_AVG
Singh 25 A 7 10.50000000000
Smith 26 A 14 14.00000000000
Baxter 27 B 18 13.00000000000
Osaka 35 B 8 10.00000000000
Ricci 40 B 12 12.00000000000
Chun 21 C 13 13.00000000000
Kwan 22 D 9 12.50000000000
Tran 31 D 16 16.00000000000
In partition A, the average for Singh includes the points for Smith, because Smith is one year older than Singh. The average for Smith does not include the points from Singh, because Singh is younger than Smith.
In partition B, the average for Baxter includes the points for Osaka, who is 8 years older than Baxter, but not for Ricci, who is 13 years older than Baxter.
In partition D, the average for Kwan includes the points for Tran, because Tran is 9 years older than Kwan.
Example: Window frame without the current row
The following query calculates the average number of points for the preceding two rows in the partition:
SELECT player, age, team, points,
AVG(points) OVER(PARTITION BY team ORDER BY age
ROWS BETWEEN 2 PRECEDING AND 1 PRECEDING) AS olap_avg
FROM points_age;
PLAYER AGE TEAM POINTS OLAP_AVG
Singh 25 A 7 NULL
Smith 26 A 14 7.00000000000
Baxter 27 B 18 NULL
Osaka 35 B 8 18.00000000000
Ricci 40 B 12 13.00000000000
Chun 21 C 13 NULL
Kwan 22 D 9 NULL
Tran 31 D 16 9.00000000000
In partition B, the average for Ricci is based on the points totals for Baxter and Osaka: (18 + 8 = 26)/2 = 13. When the current row has no preceding rows to use for the calculation, the result is NULL.