SQL Reference

Window Aggregate Functions

Overview

Window functions operate on a set of rows and return a single value for each row from the underlying query. Like an aggregate function, a window function operates on a set of rows, but it does not reduce the number of rows returned by the query. Unlike regular aggregate functions, use of a window function does not cause rows to become grouped into a single output row – the rows retain their separate identities and the window function is able to access more than just the current row in the query result. The term window describes the set of rows on which the window function operates. A window function returns a value from the rows in a window.

The OVER() and PARTITION BY clauses are used to define the scope of the window.

Syntax

This is the general syntax of the window function call is:

SQL


Arguments

<PARTITION BY value_expression> Defines a window or user-specified set of rows within a result set. It does not reduce the number of rows in the result set like GROUP BY can. The value_expression specifies the column by which the rows are partitioned.

<ORDER BY order_by_expression [ASC | DESC] [NULLS FIRST | NULLS LAST], …> Defines the logical order of the rows within each partition of the result set. The order_by_expression specifies a column or expression on which to sort.

ASC | DESC Specifies that the values in the specified column should be sorted in ascending or descending order. ASC is the default sort order.

NULLS FIRST | NULLS LAST Specifies if NULLs should be sorted first or last. The default is last.

<ROWS or RANGE clause> Further limits the rows within the partition by specifying start and end points within the partition. This is done by specifying a range of rows with respect to the current row either by logical association or physical association. Physical association is achieved by using the ROWS clause.

The ROWS clause limits the rows within a partition by specifying a fixed number of rows preceding or following the current row. Alternatively, the RANGE clause logically limits the rows within a partition by specifying a range of values with respect to the value in the current row. Preceding and following rows are defined based on the ordering in the ORDER BY clause. The window frame “RANGE … CURRENT ROW …” includes all rows that have the same values in the ORDER BY expression as the current row. For example, ROWS BETWEEN 2 PRECEDING AND CURRENT ROW means that the window of rows that the function operates on is three rows in size, starting with 2 rows preceding until and including the current row.

SQL


ROW/RANGE Arguments

UNBOUNDED PRECEDING Specifies that the window starts at the first row of the partition. UNBOUNDED PRECEDING can only be specified as window starting point.

row_integer PRECEDING Specified with <unsigned integer literal> to indicate the number of rows or values to precede the current row. This specification is not allowed for RANGE.

CURRENT ROW Specifies that the window starts or ends at the current row when used with ROWS or the current value when used with RANGE. CURRENT ROW can be specified as both a starting and ending point.

BETWEEN <window_frame_bound> AND <window_frame_bound> Used with either ROWS or RANGE to specify the lower (starting) and upper (ending) boundary points of the window. The first <window_frame_bound> defines the boundary starting point and the subsequent <window_frame_bound> defines the boundary end point. The upper bound cannot be smaller than the lower bound.

UNBOUNDED FOLLOWING Specifies that the window ends at the last row of the partition. UNBOUNDED FOLLOWING can only be specified as a window end point. For example RANGE BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING defines a window that starts with the current row and ends with the last row of the partition.

row_integer FOLLOWING Specified with <unsigned value specification> to indicate the number of rows or values to follow the current row. When <unsigned value specification> FOLLOWING is specified as the window starting point, the ending point must be <unsigned value specification> FOLLOWING. For example, ROWS BETWEEN 2 FOLLOWING AND 10 FOLLOWING defines a window that starts with the second row that follows the current row and ends with the tenth row that follows the current row. This specification is not allowed for RANGE.

ROWS vs RANGE The difference is that RANGE always includes all rows with equal values on the ORDER BY keys even if they are outside the range, whereas ROWS does not.

Supported Window Aggregate Functions

These are the supported window functions in addition to those listed under V4 Aggregate Functions which can also be used as window functions. The DISTINCT keyword cannot be used with window aggregation functions.

CUME_DIST 

Returns a number 0 < n ⇐1 and can be used to calculate the percentage of values less than or equal to the current value in the group 

Syntax

SQL


DELTA

Computes the finite difference between successive values of expression under the specified ordering. This is a backwards difference, which means that, at degree one, the value for a given row is the difference between the value of expression for that row and the previous row. 

The expression values have the following requirements:

  • Expressions must be numeric types. 
  • The optional degree argument can be any floating point number with a default value of one. If you set the degree argument to a positive integer that is greater than one, the window function calculates the higher order finite difference without using nested DELTA calls. If the degree argument is a negative integer, the calculation is an anti-difference (i.e. discrete integration). You can specify fractional values, either positive or negative, and these values mean fractional differences or sums. 

The function requires two rows of input data to calculate a first degree difference, three rows of input data to calculate a second degree difference, and so on. However, rather than returning NULL for the first few rows, the function returns the correct values such that using DELTA of degree n followed by DELTA of degree -n cancel each other out to return the initial result (other than floating point error accumulation). In general, when you use this window function over larger result sets with larger degrees (the absolute value of the degree matters), the floating point error accumulation worsens. Integer degree values in the range [-3, 3] are generally stable up to approximately 5 million rows. Fractional degree values in the range [-3, 3] are generally stable up to approximately 1 million rows. The stable degree range is wider over smaller result sets and narrower over larger result sets. For example, fractional and integer degrees are stable over the range [-4, 4] up to approximately 80,000 rows.

The function treats any NULL values in the input data as zero, which might cause unexpected results. Filter out NULL values before you use the window function.

Syntax

SQL


Example:

SQL


DENSE_RANK

Assigns a number to each row in the result set with equal values having the same number. There will be no gaps between ranks. 

Syntax

SQL


DERIVATIVE

Computes the difference quotient between successive values of expression with respect to expression2. If expression is a sampled time-series variable, then this calculation is equivalent to the derivative of expression with respect to expression2 (i.e. the DELTA of expression divided by the DELTA of expression2). This is a backwards difference, which means that, at degree one, the calculation uses the value for the current and previous row. Both expression arguments must be numeric types. 

For higher order calls, the sample points for expression must be equally spaced or the results are incorrect. In other words, the DELTA calls between consecutive expression2 values must always be the same. If this is not true, use some form of interpolation to create an evenly spaced series of samples before you use the window function. 

The optional degree argument can be any floating point number with a default value of one. If you set the degree argument to a positive integer that is greater than one, the window function calculates the higher order difference quotients (discrete derivatives) without using nested DERIVATIVE calls. If the degree argument is a negative integer, the calculation is a discrete anti-differentiation or integration. You can specify fractional values, either positive or negative, and these values mean fractional differentiation or integration. 

The function requires two rows of input data to calculate a first degree derivative, three rows of input data to calculate a second degree derivative, and so on. However, rather than returning NULL for the first few rows, the function returns the correct values such that using DERIVATIVE of degree n followed by DERIVATIVE of degree -n cancel each other out to return the initial result (other than floating point error accumulation). In general, when you use this window function over larger result sets with larger degrees (the absolute value of the degree matters), the floating point error accumulation worsens. Integer degree values in the range [-3, 3] are generally stable up to approximately 5 million rows. Fractional degree values in the range [-3, 3] are generally stable up to approximately 1 million rows. The stable degree range is wider over smaller result sets and narrower over larger result sets. For example, fractional and integer degrees are stable over the range [-4, 4] up to approximately 80,000. 

The function treats any NULL values in the input data for either expression or expression2 arguments as zero, which might cause unexpected results. Filter out NULL values before you use the window function. 

 Syntax

SQL


Example:

SQL


FIRST_VALUE

Returns the first value in the ordered result set. 

Syntax

SQL


LAG

Returns the row which is the specified number backward from the current row. Default is 1 if offset is omitted. 

Syntax

SQL


LAST_VALUE

Returns the last value in the ordered result set. 

Syntax

SQL


LEAD

Returns the row which is the specified number forward from the current row. Default is 1 if offset is omitted. 

Syntax

SQL


NTH_VALUE

Returns the nth value in the ordered result set. 

Syntax

SQL


PERCENTILE

Returns the value that corresponds to the specified percentile (0 ≤ n ≤ 1) within the group. The set of values is treated as a continuous distribution, so it is possible that the computed value cannot appear in the result set. 

Syntax

SQL


PERCENT_RANK

The value returned is 0 < n ≤ 1 and can be used to calculate the percentage of values less than the current group, excluding the highest value. The highest value in a group will always be 1. 

Syntax

SQL


RANK

Assigns a number to each row in the result set with equal values having the same number. There can be gaps between ranks. Similar to the ranking used in sporting events. 

Syntax

SQL


RATIO_TO_REPORT

Computes the ratio of a value to the sum of the set of values. For example, RATIO_TO_REPORT can calculate the ratio of an employee’s salary relative to the total salary in a department.  

Syntax

SQL


ROW_NUMBER

Assigns a unique number to each row in the result set. 

Syntax

SQL


ZSCORE

Zscore of the sample based on the stddev() function 

Syntax

SQL


ZSCOREP

Zscore of the the sample based on the stddevp() function 

Syntax

SQL