SQL: Window Functions
SQL: Window Functions
Introduction
Firstly we excute the following SQL statements and create a table Payroll in Postgresql:
1 | CREATE DATABASE WindowFunctions; |
Now we have a table payroll in the database windowfunctions:

Then we can execute the following SQL:
1 | SELECT |
And the output is:

Now we want to “combine” the result of the above SQL and SELECT * FROM Payroll, to get the following table:

The traditional way is to use JOIN:
1 | SELECT |
And a more simple way is to use window functions:
1 | SELECT |
That is why we use window functions, which can simplify our SQL statements.
Basic Syntax of Window Functions
We use an OVER clause to open the windows, and ( ) is the window, i.e., a set of rows. That is, if a function is followed by OVER(), we can treat it as a window function:
1 | window_function() OVER() --> () is the window |
Common Window Functions
Aggregate Functions
Aggregate functions, i.e., COUNT, SUM, MIN, MAX and AVG, can be used as window functions. For example:
1 | SELECT |

One thing worth noting is that windows are affected by the WHERE clause, that is, windows cannot work outside our base data set. For example:
1 | SELECT |

As shown, the total_employee_count is 7 rather than 18, which means that we firstly execute the statement SELECT * FROM Payroll WHERE department_name = 'IT', then open the windows, then ORDER BY employee_id.
ROW_NUMBER() & RANK() & DENSE_RANK()
ROW_NUMBER(), RANK(), and DENSE_RANK() functions assign an integer to each row based on its order in its result set.
ROW_NUMBER()
Here is the the first example of ROW_NUMBER():
1 | SELECT |

We can break the above statement into 3 steps:
SELECT * FROM Payroll- Open the window, i.e.,
ROW_NUMBER() OVER () AS "Base Row No" ORDER BY employee_id

Understanding the above executing procedures, we can see another ROW_NUMBER() example, which following the similar procedures:
1 | SELECT |

RANK() & DENSE_RANK()
For RANK(), duplicated values are assigned the same rank, skipping the next numebr in line:
1 | 1 1 |
While for DENSE_RANK(), duplicated values are assigned the same rank, and no values are skipped:
1 | 1 1 |
For example:
1 | SELECT |

There is one thing that we should notice, if we execute the following statement:
1 | SELECT |

As shown, we must use ORDER BY for ranks, otherwise each row’s rank would be 1.
Statistical Functions
percent_rank(): Relative rank of the current row, i.e.,
(rank-1)/(total rows - 1)cume_dist(): Relative rank of the current row, i.e.,
(rank)/(total rows)ntile(num_buckets integer): Returns integer ranging from 1 to the argument value, dividing the partition as equally as possible
For example:
1 | SELECT *, |

Offset Functions
Offset functions includes FIRST_VALUE(), LAST_VALUE(), LEAD() and LAG().
FIRST_VALUE() & LAST_VALUE()
For example:
1 | SELECT |

LEAD() & LAG()
For example:
1 | SELECT |
