funciones-ventana-row-number-rank

Window Functions ROW_NUMBER, RANK and DENSE_RANK

  • 5 min

Window functions allow you to assign sequential numbers or ranking positions to the rows of a query without collapsing the original results.

Until now, when we wanted to calculate aggregates (like sums or counts), we had to use GROUP BY. But that came at a cost: we lost the detail of the individual rows. Everything was collapsed into a summary.

What if I want to know the “Top 3 sales” but without losing the information of who made the sale and when? Or if I want to number the rows of a query?

This is where window functions come into play. In this article, we will focus on Ranking functions: ROW_NUMBER, RANK and DENSE_RANK.

What is a Window Function?

A window function performs a calculation across a set of rows that are related to the current row.

Unlike GROUP BY, it does not group the rows into a single one. The rows keep their identity, but an extra column is added with the result of the calculation.

The OVER Clause

The OVER clause defines a window of data over which an aggregate function will be applied. This window can be the entire table, a subset of rows, or even rows related to the current row.

The basic syntax is as follows:

AGGREGATE_FUNCTION() OVER ( [PARTITION BY column] [ORDER BY column] )
Copied!
  • AGGREGATE_FUNCTION: Can be any aggregate function, such as SUM, AVG, COUNT, etc.
  • PARTITION BY: Divides the data into groups (partitions) based on one or more columns.
  • ORDER BY: Orders the rows within each partition.

The Three OVER Companions

Imagine we took a class exam and have the following grades. Let’s see how each function scores.

SalespersonAmount
Luis€1000
Ana€800
Pedro€800
Sofía€500

Notice that Ana and Pedro are tied with €800. This is where the differences arise.

As the name suggests, it assigns a sequential and unique row number.

  • It simply counts: 1, 2, 3, 4…
  • It doesn’t care about ties. If two rows have the same value in the ORDER BY, the engine will arbitrarily decide who goes first (unless you add more tie-breaking criteria).

Main use: Pagination and removing duplicates.

Calculates the ranking in “Olympic competition” style.

  • If there is a tie, both receive the same position.
  • Leaves gaps in the subsequent numbering.

In our example: Ana (1), Beto (2), Carla (2)… Dani will be 4th. (It skips 3 because there are two people ahead of him tied).

Calculates the ranking without leaving gaps.

  • If there is a tie, both receive the same position.
  • Does NOT leave gaps. The next position is the immediate number.

In our example: Ana (1), Beto (2), Carla (2)… Dani will be 3rd.

Code Example

Let’s see it with a real query on a Sales table.

SELECT 
    Salesperson,
    Amount,
    ROW_NUMBER() OVER (ORDER BY Amount DESC) as Row,
    RANK()       OVER (ORDER BY Amount DESC) as Ranking,
    DENSE_RANK() OVER (ORDER BY Amount DESC) as DenseRanking
FROM Sales;
Copied!

Result:

SalespersonAmountRow (Row_Number)Ranking (Rank)DenseRanking
Luis€1000111
Ana€800222
Pedro€800322
Sofía€500443
  • Row: Pedro is 3 because yes (or because alphabetically he comes after Ana, but it’s sequential).
  • Rank: Pedro ties with Ana (2). Sofía jumps to 4 (3 is skipped).
  • Dense: Pedro ties with Ana (2). Sofía is 3rd (numbers are not skipped).

The Power of PARTITION BY

The PARTITION BY clause is optional and is used inside OVER to divide the data into smaller groups, called partitions.

Each partition is treated as an independent set of data, and the aggregate function is applied to each of these groups.

This allows us to reset the ranking for each data group.

For example, we want to know who the top 2 salespeople are in each department.

SELECT * FROM (
    SELECT 
        Department,
        Salesperson,
        Sales,
        DENSE_RANK() OVER (
            PARTITION BY Department -- Resets the counter when Department changes
            ORDER BY Sales DESC
        ) as Position
    FROM Employees
) DerivedTable
WHERE Position <= 2;
Copied!
  1. SQL orders the sales within the ‘Sales’ department. Assigns 1, 2, 3…
  2. When it reaches the ‘IT’ department, the counter goes back to 1.
  3. Finally, we filter by Position <= 2 to get the podium for each area.

This technique is much more efficient and readable than using complex correlated subqueries to find “the maximum per group”.