In the previous article, we saw how to number rows using window functions. But the Partitions clause isn’t just for creating rankings.
Its true power lies in combining it with aggregation functions (SUM, AVG, COUNT, MAX, MIN), which allows us to perform complex calculations on datasets without the need to group them explicitly.
You’ve likely encountered this problem: You want to list all products with their price, and next to it, the average price of their category to see if it’s expensive or cheap.
With what we knew until now, you’d have to write a complex subquery or a JOIN with a grouped table. With Window Aggregation Functions, this is very simple.
Aggregation without Grouping
The key conceptual difference is this:
- GROUP BY: Collapses rows. If you have 100 products, you end up with 5 rows of categories. You lose the detail.
- OVER: Maintains the 100 product rows, but adds an extra column with the aggregated calculation repeated in each row.
Totals by group (PARTITION BY)
Imagine we want to see each individual sale, but also want to know how much that employee has sold in total, to calculate what percentage that specific sale represents.
We use
SUM(Column) OVER (PARTITION BY Group ORDER BY Order)
SELECT
Employee,
Date,
Amount,
-- Total sum of ALL sales in the table
SUM(Amount) OVER () as GlobalTotal,
-- Total sum ONLY for this employee
SUM(Amount) OVER (PARTITION BY Employee) as EmployeeTotal,
-- Direct calculation: % this sale represents of the employee's total
Amount / SUM(Amount) OVER (PARTITION BY Employee) * 100 as Percentage
FROM Sales;
Result:
| Employee | Amount | GlobalTotal | EmployeeTotal | Percentage |
|---|---|---|---|---|
| Ana | 100€ | 900€ | 300€ | 33.3% |
| Ana | 200€ | 900€ | 300€ | 66.6% |
| Luis | 600€ | 900€ | 600€ | 100% |
Notice that we didn’t use GROUP BY. We have the detail of each sale (100€, 200€) and next to it the aggregated context (300€).
Running totals
To get a running total, we simply add ORDER BY inside the OVER clause.
When we use ORDER BY in an aggregation function, SQL Server changes the meaning of “the whole group” to “from the beginning of the group to the current row”.
SELECT
Date,
Amount,
SUM(Amount) OVER (ORDER BY Date) as RunningTotal
FROM AnnualSales;
Result:
| Date | Amount | RunningTotal |
|---|---|---|
| 01-Jan | 100€ | 100€ |
| 02-Jan | 50€ | 150€ (100+50) |
| 03-Jan | 200€ | 350€ (150+200) |
Performance Note: By default, when using ORDER BY, SQL uses a window frame called RANGE UNBOUNDED PRECEDING.
This can be slow on huge tables. In these cases, it’s common to specify ROWS UNBOUNDED PRECEDING, which is faster.
Moving averages
A moving average is used to smooth out fluctuating data (like a stock price or daily sales). For example, calculating the average of the last 3 days.
For this, we need to explicitly define the window “frame” using ROWS BETWEEN.
SELECT
Date,
Sales,
AVG(Sales) OVER (
ORDER BY Date
ROWS BETWEEN 2 PRECEDING AND CURRENT ROW
) as MovingAverage3Days
FROM DailySales;
This tells SQL: “To calculate today’s average, take the current row and the 2 preceding ones”.
