We have already seen how to group rows with GROUP BY. But those groups are useless if we don’t calculate something with them.
Seriously, think about it for a moment. Really, a group is useless if you can’t do a “Count”, or “Sum” or “Exists”… or whatever.
This is where Aggregate Functions come in. An aggregate function takes a collection of values and performs a calculation to return a single value.
They are the fundamental mathematical operations for any report or data analysis. Let’s look at the five “magnificent” ones you’ll use in 99% of your queries.
Counting with COUNT
Its goal is to count elements. It’s the most basic function, but it has two “flavors” that often cause confusion.
COUNT(*): Counts rows. It doesn’t care what’s inside, it counts how many total records there are. Includes nulls.COUNT(Column): Counts values. Evaluates the column and counts how many data points there are. Ignores NULLs.
-- Imagine an 'Employees' table with 10 rows.
-- 2 employees don't have an email (Email is NULL).
SELECT
COUNT(*) AS TotalRows, -- Returns 10
COUNT(Email) AS TotalEmails -- Returns 8
FROM Employees;
If you just want to know “how many records are there?”, always use COUNT(*).
Sometimes it’s said not to use it because it’s slower. False, it’s highly optimized by the SQL Server engine.
Summing with SUM
As its name suggests, it sums all values in a numeric column.
SELECT SUM(Price) AS TotalRevenue
FROM Sales;
Behavior with NULL: SUM ignores null values. If you sum 10 + NULL + 20, the result is 30 (not NULL, as would happen in a simple arithmetic sum 10 + NULL).
If all values are NULL, it returns NULL.
Averaging with AVG
Calculates the arithmetic mean (sum divided by count).
SELECT AVG(Salary) AS AverageSalary
FROM Employees;
Minimum and Maximum with MIN and MAX
These functions return the lowest and highest value in a set, respectively.
SELECT
MIN(Price) AS CheapestProduct,
MAX(RegistrationDate) AS LastRegisteredUser,
MIN(LastName) AS FirstLastNameAlphabetically
FROM Data;
The great thing is that they don’t just work with numbers.
- Numbers: The lowest/highest mathematical value.
- Dates: The oldest date (
MIN) or the most future date (MAX). - Text: The first (
MIN) or last (MAX) in alphabetical order.
Aggregations with DISTINCT
Sometimes we don’t want to count everything, but count unique things. All aggregate functions accept the DISTINCT modifier inside their parentheses.
The most common use case is with COUNT.
-- How many sales have we made?
SELECT COUNT(SaleID) FROM Sales; -- e.g., 100 sales
-- How many DIFFERENT customers have bought from us?
SELECT COUNT(DISTINCT CustomerID) FROM Sales; -- e.g., 45 customers
Handling Nulls
It’s vital to remember how each one behaves in the face of that great enemy, NULL.
| Function | Behavior with NULL |
|---|---|
COUNT(*) | Counts them (Counts the row) |
COUNT(Col) | Ignores them |
SUM(Col) | Ignores them |
AVG(Col) | Ignores them (They don’t count in the numerator or denominator) |
MIN/MAX | Ignores them |
Combined Example
To finish, let’s look at a typical Dashboard query that uses everything learned:
SELECT
Category,
COUNT(*) AS TotalProducts,
SUM(Stock) AS TotalInventory,
MAX(Price) AS HighestPrice,
AVG(CAST(Price AS DECIMAL(10,2))) AS AveragePrice
FROM Products
GROUP BY Category;
