filtrado-grupos-having.sql

Filtering Groups with HAVING in SQL

  • 3 min

In the previous article, we learned to group data with GROUP BY. Now we are able to calculate, for example, the total sales per employee.

But, what if we want to go a step further and ask:

Tell me which employees have sold less than €10,000

No one is going to fire anyone 🙄

Our intuition leads us to use WHERE, which is the filtering tool we know.

-- ⛔ THIS WILL ERROR
SELECT EmpleadoID, SUM(Ventas)
FROM Pedidos
WHERE SUM(Ventas) > 10000 -- ¡Error!  
GROUP BY EmpleadoID;
Copied!

SQL Server will throw an error. Why? Because WHERE is executed before grouping.

Because at that point, the engine has not yet calculated what the SUM(Ventas) is for anyone.

To filter based on the result of an aggregation, we need the HAVING clause.

The HAVING Clause

HAVING works exactly like WHERE, but it is applied after the groups have been created and their aggregates calculated.

The correct syntax for the previous example would be:

SELECT EmpleadoID, SUM(Ventas) as TotalVentas
FROM Pedidos
GROUP BY EmpleadoID
HAVING SUM(Ventas) > 10000;
Copied!

Now it works:

  1. SQL Server groups sales by employee.
  2. Calculates the sum for each one.
  3. Then, discards the groups that do not exceed 10,000.

WHERE vs HAVING: The Big Difference

Confusing these two is one of the most common mistakes. The difference lies in when they are applied during query processing.

The order of execution is:

FROM: Loads the table.

WHERE: Filters individual rows (before grouping).

GROUP BY: Creates groups with the rows that passed the filter.

HAVING: Filters the complete groups (based on aggregates).

SELECT: Displays the columns.

  • Are you filtering by an original column from the table? Use WHERE.
  • Are you filtering by a calculation (COUNT, SUM, AVG)? Use HAVING.

Combining WHERE and HAVING

WHERE and HAVING are not enemies. They can (and often are) be used together. To master this, let’s look at an example where we use both.

For example, we want to know:

Which Product Categories have generated more than €5,000 in sales

but only considering sales from the year 2023.

Translated into “SQL-ese,” this would be something like:

SELECT Categoria, SUM(Importe) as TotalVentas
FROM Ventas
WHERE Anio = 2023              -- 1. Filter rows (only year 2023)
GROUP BY Categoria             -- 2. Group what remains
HAVING SUM(Importe) > 5000;    -- 3. Filter the resulting groups
Copied!
  • The WHERE removes old data (2022, 2021…) before the engine wastes time grouping and summing it.
  • The HAVING only deals with checking the final total.

Technically, you could put row conditions in HAVING (like HAVING Anio = 2023 if you group by Year). But it’s bad practice.

You force SQL Server to group data that you will later discard. Always filter as early as possible: use WHERE for everything you can.

Multiple Conditions in HAVING

Just like in WHERE, we can use logical operators AND and OR in HAVING.

SELECT Departamento, COUNT(*) as NumEmpleados, AVG(Salario) as SalarioMedio
FROM Empleados
GROUP BY Departamento
HAVING COUNT(*) > 10          -- Large departments
   AND AVG(Salario) < 30000;  -- And with low average salary

Copied!