A Subquery is a SELECT statement nested inside another SELECT, INSERT, UPDATE, or DELETE statement.
Sometimes, to answer a complex question, we first need to answer a simpler one. Imagine you want to know:
“Which products are more expensive than the average price?”.
To answer this, you first need to know what the average price is, and then compare each product against that value.
In T-SQL, we can do this in a single step using Subqueries. Basically, it’s putting one SELECT query inside another main query.
The subquery (the inner one) executes first, and its result is used by the outer query to complete its work.
Scalar Subqueries
These are the simplest. The subquery returns a single row and a single column (a single value). We can use it anywhere we would use a number or text (in the WHERE clause, in the SELECT list, etc.).
Let’s use an example: products more expensive than the average.
SELECT Nombre, Precio
FROM Productos
WHERE Precio > (
-- This is the subquery. It returns a single number (e.g., 50.00)
SELECT AVG(Precio)
FROM Productos
);
Here, SQL Server first calculates the average (say 50€) and then executes the outer query as if it were WHERE Precio > 50.
List Subqueries
Sometimes the subquery returns a list of values (one column, multiple rows). In this case, we cannot use operators like =, >, or <. We need to use the IN operator.
For example, customers who have purchased products from the “Electronics” category.
SELECT Nombre, Apellido
FROM Clientes
WHERE ClienteID IN (
-- Subquery: Returns a list of customer IDs
SELECT DISTINCT ClienteID
FROM Pedidos P
INNER JOIN DetallesPedido D ON P.PedidoID = D.PedidoID
INNER JOIN Productos Pr ON D.ProductoID = Pr.ProductoID
WHERE Pr.Categoria = 'Electrónica'
);
The subquery returns a list of IDs, and the outer query filters for customers whose ID is in that list.
Correlated Subqueries
In the previous examples, the subquery was independent: it executed once, returned a result, and that was it.
In a Correlated Subquery, the inner query depends on the outer query. This means the subquery executes once for each row of the main query.
For example, Employees who earn more than the average of their own department.
SELECT Nombre, Salario, DepartamentoID
FROM Empleados E1
WHERE Salario > (
SELECT AVG(Salario)
FROM Empleados E2
WHERE E2.DepartamentoID = E1.DepartamentoID -- This is the key part
);
Notice E2.DepartamentoID = E1.DepartamentoID.
- SQL takes the first employee (Juan, Sales).
- Passes the ‘Sales’ ID to the subquery.
- Calculates the average only for Sales.
- Compares Juan’s salary against that average.
- Repeats for the next employee.
Performance: Correlated subqueries can be slow on very large tables, because they force the engine to execute the subquery thousands of times (N+1 queries). Use them with care.
Subqueries in the FROM Clause
We can also use the result of a subquery as if it were a temporary table to run more queries on. This is called a Derived Table.
It is mandatory to give it an Alias.
SELECT T.Categoria, T.PromedioPrecio
FROM (
-- Virtual table generated on the fly
SELECT Categoria, AVG(Precio) as PromedioPrecio
FROM Productos
GROUP BY Categoria
) AS T -- The alias is mandatory here
WHERE T.PromedioPrecio > 100;
Although this works, in modern SQL we usually prefer to use CTEs (Common Table Expressions) for readability, something we’ll see in its own article, but the concept is the same.
