In the previous article, we learned how to select columns with SELECT. However, if we run a SELECT without anything else, the database returns all the rows from the table.
If your table has ten records, it’s fine. But if you’re working with Amazon’s sales table, trying to fetch all the rows will block your application and leave the server crying for a while💥.
To avoid this, we need to filter which rows we want to retrieve. This is where the WHERE clause comes in.
The WHERE Clause
The WHERE clause specifies a condition that each row must meet to be included in the result.
It works like a nightclub bouncer: it analyzes each row, checks if it meets the rule, and if so, lets it pass. If not, it discards it.
The basic syntax is:
SELECT Column1, Column2
FROM Table
WHERE Condition;
For example, if we want to get only the customers who live in “Madrid”:
SELECT Name, LastName, City
FROM Customers
WHERE City = 'Madrid';
Filtering by Different Data Types
Depending on the type of information we store, the way we write the condition changes slightly.
This is the simplest. We do not use quotes.
SELECT * FROM Products
WHERE Price = 100;
As we saw in the basic syntax, in SQL, text literals must be enclosed in single quotes '.
SELECT * FROM Employees
WHERE LastName = 'Llamas';
Remember that SQL Server can be Case Sensitive or not, depending on the configured Collation.
In most default installations, WHERE LastName = 'llamas' would work the same, but in strict environments it would not.
Dates also go between single quotes. The safest format to avoid confusion (is 01/02 the 1st of February or the 2nd of January?) is the standard ISO format: 'YYYYMMDD' or 'YYYY-MM-DD'.
SELECT * FROM Orders
WHERE OrderDate = '20230915'; -- September 15, 2023
The Problem with NULLs (IS NULL)
In SQL, NULL means “absence of value” or “unknown”. NULL is not equal to anything, not even to another NULL.
If you try to do this, it will not work:
SELECT * FROM Customers
WHERE Phone = NULL; -- This always returns 0 rows
Since we cannot know if an unknown value is equal to another, the logical comparison fails (returns UNKNOWN).
To filter nulls, we must use the special operators IS NULL and IS NOT NULL.
-- Correct way: Customers without a phone
SELECT * FROM Customers
WHERE Phone IS NULL;
-- Correct way: Customers who DO have a phone
SELECT * FROM Customers
WHERE Phone IS NOT NULL;
Performance: Full Scan vs Index Seek
WHERE is not only useful for the application to receive less data; combined with indexes, it is the main tool for performance optimization.
- Without WHERE: The engine has to read the entire table (Full Table Scan). If the table has 100 million rows, it will read all 100 million.
- With WHERE: If we have an Index on the column we are filtering by (for example, ID), SQL Server can go directly to the row it’s looking for without reading the rest (Index Seek).
It’s the difference between reading an entire book to find a sentence or going directly to the index at the back and jumping to the correct page.
