An index is a data structure that improves the speed of data retrieval operations on a table.
Imagine you have a 1,000-page textbook and I ask you to find the definition of “Polymorphism”. You have two options:
- Full Scan: You start on page 1, read page 2, page 3… until (hopefully) on page 750 you find it.
- Index Seek: You go to the back of the book, look at the alphabetical index, search for “P”, find “Polymorphism: page 750” and go directly there.
The same happens in SQL Server. If you don’t have indexes, the engine performs a Full Table Scan (reads the entire table). If you have indexes, it goes directly to the data. The time difference can be minutes vs milliseconds.
In SQL Server, there are two fundamental types you must differentiate: Clustered and Non-Clustered.
| Characteristic | Clustered | Non-Clustered |
|---|---|---|
| Quantity | Maximum 1 per table | Multiple per table |
| Physical | Physically sorts the data | Separate logical structure |
| Content | Contains all the row data | Contains the key + pointer |
| Speed | Fastest for reading | Fast, but may require a Lookup |
| Ideal Use | Primary Key, range columns (Dates), frequently sorted columns | Foreign keys, columns used in frequent WHERE filters |
Let’s look at them in detail 👇
Clustered Index
The Clustered index is the table itself. That is, it determines the physical order in which data is stored on disk.
Think of an address book. The data itself (first and last names) is sorted alphabetically. There is no separate list; the book itself is the sorted list.
- There can only be ONE Clustered Index per table. (Logically, you cannot physically order the pages of a book by Last Name and by Date of Birth at the same time).
Relationship with the Primary Key
By default, when you create a PRIMARY KEY, SQL Server automatically creates a Clustered index on that column. That’s why searches by ID are so fast.
-- Create a Clustered index explicitly (if no PK existed)
CREATE CLUSTERED INDEX IX_Clientes_Email
ON Clientes(Email);
Choose your Primary Key carefully, it’s usually the column you search by the most (ID, Date, Code).
If you create a Clustered Index on a random column (like Estado), the entire table will be physically reordered on disk.
Non-Clustered Index
The Non-Clustered index is a structure separate from the main table. It contains the values of the indexed column and a pointer to the actual row.
Continuing with the examples, it’s the index at the back of a book. It’s a separate list sorted alphabetically that tells you: “Concept X is on page 5”. You look it up in the index and then have to go to the page (this is called a Lookup).
- You can have multiple Non-Clustered indexes on a table (up to 999, although I don’t recommend it 😊).
They are ideal for columns you frequently use in WHERE or JOIN clauses but which are not the primary key.
-- Create an index to search quickly by Last Name
CREATE NONCLUSTERED INDEX IX_Clientes_Apellido
ON Clientes(Apellido);
Now, if you run SELECT * FROM Clientes WHERE Apellido = 'García', SQL will use this index instead of reading the entire table.
Technically, an index is a data structure (usually a B-Tree or B+ Tree) that stores the values of one or more columns in a sorted order, along with pointers that tell us where to find the rest of the information.
Why not index everything? The Cost
If indexes are so good, why don’t we put one on every column? Because indexes are not free.
- Space: They take up space on disk and in RAM.
- Maintenance (The most important part): Every time you do an
INSERT,UPDATE, orDELETE, SQL Server has to update the table AND ALSO all the indexes that point to that table.
- A table with 20 indexes will be very fast to read (
SELECT), but slow to write.
