When working with Entity Framework, one of its strengths is the ability to load related data between entities.
When working with relational databases, it’s common for entities to be related to each other (that’s why they’re called “relational” 🤗).
For example, an Order entity may be related to a Customer entity, and in turn, a Customer can have multiple Orders.
When we perform a query to get an Order, we might also need information about the associated Customer.
Entity Framework allows loading this related data in a simple way using the Include and ThenInclude methods.
Let’s look at each of these methods 👇.
Using Include
The Include method is used to load related data from an entity in a single query. Its syntax is simple:
var orders = context.Orders
.Include(o => o.Customer)
.ToList();
In this example,
- We are getting all
Ordersand, additionally, loading the information of theCustomerassociated with eachOrder. - This translates into a single SQL query that includes a
JOINbetween theOrdersandCustomerstables.
Practical Example
Let’s imagine we have the following entities:
public class Customer
{
public int CustomerId { get; set; }
public string Name { get; set; }
public ICollection<Order> Orders { get; set; }
}
public class Order
{
public int OrderId { get; set; }
public DateTime Date { get; set; }
public int CustomerId { get; set; }
public Customer Customer { get; set; }
}
If we want to get all Orders along with the information of the associated Customer, we can do it as follows:
var orders = context.Orders
.Include(o => o.Customer)
.ToList();
This will generate an SQL query similar to:
SELECT o.*, c.*
FROM Orders o
INNER JOIN Customers c ON o.CustomerId = c.CustomerId;
Using Nested Relationships with ThenInclude
In some cases, the relationships between entities are more complex and can involve multiple levels. To load nested related data, we use the ThenInclude method.
- Include performs
joinrelationships with the first-level table - ThenInclude performs subsequent
joinrelationships with previously loaded tables
Practical Example
For example, an Order may be related to a Customer, and this in turn may have an Address.
Suppose we have an Address entity related to Customer:
public class Address
{
public int AddressId { get; set; }
public string Street { get; set; }
public string City { get; set; }
public int CustomerId { get; set; }
public Customer Customer { get; set; }
}
If we want to get all Orders along with the information of the Customer and their Addresses, we can do it as follows:
var orders = context.Orders
.Include(o => o.Customer)
.ThenInclude(c => c.Addresses)
.ToList();
This will generate an SQL query that includes a JOIN between the Orders, Customers, and Addresses tables.
Performance Considerations
Relationships, and how to work with them, have always been one of the biggest creators of potential performance problems when working with databases.
With Entity Framework, as with any other technology for connecting to databases, we also have to be aware of this.
Include performs an eager load of the related data. That is, it downloads the data and all the related data at once.
This avoids the problem known as N+1 queries, which means that multiple queries are made to the database to get related data (which can significantly affect performance).
But, on the other hand, you can also end up downloading a huge amount of data. Which also results in generating a huge and slow SQL query.
In some cases, it may be more efficient to use lazy loading or explicit loading, depending on the scenario.
As a general rule of thumb download everything you need, and only what you need, in the fewest number of queries possible when working with databases.
