entity-framework-include

Getting Related Data in Entity Framework with Include and ThenInclude

  • 4 min

When working with Entity Framework, one of its strengths is the ability to load related data between entities.

When we work with relational databases, it is common for entities to be related to each other (that’s why they are called “relational” 🤗).

For example, an entity Order may be related to an entity Customer, and in turn, a Customer can have multiple Orders.

When we perform a query to obtain an Order, we may also need information about the associated Customer.

Entity Framework allows loading these related data easily using the Include and ThenInclude methods.

Let’s see each of these methods 👇.

Using Include

The Include method is used to load related data of an entity in a single query. Its syntax is straightforward:

var orders = context.Orders
    .Include(o => o.Customer)
    .ToList();

In this example,

  • We are retrieving all Orders and, additionally, loading the information of the associated Customer for each Order.
  • This translates to a single SQL query that includes a JOIN between the Orders and Customers tables.

Practical example

Let’s imagine we have the following entities:

public class Customer
{
    public int CustomerId { get; set; }
    public string Name { get; set; }
    public ICollection<Order> Orders { get; set; }
}

public class Order
{
    public int OrderId { get; set; }
    public DateTime Date { get; set; }
    public int CustomerId { get; set; }
    public Customer Customer { get; set; }
}

If we want to retrieve all Orders along with the information of the associated Customer, we can do it as follows:

var orders = context.Orders
    .Include(o => o.Customer)
    .ToList();

This will generate a SQL query similar to:

SELECT o.*, c.*
FROM Orders o
INNER JOIN Customers c ON o.CustomerId = c.CustomerId;

Loading multiple relationships

In some cases, you may need to load multiple relationships in a single query. For example, if an Order has a Customer and a Product, you can load both relationships as follows:

var orders = context.Orders
    .Include(o => o.Customer)
    .Include(o => o.Product)
    .ToList();

Conditional loading with relationships

Of course, we can perform conditional loads by applying Where after Include, applying a condition to the relational table. That is, you can do,

var orders = context.Orders
    .Include(o => o.Customer)
    .Where(o => o.Date.Year == 2023)
    .ToList();

Using nested relationships with ThenInclude

In some cases, the relationships between entities are more complex and can involve multiple levels. For loading nested related data, we use the ThenInclude method.

  • Include performs join relationships with the first-level table
  • ThenInclude performs subsequent join relationships with the previously loaded tables

Practical example

For example, an Order may be related to a Customer, and this Customer may have an Address.

Suppose we have an Address entity related to Customer:

public class Address
{
    public int AddressId { get; set; }
    public string Street { get; set; }
    public string City { get; set; }
    public int CustomerId { get; set; }
    public Customer Customer { get; set; }
}

If we want to retrieve all Orders along with the information of the Customer and their Addresses, we can do it as follows:

var orders = context.Orders
    .Include(o => o.Customer)
        .ThenInclude(c => c.Addresses)
    .ToList();

This will generate a SQL query that includes a JOIN between the Orders, Customers, and Addresses tables.

Performance considerations

Relationships, and how to work with them, have always been one of the biggest creators of potential performance issues when working with databases.

With Entity Framework, just like with any other technology for connecting to a database, we also need to be mindful of this.

Include performs an eager load of related data. That is, it retrieves the data, and all related data at once.

This avoids the problem known as N+1 queries, which means multiple queries are made to the database to obtain related data (which can significantly impact performance).

However, on the other hand, you can also end up downloading a lot of data. This also results in generating a large and slow SQL query.

In some cases, it may be more efficient to use lazy loading or explicit loading, depending on the scenario.

In general, as a rule of thumb, load everything you need, and only what you need, in the least number of queries possible when working with databases.