How to Use Inheritance in Entity Framework


In Entity Framework, inheritance allows modeling relationships between entities so that a base class can be extended by one or more derived classes.

Inheritance is a fundamental concept in object-oriented programming (OOP) that allows a class to derive from another, inheriting its properties and behaviors.

However, since relational databases do not directly support inheritance, Entity Framework uses mapping strategies to represent these relationships in tables.

In this article, we will look at these strategies, their advantages, disadvantages, and when it is appropriate to use each one 👇.

Inheritance Strategies

The three main strategies are:

  • TPH (Table Per Hierarchy): All classes in an inheritance hierarchy are stored in a single table.
  • TPT (Table Per Type): Each class in the inheritance hierarchy is stored in its own table.
  • TPC (Table Per Concrete Class): Each concrete class in the inheritance hierarchy is stored in its own table, but the properties of the base class are duplicated in each table.

Here is a summary of their main characteristics.

Characteristic | TPH | TPT | TPC
Number of Tables | 🟢 1 | 🔴 N (one per type) | 🔴 N (one per concrete class)
Performance | 🟢 High | 🟡 Medium | 🟢 High
Normalization | 🔴 Low | 🟢 High | 🟡 Medium
Space Usage | 🔴 Inefficient | 🟢 Efficient | 🟡 Redundant
Schema Complexity | 🟢 Simple | 🔴 Complex | 🔴 Complex

Classes for the Examples

For the examples, let’s assume we have a class hierarchy where Animal is the base class and Dog and Cat are derived classes.

public class Animal
{
    public int Id { get; set; }
    public string Name { get; set; }
}

public class Dog : Animal
{
    public string Breed { get; set; }
}

public class Cat : Animal
{
    public bool HasClaws { get; set; }
}

Forgive the lack of originality in using animals as an inheritance example 😉

Table Per Hierarchy (TPH)

TPH (Table Per Hierarchy) is a strategy where all classes in an inheritance hierarchy are stored in a single table.

This table includes a discriminator column that indicates the type of entity each row represents.

In our example, in TPH, all these entities would be stored in a single table called Animals with a discriminator column:

CREATE TABLE Animals (
    Id INT PRIMARY KEY,
    Name NVARCHAR(100),
    Breed NVARCHAR(100),
    HasClaws BIT,
    Discriminator NVARCHAR(50)
);
  • A single table, with all columns
  • The Discriminator column will indicate whether the row corresponds to a Dog or a Cat.
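
On the C# side, this mapping could be configured roughly as follows. This is a minimal sketch that assumes EF Core (the article does not say which version of Entity Framework is used); the ZooContext name is hypothetical, and the database provider setup is left out. TPH is the default strategy in EF Core, so the explicit discriminator configuration is only there to make the column name and values visible.

```csharp
using Microsoft.EntityFrameworkCore;

// Same entities as in the examples section, written compactly.
public class Animal { public int Id { get; set; } public string Name { get; set; } }
public class Dog : Animal { public string Breed { get; set; } }
public class Cat : Animal { public bool HasClaws { get; set; } }

// ZooContext is a hypothetical name used only for illustration.
public class ZooContext : DbContext
{
    public ZooContext(DbContextOptions<ZooContext> options) : base(options) { }

    public DbSet<Animal> Animals { get; set; }
    public DbSet<Dog> Dogs { get; set; }
    public DbSet<Cat> Cats { get; set; }

    protected override void OnModelCreating(ModelBuilder modelBuilder)
    {
        // One table for the whole hierarchy; the Discriminator column
        // stores which type each row represents.
        modelBuilder.Entity<Animal>()
            .HasDiscriminator<string>("Discriminator")
            .HasValue<Animal>("Animal")
            .HasValue<Dog>("Dog")
            .HasValue<Cat>("Cat");
    }
}
```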

Table Per Type (TPT)

TPT (Table Per Type) is a strategy where each class in the inheritance hierarchy is stored in its own table. The tables are related via foreign keys.

Using the same example of Animal, Dog, and Cat, in TPT we would have three tables:

CREATE TABLE Animals (
    Id INT PRIMARY KEY,
    Name NVARCHAR(100)
);

CREATE TABLE Dogs (
    Id INT PRIMARY KEY,
    Breed NVARCHAR(100),
    FOREIGN KEY (Id) REFERENCES Animals(Id)
);

CREATE TABLE Cats (
    Id INT PRIMARY KEY,
    HasClaws BIT,
    FOREIGN KEY (Id) REFERENCES Animals(Id)
);
  • One table for each class (including the base class)
  • Each with its properties (common properties in the base class)
  • Relationships between them
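
A possible way to request this layout from C# is to map each type to its own table. This is a sketch that assumes EF Core 5 or later with a relational provider (ToTable comes from the relational package); the ZooTptContext name is hypothetical, and the provider setup is omitted.

```csharp
using Microsoft.EntityFrameworkCore;

// Same entities as in the examples section, written compactly.
public class Animal { public int Id { get; set; } public string Name { get; set; } }
public class Dog : Animal { public string Breed { get; set; } }
public class Cat : Animal { public bool HasClaws { get; set; } }

// ZooTptContext is a hypothetical name used only for illustration.
public class ZooTptContext : DbContext
{
    public ZooTptContext(DbContextOptions<ZooTptContext> options) : base(options) { }

    public DbSet<Animal> Animals { get; set; }

    protected override void OnModelCreating(ModelBuilder modelBuilder)
    {
        // Giving every type in the hierarchy its own table selects TPT:
        // Dogs and Cats hold only their specific columns and share the
        // primary key with the matching row in Animals.
        modelBuilder.Entity<Animal>().ToTable("Animals");
        modelBuilder.Entity<Dog>().ToTable("Dogs");
        modelBuilder.Entity<Cat>().ToTable("Cats");
    }
}
```

In EF Core 7 and later, calling UseTptMappingStrategy() on the base entity is an alternative that states the intent more directly.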

Table Per Concrete Class (TPC)

TPC (Table Per Concrete Class) is a strategy where each concrete class in the inheritance hierarchy is stored in its own table, but the properties of the base class are duplicated in each table.

Using the same example of Animal, Dog, and Cat, in TPC we would have two tables:

CREATE TABLE Dogs (
    Id INT PRIMARY KEY,
    Name NVARCHAR(100),
    Breed NVARCHAR(100)
);

CREATE TABLE Cats (
    Id INT PRIMARY KEY,
    Name NVARCHAR(100),
    HasClaws BIT
);
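  • Only concrete classes have a table (Animal gets none here, which assumes it is never stored on its own, for example because it is abstract)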
  • The derived tables have the common columns duplicated
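
In C#, this could look like the sketch below. It assumes EF Core 7 or later with a relational provider, since that is where UseTpcMappingStrategy is available. Animal is declared abstract here (a change from the classes shown earlier) so that it gets no table of its own; the ZooTpcContext name is hypothetical, and the provider setup is omitted.

```csharp
using Microsoft.EntityFrameworkCore;

// Assumption: Animal is abstract, so only Dog and Cat are mapped to tables.
public abstract class Animal { public int Id { get; set; } public string Name { get; set; } }
public class Dog : Animal { public string Breed { get; set; } }
public class Cat : Animal { public bool HasClaws { get; set; } }

// ZooTpcContext is a hypothetical name used only for illustration.
public class ZooTpcContext : DbContext
{
    public ZooTpcContext(DbContextOptions<ZooTpcContext> options) : base(options) { }

    public DbSet<Animal> Animals { get; set; }

    protected override void OnModelCreating(ModelBuilder modelBuilder)
    {
        // Each concrete type gets a complete table of its own,
        // including the columns inherited from Animal (Id, Name).
        modelBuilder.Entity<Animal>().UseTpcMappingStrategy();
        modelBuilder.Entity<Dog>().ToTable("Dogs");
        modelBuilder.Entity<Cat>().ToTable("Cats");
    }
}
```

Because Dogs and Cats are separate tables, key values need to stay unique across both of them so that an Animal can still be identified by its Id.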

When to Use TPH, TPT, or TPC?

TPH is ideal when the inheritance hierarchy is simple and there aren’t many properties specific to each type. It’s the most common option due to its simplicity and performance.

Disadvantages of TPH

  • Wasted space: Columns that are not applicable to all entities may contain null values, which can lead to inefficient space usage.

TPT is recommended when a normalized and clear database schema is needed. It is useful in scenarios where each type has a significant number of its own properties.

Disadvantages of TPT

  • Performance: Queries can be slower due to joins between tables.

TPC is suitable when query performance matters and data redundancy is not a concern. It’s a good middle ground between the previous two.

Disadvantages of TPC

  • Redundancy: The base class properties are duplicated in each table.