
How to Use Inheritance in Entity Framework


In Entity Framework, inheritance allows modeling relationships between entities so that a base class can be extended by one or more derived classes.

Inheritance is a fundamental concept in object-oriented programming (OOP) that allows a derived class to inherit properties and behaviors from another class.

However, since relational databases do not directly support inheritance, Entity Framework uses mapping strategies to represent these relationships in tables.

In this article, we will look at these strategies, their advantages, disadvantages, and when it is appropriate to use each one 👇.

Inheritance Strategies

The three main strategies are:

  • TPH (Table Per Hierarchy): All classes in an inheritance hierarchy are stored in a single table.
  • TPT (Table Per Type): Each class in the inheritance hierarchy is stored in its own table.
  • TPC (Table Per Concrete Class): Each concrete class in the inheritance hierarchy is stored in its own table, but the properties of the base class are duplicated in each table.

Here is a summary of their main characteristics.

| Characteristic | TPH | TPT | TPC |
|---|---|---|---|
| Number of tables | 🟢 1 | 🔴 N (one per type) | 🔴 N (one per concrete class) |
| Performance | 🟢 High | 🟡 Medium | 🟢 High |
| Normalization | 🔴 Low | 🟢 High | 🟡 Medium |
| Space usage | 🔴 Inefficient | 🟢 Efficient | 🟡 Redundant |
| Schema complexity | 🟢 Simple | 🔴 Complex | 🔴 Complex |

Classes for the Example

For the examples, let’s assume we have a class hierarchy where Animal is the base class and Dog and Cat are derived classes.

public class Animal
{
    public int Id { get; set; }
    public string Name { get; set; }
}

public class Dog : Animal
{
    public string Breed { get; set; }
}

public class Cat : Animal
{
    public bool HasClaws { get; set; }
}

You will forgive the lack of originality in using animals as an example of inheritance 😉

Table Per Hierarchy (TPH)

TPH (Table Per Hierarchy) is a strategy where all classes in an inheritance hierarchy are stored in a single table.

This table includes a discriminator column that indicates the type of entity each row represents.

In our example, TPH stores all these entities in a single table called Animals, with a discriminator column:

CREATE TABLE Animals (
    Id INT PRIMARY KEY,
    Name NVARCHAR(100),
    Breed NVARCHAR(100),
    HasClaws BIT,
    Discriminator NVARCHAR(50)
);
  • A single table with the columns of every class in the hierarchy (Breed and HasClaws stay NULL for rows of the other type).
  • The Discriminator column indicates which type each row represents (for example, Dog or Cat).
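To connect the table above with the model, here is a minimal sketch of how TPH might be configured. It assumes EF Core 7 or later and reuses the Animal, Dog, and Cat classes from the example. The ZooContext name is hypothetical, and the database provider configuration is omitted.

```csharp
using Microsoft.EntityFrameworkCore;

// Hypothetical context name; Animal, Dog and Cat are the example classes.
public class ZooContext : DbContext
{
    public DbSet<Animal> Animals => Set<Animal>();

    protected override void OnModelCreating(ModelBuilder modelBuilder)
    {
        modelBuilder.Entity<Animal>()
            // TPH is the default strategy, so this call only makes the choice explicit.
            .UseTphMappingStrategy()
            // Name and type of the discriminator column.
            .HasDiscriminator<string>("Discriminator")
            // Value written to the column for each type in the hierarchy.
            .HasValue<Animal>("Animal")
            .HasValue<Dog>("Dog")
            .HasValue<Cat>("Cat");
    }
}
```

Since TPH is the default, the hierarchy is mapped this way even without any configuration. The explicit calls are mainly useful for controlling the discriminator column name and its values.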

Table Per Type (TPT)

TPT (Table Per Type) is a strategy where each class in the inheritance hierarchy is stored in its own table. The tables are related by foreign keys.

Using the same example of Animal, Dog, and Cat, in TPT we would have three tables:

CREATE TABLE Animals (
    Id INT PRIMARY KEY,
    Name NVARCHAR(100)
);

CREATE TABLE Dogs (
    Id INT PRIMARY KEY,
    Breed NVARCHAR(100),
    FOREIGN KEY (Id) REFERENCES Animals(Id)
);

CREATE TABLE Cats (
    Id INT PRIMARY KEY,
    HasClaws BIT,
    FOREIGN KEY (Id) REFERENCES Animals(Id)
);
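  • One table for each class, including the base class.
  • Each table holds only the properties declared by its own class; the common properties live in the base table.
  • The primary key of each derived table is also a foreign key to Animals, so a Dog is one row in Animals plus one row in Dogs.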
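The three-table layout above could be requested with a configuration along these lines. This is a sketch assuming EF Core 7 or later and the example classes; the ZooContext name is hypothetical and provider configuration is omitted.

```csharp
using Microsoft.EntityFrameworkCore;

// Hypothetical context name; Animal, Dog and Cat are the example classes.
public class ZooContext : DbContext
{
    public DbSet<Animal> Animals => Set<Animal>();

    protected override void OnModelCreating(ModelBuilder modelBuilder)
    {
        // Ask for one table per type in the hierarchy.
        modelBuilder.Entity<Animal>().UseTptMappingStrategy();

        // Give the derived tables the names used in the SQL above.
        modelBuilder.Entity<Dog>().ToTable("Dogs");
        modelBuilder.Entity<Cat>().ToTable("Cats");
    }
}
```

With this mapping, loading a Dog requires joining Animals and Dogs on Id, which is where the extra query cost of TPT comes from.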

Table Per Concrete Class (TPC)

TPC (Table Per Concrete Class) is a strategy where each concrete class in the inheritance hierarchy is stored in its own table, but the properties of the base class are duplicated in each table.

Using the same example of Animal, Dog, and Cat, and assuming Animal is abstract (never instantiated on its own), in TPC we would have two tables:

CREATE TABLE Dogs (
    Id INT PRIMARY KEY,
    Name NVARCHAR(100),
    Breed NVARCHAR(100)
);

CREATE TABLE Cats (
    Id INT PRIMARY KEY,
    Name NVARCHAR(100),
    HasClaws BIT
);
  • Only the concrete classes have tables; there is no Animals table.
  • Each table repeats the columns inherited from the base class (here, Id and Name).
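As an illustration, the TPC layout could be configured roughly like this. The sketch assumes EF Core 7 or later (the version that introduced TPC) and the example classes; the ZooContext name is hypothetical and provider configuration is omitted.

```csharp
using Microsoft.EntityFrameworkCore;

// Hypothetical context name; Animal, Dog and Cat are the example classes.
public class ZooContext : DbContext
{
    public DbSet<Animal> Animals => Set<Animal>();

    protected override void OnModelCreating(ModelBuilder modelBuilder)
    {
        // Ask for one table per concrete class.
        modelBuilder.Entity<Animal>().UseTpcMappingStrategy();

        // Each concrete class gets its own table, including the inherited columns.
        modelBuilder.Entity<Dog>().ToTable("Dogs");
        modelBuilder.Entity<Cat>().ToTable("Cats");
    }
}
```

Note that Animal, as written in the example, is a concrete class, so EF Core would also map it to a table of its own. Declaring it as public abstract class Animal gives the two-table layout shown above. Because Dogs and Cats are separate tables, Id values also need to be unique across both of them.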

When to Use TPH, TPT, or TPC?

TPH is ideal when the inheritance hierarchy is simple and there are not many properties specific to each type. It is the most common option due to its simplicity and performance.

Disadvantages of TPH

  • Wasted space: Columns that are not applicable to all entities may contain null values, leading to inefficient space usage.

TPT is recommended when a normalized database schema is needed. It is useful in scenarios where each type has a significant number of its own properties.

Disadvantages of TPT

  • Performance: Queries may be slower due to joins between tables.

TPC is suitable when optimal performance is needed and data redundancy is not a concern. It is a good middle ground between TPH and TPT.

Disadvantages of TPC

  • Redundancy: Properties of the base class are duplicated in each table.
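Whichever strategy is chosen, the query code is written the same way; only the SQL that EF Core generates changes. The sketch below assumes EF Core and the example classes. AnimalQueries is a hypothetical helper, not part of any library.

```csharp
using System.Collections.Generic;
using System.Linq;
using Microsoft.EntityFrameworkCore;

// Hypothetical helper; Animal and Dog are the example classes.
public static class AnimalQueries
{
    // Returns every animal, materialized as Dog or Cat instances where applicable.
    public static List<Animal> AllAnimals(DbContext context) =>
        context.Set<Animal>().ToList();

    // Returns only the dogs of a given breed.
    public static List<Dog> DogsOfBreed(DbContext context, string breed) =>
        context.Set<Animal>()
            .OfType<Dog>()
            .Where(d => d.Breed == breed)
            .ToList();
}
```

Under TPH the second query filters on the discriminator column, under TPT it joins Animals with Dogs, and under TPC it reads the Dogs table directly. This is the practical side of the performance differences described above.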