In Entity Framework, inheritance allows modeling relationships between entities so that a base class can be extended by one or more derived classes.
Inheritance is a fundamental concept in object-oriented programming (OOP) that allows a class to derive from another, inheriting its properties and behaviors.
However, since relational databases do not directly support inheritance, Entity Framework uses mapping strategies to represent these relationships in tables.
In this article, we will look at these strategies, their advantages, disadvantages, and when it is appropriate to use each one 👇.
Inheritance Strategies
The three main strategies are:
- TPH (Table Per Hierarchy): All classes in an inheritance hierarchy are stored in a single table.
- TPT (Table Per Type): Each class in the inheritance hierarchy is stored in its own table.
- TPC (Table Per Concrete Class): Each concrete class in the inheritance hierarchy is stored in its own table, but the properties of the base class are duplicated in each table.
Here is a summary of their main characteristics.
| Characteristic | TPH | TPT | TPC |
|---|---|---|---|
| Number of Tables | 🟢 1 | 🔴 N (one per type) | 🔴 N (one per concrete class) |
| Performance | 🟢 High | 🟡 Medium | 🟢 High |
| Normalization | 🔴 Low | 🟢 High | 🟡 Medium |
| Space Usage | 🔴 Inefficient | 🟢 Efficient | 🟡 Redundant |
| Schema Complexity | 🟢 Simple | 🔴 Complex | 🔴 Complex |
Classes for the Examples
For the examples, let’s assume we have a class hierarchy where Animal is the base class and Dog and Cat are derived classes.
public class Animal
{
    public int Id { get; set; }
    public string Name { get; set; }
}

public class Dog : Animal
{
    public string Breed { get; set; }
}

public class Cat : Animal
{
    public bool HasClaws { get; set; }
}
Forgive the lack of originality in using animals as an inheritance example 😉
Table Per Hierarchy (TPH)
TPH (Table Per Hierarchy) is a strategy where all classes in an inheritance hierarchy are stored in a single table.
This table includes a discriminator column that indicates the type of entity each row represents.
In our example, TPH stores Animal, Dog, and Cat rows in a single table called Animals, with a discriminator column:
CREATE TABLE Animals (
    Id INT PRIMARY KEY,
    Name NVARCHAR(100),
    Breed NVARCHAR(100),
    HasClaws BIT,
    Discriminator NVARCHAR(50)
);
- A single table, with all columns
- The Discriminator column indicates whether the row corresponds to a Dog or a Cat.
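To make this concrete, here is a minimal sketch of how the TPH mapping could be configured, assuming EF Core with a relational provider. The context name ZooContext is hypothetical, and the explicit discriminator setup is only for illustration: EF Core maps a hierarchy to TPH by default and adds a Discriminator column on its own.

```csharp
using Microsoft.EntityFrameworkCore;

// Same entity classes as in the examples section, compacted so the sketch stands on its own.
public class Animal { public int Id { get; set; } public string Name { get; set; } }
public class Dog : Animal { public string Breed { get; set; } }
public class Cat : Animal { public bool HasClaws { get; set; } }

// Hypothetical context; the database provider is assumed to be configured elsewhere.
public class ZooContext : DbContext
{
    public ZooContext(DbContextOptions<ZooContext> options) : base(options) { }

    public DbSet<Animal> Animals => Set<Animal>();
    public DbSet<Dog> Dogs => Set<Dog>();
    public DbSet<Cat> Cats => Set<Cat>();

    protected override void OnModelCreating(ModelBuilder modelBuilder)
    {
        // One table for the whole hierarchy, with a string column
        // that records which type each row represents.
        modelBuilder.Entity<Animal>()
            .ToTable("Animals")
            .HasDiscriminator<string>("Discriminator")
            .HasValue<Animal>("Animal")
            .HasValue<Dog>("Dog")
            .HasValue<Cat>("Cat");
    }
}
```

Even though Dogs and Cats are exposed as separate sets here, all three types still end up in the single Animals table.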
Table Per Type (TPT)
TPT (Table Per Type) is a strategy where each class in the inheritance hierarchy is stored in its own table. The tables are related via foreign keys.
Using the same example of Animal, Dog, and Cat, in TPT we would have three tables:
CREATE TABLE Animals (
    Id INT PRIMARY KEY,
    Name NVARCHAR(100)
);

CREATE TABLE Dogs (
    Id INT PRIMARY KEY,
    Breed NVARCHAR(100),
    FOREIGN KEY (Id) REFERENCES Animals(Id)
);

CREATE TABLE Cats (
    Id INT PRIMARY KEY,
    HasClaws BIT,
    FOREIGN KEY (Id) REFERENCES Animals(Id)
);
- One table for each class (including the base class)
- Each with its properties (common properties in the base class)
- Each derived table is linked to the base table through its primary key, which is also a foreign key to Animals
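As a rough sketch, TPT could be configured like this, assuming EF Core 5 or later, where mapping each type in the hierarchy to its own table selects this strategy. ZooContext is again a hypothetical name.

```csharp
using Microsoft.EntityFrameworkCore;

// Same entity classes as in the examples section, compacted so the sketch stands on its own.
public class Animal { public int Id { get; set; } public string Name { get; set; } }
public class Dog : Animal { public string Breed { get; set; } }
public class Cat : Animal { public bool HasClaws { get; set; } }

// Hypothetical context; the database provider is assumed to be configured elsewhere.
public class ZooContext : DbContext
{
    public ZooContext(DbContextOptions<ZooContext> options) : base(options) { }

    public DbSet<Animal> Animals => Set<Animal>();

    protected override void OnModelCreating(ModelBuilder modelBuilder)
    {
        // Giving every type in the hierarchy its own table results in TPT:
        // Dogs and Cats hold only their own columns plus the shared key.
        modelBuilder.Entity<Animal>().ToTable("Animals");
        modelBuilder.Entity<Dog>().ToTable("Dogs");
        modelBuilder.Entity<Cat>().ToTable("Cats");

        // On EF Core 7 or later, the intent can also be stated explicitly with
        // modelBuilder.Entity<Animal>().UseTptMappingStrategy();
    }
}
```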
Table Per Concrete Class (TPC)
TPC (Table Per Concrete Class) is a strategy where each concrete class in the inheritance hierarchy is stored in its own table, but the properties of the base class are duplicated in each table.
Using the same example of Animal, Dog, and Cat, and treating Animal as an abstract class that is never stored on its own, in TPC we would have two tables:
CREATE TABLE Dogs (
    Id INT PRIMARY KEY,
    Name NVARCHAR(100),
    Breed NVARCHAR(100)
);

CREATE TABLE Cats (
    Id INT PRIMARY KEY,
    Name NVARCHAR(100),
    HasClaws BIT
);
- Only concrete classes have a table (here Dog and Cat; a non-abstract Animal would need a table of its own)
- Each table repeats the columns inherited from the base class
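A possible TPC configuration is sketched below, assuming EF Core 7 or later, which is where UseTpcMappingStrategy was introduced. Note the assumption that Animal is declared abstract so that it gets no table of its own; ZooContext is a hypothetical name.

```csharp
using Microsoft.EntityFrameworkCore;

// Animal is abstract in this sketch (an assumption, unlike the class shown
// in the examples section), so only Dog and Cat are mapped to tables.
public abstract class Animal { public int Id { get; set; } public string Name { get; set; } }
public class Dog : Animal { public string Breed { get; set; } }
public class Cat : Animal { public bool HasClaws { get; set; } }

// Hypothetical context; the database provider is assumed to be configured elsewhere.
public class ZooContext : DbContext
{
    public ZooContext(DbContextOptions<ZooContext> options) : base(options) { }

    public DbSet<Animal> Animals => Set<Animal>();

    protected override void OnModelCreating(ModelBuilder modelBuilder)
    {
        // Select TPC on the root of the hierarchy...
        modelBuilder.Entity<Animal>().UseTpcMappingStrategy();

        // ...and name the table for each concrete type.
        modelBuilder.Entity<Dog>().ToTable("Dogs");
        modelBuilder.Entity<Cat>().ToTable("Cats");
    }
}
```

Because Dogs and Cats are separate tables, Id values still have to be unique across the whole hierarchy. How keys are generated depends on the provider, so it is worth reviewing the generated migration.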
When to Use TPH, TPT, or TPC?
TPH is ideal when the inheritance hierarchy is simple and there aren’t many properties specific to each type. It’s the most common option due to its simplicity and performance.
Advantages of TPH
- Simplicity: Only one table is needed to store all entities in the hierarchy.
- Performance: Queries are faster since no joins between tables are required.
Disadvantages of TPH
- Wasted space: Columns that apply only to some types must be nullable and stay NULL in rows of every other type, which leads to inefficient space usage.
TPT is recommended when a normalized and clear database schema is needed. It is useful in scenarios where the properties specific to each type are significant.
Advantages of TPT
- Normalization: Each table contains only the properties specific to its type, reducing redundancy.
- Schema clarity: The database schema is clearer and easier to understand.
Disadvantages of TPT
- Performance: Queries can be slower due to joins between tables.
TPC is suitable when optimal performance is needed and data redundancy is not a concern. It’s a good middle ground between the previous two.
Advantages of TPC
- Partial normalization: Each table contains the properties of its own type plus a copy of the base class properties, so no column is left empty for other types.
- Performance: Queries can be faster than in TPT since no joins are required to access base class properties.
Disadvantages of TPC
- Redundancy: The base class properties are duplicated in each table.
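To see where the performance differences come from, here is a small sketch of two typical queries, assuming EF Core and a context that maps the hierarchy with one of the three strategies. The AnimalQueries helper is hypothetical, and the comments describe only the general shape of the SQL we would expect, not the exact statements a given provider emits.

```csharp
using System.Collections.Generic;
using System.Linq;
using Microsoft.EntityFrameworkCore;

// Same entity classes as in the examples section, compacted so the sketch stands on its own.
public class Animal { public int Id { get; set; } public string Name { get; set; } }
public class Dog : Animal { public string Breed { get; set; } }
public class Cat : Animal { public bool HasClaws { get; set; } }

// Hypothetical helper; db is any context whose model includes the hierarchy.
public static class AnimalQueries
{
    // Polymorphic query: returns Dog and Cat instances together.
    //   TPH: reads the single Animals table.
    //   TPT: joins Animals with Dogs and Cats.
    //   TPC: combines the Dogs and Cats tables (a UNION ALL).
    public static List<Animal> AllAnimals(DbContext db) =>
        db.Set<Animal>().ToList();

    // Query for a single derived type.
    //   TPH: filters the Animals table on the discriminator.
    //   TPT: joins Animals with Dogs.
    //   TPC: reads only the Dogs table.
    public static List<Dog> DogsOfBreed(DbContext db, string breed) =>
        db.Set<Animal>()
          .OfType<Dog>()
          .Where(d => d.Breed == breed)
          .ToList();
}
```

The LINQ code is identical in all three cases; only the mapping decides how much work the database has to do, which is why it pays to look at the queries the application runs most often before choosing a strategy.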
