Double bind the foreign key to avoid unnecessary JOIN in TypeORM
Arnaud Denoyelle
Posted on March 23, 2024
ORM tools are cool but they generate SQL "automagically". The generated code tends to be unnecessarily complex and causes a performance overhead.
In a database, performance depends on the time to compute the information + the time to bring this information back to the server.
In this article, I will show how to reduce the computation time by removing some unnecessary JOINs and how to load only the information you need.
The model
In this article, we will use the simplest possible model :
- A table called
company
- A table called
employee
mapped like this :
@Entity({ name: "company" })
export class CompanyEntity extends BaseEntity {
@PrimaryGeneratedColumn({name: "id"})
id: string;
@Column({ name: "creation_date"})
creationDate: Date;
@Column({ name: "name"})
name: string;
}
@Entity({ name: "employee"})
export class EmployeeEntity extends BaseEntity {
@PrimaryGeneratedColumn()
id: string;
@Column({ name: "creation_date" })
creationDate: Date;
@Column({ name: "first_name" })
firstName: string;
@Column({ name: "last_name" })
lastName: string;
@JoinColumn({name: "company_id"})
@ManyToOne(() => CompanyEntity)
company: CompanyEntity;
}
Preparation : Activate logging on the local environment
To spot optimization problems early, you should always log the generated SQL at least in your local environment. You can do so using the logging
option from the TypeORM configuration
createTypeOrmOptions(): TypeOrmModuleOptions {
return {
type: "postgres",
host: "localhost",
port: 5432,
database: "acme",
username: "acme",
password: "acme",
schema: "acme",
logging: true, // <= Here !
entities: [
CompanyEntity,
EmployeeEntity
],
synchronize: false
}
}
In this way, TypeORM will output the generated SQL in logs, which is a huge help for debugging. Here is an example of output :
query: SELECT "EmployeeEntity"."id" AS "EmployeeEntity_id", "EmployeeEntity"."creation_date" AS "EmployeeEntity_creation_date", "EmployeeEntity"."first_name" AS "EmployeeEntity_first_name", "EmployeeEntity"."last_name" AS "EmployeeEntity_last_name", "EmployeeEntity"."company_id" AS "EmployeeEntity_company_id" FROM "acme"."employee" "EmployeeEntity"
In this article, we will rely on that to show the code generated by TypeORM
Double bind the foreign key
Suppose that your application has already loaded the list of companies and you want to get the list of employees for one specific company, knowing it's companyId
The naive implementation would be this :
findByCompanyId(companyId: string) {
return this.employeeRepository.findBy({
company: {
id: companyId
}
})
}
But let's have a look at the generated SQL :
SELECT "EmployeeEntity"."id" AS "EmployeeEntity_id",
"EmployeeEntity"."creation_date" AS "EmployeeEntity_creation_date",
"EmployeeEntity"."first_name" AS "EmployeeEntity_first_name",
"EmployeeEntity"."last_name" AS "EmployeeEntity_last_name",
"EmployeeEntity"."company_id" AS "EmployeeEntity_company_id"
FROM "acme"."employee" "EmployeeEntity"
LEFT JOIN "acme"."company" "EmployeeEntity__EmployeeEntity_company"
ON "EmployeeEntity__EmployeeEntity_company"."id" = "EmployeeEntity"."company_id"
WHERE (((("EmployeeEntity__EmployeeEntity_company"."id" = $1)))
There is a JOIN
between the employee
and company
tables! This join is not necessary as the "company id" information is already present in the employee
table.
And here comes the double binding on the foreign key to solve that :
@Entity({ name: "employee"})
export class EmployeeEntity extends BaseEntity {
@PrimaryGeneratedColumn()
id: string;
// [...]
@JoinColumn({name: "company_id"})
@ManyToOne(() => CompanyEntity)
company: CompanyEntity;
// Here is where the magic happens
@Column({name: "company_id"})
companyId: string;
}
We let the company
relationship but add a companyId
field, which points to the same column.
Now we can rewrite the findByCompany
implementation like this :
findByCompanyId(companyId: string) {
return this.employeeRepository.findBy({
// Use the companyId "column" instead of the relationship
companyId: companyId
})
}
Now, the generated SQL looks like this :
SELECT "EmployeeEntity"."id" AS "EmployeeEntity_id",
"EmployeeEntity"."creation_date" AS "EmployeeEntity_creation_date",
"EmployeeEntity"."first_name" AS "EmployeeEntity_first_name",
"EmployeeEntity"."last_name" AS "EmployeeEntity_last_name",
"EmployeeEntity"."company_id" AS "EmployeeEntity_company_id"
FROM "acme"."employee" "EmployeeEntity"
WHERE (("EmployeeEntity"."company_id" = $1))
No more unnecessary JOIN !
Bonus: "SELECT" only the information you need
In this example, the table employee
contains few columns but, in a real-world example, it could contain dozens of columns that you don't need every time. Especially columns that could contain sensible information
Suppose that I just fetch all employees like this:
this.employeeRepository.find();
Here is the generated sql :
SELECT "EmployeeEntity"."id" AS "EmployeeEntity_id",
"EmployeeEntity"."creation_date" AS "EmployeeEntity_creation_date",
"EmployeeEntity"."first_name" AS "EmployeeEntity_first_name",
"EmployeeEntity"."last_name" AS "EmployeeEntity_last_name",
"EmployeeEntity"."company_id" AS "EmployeeEntity_company_id"
FROM "acme"."employee" "EmployeeEntity"
But we might not need all the columns. TypeORM lets you decide which columns you want to fetch, like this :
this.employeeRepository.find({
select: {
firstName: true,
lastName: true
}
});
Note : we don't call findBy
anymore but find
which allows us to pass other useful parameters such as take
, skip
and relations
(this one is for lazy/eager loading), and of course, the where
clause.
And here is the corresponding SQL :
SELECT "EmployeeEntity"."first_name" AS "EmployeeEntity_first_name",
"EmployeeEntity"."last_name" AS "EmployeeEntity_last_name",
"EmployeeEntity"."id" AS "EmployeeEntity_id"
FROM "acme"."employee" "EmployeeEntity"
Note : the return type will still be a EmployeeEntity[]
but the non-required fields are left undefined
Conclusion
Double binding on the foreign key is so useful that I now use it on nearly every foreign key.
Also, select
can be useful on main tables, especially the user
table, which often contains sensible information that you don't want to expose outside of the database if not necessary.
I hope that those tips will help you reduce the performance overhead on your database!
Posted on March 23, 2024
Join Our Newsletter. No Spam, Only the good stuff.
Sign up to receive the latest update from our blog.