A journey to simplify debugging: Automate generating human-friendly database constraints using TypeORM
Emtiaj Hasan
Posted on August 6, 2023
One fine morning, I watched the observability service DataDog and noticed the database-related errors below.
insert or update on table "User" violates foreign key constraint "FK_eebe9258d224f7861dd124c1814"
duplicate key value violates unique constraint "UQ_b0f76d1d1fc4ef7f1e3b627574d"
duplicate key value violates unique constraint "PK_e03f3cb1b058a7463f8278a4f97"
Let’s try to analyze errors so that we can take action accordingly.
The first error message hints that we were trying to insert a user with an invalid value to a particular column, referencing another table. But we need to find out which column caused the error.
The Prefix UQ
suggests that a row in the table already contains the same value we were trying to insert. Does it show for which column? Unfortunately, no.
The third message is the same as the unique constraint. It happened for the primary key. But it does not give a clue about the name of the primary column.
To get an elaborate idea, I turned to the database administrator tool, checked the entity-relationship diagram, and reviewed all the database constraints to decipher these error messages.
This process left me exasperated and seemed like a waste of time. But then a thought struck me – what if database constraints were more human-readable and provided meaningful information for easier debugging?
I started thinking about applying a convention on the DB keys.
Readable Naming Convention
I use PostgreSQL backed by TypeORM. The keys mentioned here were (auto)generated by the TypeORM. Although TypeORM uses a cryptic and weird naming convention, it provides a way to override the default naming strategy. That means I can use this feature.
I made a convention for myself. Each key should have a prefix, including the table name it belongs to and the column name it is created on. Let’s give an example of each type of DB key.
Primary Key
PK_User_id
- Representing the primary key id
in the User
table.
Foreign Key
FK_User_roleId_Role_id
- Signifying a foreign key roleId
in the User
table referencing the id
column in the Role
table.
Unique Key
UQ_Flashcard_userId_vocabularyId
- The Flashcard
table has a composite unique key on columns userId
and vocabularyId
.
Index Key
IDX_Product_code
- Here, the Product
table has an index over the column code
.
Project setup
Before writing the codes to design a naming strategy, let’s do the basic setup.
I will use NestJS, PostgreSQL, and TypeORM as usual. I aim to design a simple DB with only three tables: Cohort
, Role
, and User
.
Caption: ER diagram of the DB
I simply followed the TypeORM documentation and managed to create entity classes like the ones below.
@Entity('Cohort')
export class Cohort {
@PrimaryGeneratedColumn('uuid')
id: string;
@Column({ type: 'varchar', nullable: false, unique: true })
name: string;
@OneToMany(() => User, (user) => user.role, { eager: false, cascade: false })
users: User[];
}
@Entity('Role')
@Index(['name'])
export class Role {
@PrimaryGeneratedColumn('uuid')
id: string;
@Column({ type: 'varchar', nullable: false })
name: string;
@Column({ type: 'jsonb', nullable: false })
permission: RolePermission;
@OneToMany(() => User, (user) => user.role, { eager: false, cascade: false })
users: User[];
}
@Entity('User')
@Index(['reference'], {
where: `"deletedAt" IS NULL`,
unique: true,
})
export class User {
@PrimaryGeneratedColumn('uuid')
id: string;
@Column({ type: 'varchar', nullable: false })
name: string;
@Column({ type: 'varchar', nullable: false })
reference: string;
@DeleteDateColumn({ type: 'timestamp with time zone', nullable: true })
deletedAt: Date;
@ManyToOne(() => Cohort, (cohort) => cohort.users, { eager: false, nullable: false })
@JoinColumn({ name: 'cohortId' })
cohort: Cohort;
@ManyToOne(() => Role, (role) => role.users, { eager: false, nullable: false })
@JoinColumn({ name: 'roleId' })
role: Role;
}
After that, I ran commands to generate the migration scripts and create tables in the database.
As I didn’t use any custom naming strategy, the generated constraints are cryptic and non-readable, like the ones below.
-
FK_0b8c60cc29663fa5b9fb108edd7
for theroleId
of theUser
table -
IDX_b852abd9e268a63287bc815aab
for the index over theRole
table’sname
Customizing TypeORM Naming Strategy
But, as I said earlier, we can overcome it. All we need to do is to extend the DefaultNamingStrategy and implement NamingStrategyInterface. BTW, those two links will navigate to the TypeORM’s GitHub repository.
Let’s see the skeleton class first.
import { DefaultNamingStrategy, NamingStrategyInterface } from 'typeorm';
export class DatabaseNamingStrategy extends DefaultNamingStrategy implements NamingStrategyInterface {}
My goal is to override primaryKeyName
, foreignKeyName
, uniqueConstraintName
, indexName
methods one by one.
The simplest one is the primary key naming strategy. Each table can have only one primary key. But, we need to remember that a primary key can be composite. So, my pattern here is, PK_<table-name>_<column-1-name><column-2-name><...>
. Here the prefix PK
, the table name, and the columns are separated by the underscore syntax.
export class DatabaseNamingStrategy extends DefaultNamingStrategy implements NamingStrategyInterface {
primaryKeyName(tableOrName: Table | string, columnNames: string[]): string {
return `PK_${this.getTableName(tableOrName)}_${this.joinColumns(columnNames)}`;
}
private joinColumns(columnNames: string[]): string {
return columnNames.join('_');
}
}
We can design a unique key naming strategy like the primary keys. Let’s do it quickly.
uniqueConstraintName(tableOrName: Table | string, columnNames: string[]): string {
return `UQ_${this.getTableName(tableOrName)}_${this.joinColumns(columnNames)}`;
}
Foreign keys naming strategy is a bit tricky as it involves multiple tables. To tackle this, I plan to attach each table’s name before the group of columns' names, separated by the underscore. That means it would be like FK_<referencing-table-name>_<referencing-column-name>_<referenced-table-name>_<referenced-column-name>
.
Let’s jump into the codes.
foreignKeyName(
referencingTableOrName: Table | string,
referencingColumnNames: string[],
referencedTablePath?: string,
referencedColumnNames?: string[],
): string {
const referencingTableName = this.getTableName(referencingTableOrName);
const referencingReferencedGroup = referencingColumnNames.map((referencingColumn, index) => {
return `${referencingTableName}_${referencingColumn}_${referencedTablePath}_${referencedColumnNames[index]}`;
});
return `FK_${referencingReferencedGroup.join('_')}`;
}
Finally, it is time to design the index key naming strategy. It is almost the same as the primary key naming strategy. And here it is:
indexName(tableOrName: Table | string, columnNames: string[]): string {
return `IDX_${this.getTableName(tableOrName)}_${this.joinColumns(columnNames)}`;
}
The most challenging work has been accomplished. All we need to do is to pass the naming strategy to the TypeORM’s data source like the one below.
import { DatabaseNamingStrategy } from '@/common/persistence/DatabaseNamingStrategy';
export default new DataSource({
// other properties
namingStrategy: new DatabaseNamingStrategy(),
} as DataSourceOptions);
If we drop the DB (it is a proof of concept project, so we’re free to drop it!) and re-run the migration command, we will see our reflected naming strategy for all DB keys.
Congrats, everyone. We made it!
Partial Index
PostgreSQL has a concept of partial index. And I added a partial index over the User
table’s reference
column. It is, at the same a unique constraint. The idea is a user can be archived, and the reference
of the archived user might be allocated to a new user. That is why I created the partial index.
@Index(['reference'], {
where: `"deletedAt" IS NULL`,
unique: true,
})
Below SQL query was generated by the TypeORM.
CREATE UNIQUE INDEX "IDX_User_reference" ON "User" ("reference")
WHERE "deletedAt" IS NULL;
The problem is IDX_User_reference
would give us a false impression that the reference
is a global unique constraint, but it is not true. The same reference could belong to both an archived user and an active user simultaneously, depending on the value of the deletedAt
column.
In my opinion, attaching the WHERE
clause with the index name makes it more readable. Therefore, I modified the indexName
like the one below to consider the conditions.
indexName(tableOrName: Table | string, columnNames: string[], where?: string): string {
let indexName = `IDX_${this.getTableName(tableOrName)}_${this.joinColumns(columnNames)}`;
if (where) {
const suffix = this.getPartialIndexNameSuffix(tableOrName, columnNames, where);
indexName = `${indexName}_${suffix}`;
}
return indexName;
}
private getPartialIndexNameSuffix(tableOrName: Table | string, columnNames: string[], where: string): string {
const whereClauseMap: Record<string, string> = {
'"deletedAt" IS NULL': `deletedAt_IS_NULL`,
};
if (whereClauseMap[where]) {
return `WHERE_${whereClauseMap[where]}`;
}
const generatedIndexName = super.indexName(tableOrName, columnNames, where);
const { 1: hash } = generatedIndexName.split('IDX_');
return `WHERE_${hash}`;
}
The generated query is like the one below after applying the new strategy, which is our expected one.
CREATE UNIQUE INDEX "IDX_User_reference_WHERE_deletedAt_IS_NULL" ON "User" ("reference")
WHERE "deletedAt" IS NULL;
Showtime! Look at the attachment below to view the names of all the constraints.
Caption: Database keys
This DB has only one partial index, so my naming strategy method considers only a single condition, and it is "deletedAt" IS NULL
. For every other case, it will use the default naming strategy. For example, if there is a table named Product
and we want to apply index over the price
column only if its price is more than 50, instead of returning a readable name like IDX_Product_price_WHERE_GREATER_THAN_50
, it will, unfortunately, return us IDX_Product_price_WHERE_ac51a34467871db246f40ee2be
.
The summary here is we need to treat partial index case by case. It is difficult to make a generic solution.
Limitations
The finished DatabaseNamingStrategy
class can be found in the GitHub repository.
If you carefully check the codes, you will see I used only the first 63
characters of the generated names. The reason is that PostgreSQL does not allow a constraint with more than 63 characters. The details are in their official documentation.
Therefore, the SQL query below generates the foreign key FK_WebhookConfiguration_integrationConfigurationId_IntegrationC
.
ALTER TABLE "WebhookConfiguration"
ADD CONSTRAINT "FK_WebhookConfiguration_integrationConfigurationId_IntegrationConfiguration_id" FOREIGN KEY ("integrationConfigurationId") REFERENCES "IntegrationConfiguration" ("id") ON DELETE NO ACTION ON UPDATE NO ACTION;
This is also an exception, like the previously mentioned index name on the price column. Therefore, it needs to be handled in a specific way beyond the generic convention I designed.
Before saying Goodbye
You are welcome to clone the POC project and run it locally. I have added unit tests so that it is easier to get the idea by playing with it.
We should write automated integration tests to ensure whenever there is a change in the DB, it follows the custom naming strategy. Also, we can drop the DB or re-create keys after deleting it for a brand-new project, but there is no way to do it in a live project. So, we need to rename the existing constraints to ensure it follows the given strategy. That is why automated tests for all keys of all tables are essential.
Stay in touch! I will publish two more parts to cover those two scenarios.
Thank you for taking the time to read through this, and I hope you found it insightful and enjoyable!
NB: The cover image is taken from one of my favourite books The Manga Guide to Databases.
Posted on August 6, 2023
Join Our Newsletter. No Spam, Only the good stuff.
Sign up to receive the latest update from our blog.