A journey to simplify debugging: Automate generating human-friendly database constraints using TypeORM

emtiajium

Emtiaj Hasan

Posted on August 6, 2023

A journey to simplify debugging: Automate generating human-friendly database constraints using TypeORM

One fine morning, I watched the observability service DataDog and noticed the database-related errors below.

  • insert or update on table "User" violates foreign key constraint "FK_eebe9258d224f7861dd124c1814"

  • duplicate key value violates unique constraint "UQ_b0f76d1d1fc4ef7f1e3b627574d"

  • duplicate key value violates unique constraint "PK_e03f3cb1b058a7463f8278a4f97"

Let’s try to analyze errors so that we can take action accordingly.

The first error message hints that we were trying to insert a user with an invalid value to a particular column, referencing another table. But we need to find out which column caused the error.

The Prefix UQ suggests that a row in the table already contains the same value we were trying to insert. Does it show for which column? Unfortunately, no.

The third message is the same as the unique constraint. It happened for the primary key. But it does not give a clue about the name of the primary column.

To get an elaborate idea, I turned to the database administrator tool, checked the entity-relationship diagram, and reviewed all the database constraints to decipher these error messages.

This process left me exasperated and seemed like a waste of time. But then a thought struck me – what if database constraints were more human-readable and provided meaningful information for easier debugging?

I started thinking about applying a convention on the DB keys.


Readable Naming Convention

I use PostgreSQL backed by TypeORM. The keys mentioned here were (auto)generated by the TypeORM. Although TypeORM uses a cryptic and weird naming convention, it provides a way to override the default naming strategy. That means I can use this feature.

I made a convention for myself. Each key should have a prefix, including the table name it belongs to and the column name it is created on. Let’s give an example of each type of DB key.

Primary Key

PK_User_id - Representing the primary key id in the User table.

Foreign Key

FK_User_roleId_Role_id - Signifying a foreign key roleId in the User table referencing the id column in the Role table.

Unique Key

UQ_Flashcard_userId_vocabularyId - The Flashcard table has a composite unique key on columns userId and vocabularyId.

Index Key

IDX_Product_code - Here, the Product table has an index over the column code.


Project setup

Before writing the codes to design a naming strategy, let’s do the basic setup.

I will use NestJS, PostgreSQL, and TypeORM as usual. I aim to design a simple DB with only three tables: Cohort, Role, and User.

ER diagram of the DB

Caption: ER diagram of the DB

I simply followed the TypeORM documentation and managed to create entity classes like the ones below.



@Entity('Cohort')
export class Cohort {
   @PrimaryGeneratedColumn('uuid')
   id: string;


   @Column({ type: 'varchar', nullable: false, unique: true })
   name: string;


   @OneToMany(() => User, (user) => user.role, { eager: false, cascade: false })
   users: User[];
}


Enter fullscreen mode Exit fullscreen mode


@Entity('Role')
@Index(['name'])
export class Role {
   @PrimaryGeneratedColumn('uuid')
   id: string;


   @Column({ type: 'varchar', nullable: false })
   name: string;


   @Column({ type: 'jsonb', nullable: false })
   permission: RolePermission;


   @OneToMany(() => User, (user) => user.role, { eager: false, cascade: false })
   users: User[];
}


Enter fullscreen mode Exit fullscreen mode


@Entity('User')
@Index(['reference'], {
   where: `"deletedAt" IS NULL`,
   unique: true,
})
export class User {
   @PrimaryGeneratedColumn('uuid')
   id: string;


   @Column({ type: 'varchar', nullable: false })
   name: string;


   @Column({ type: 'varchar', nullable: false })
   reference: string;


   @DeleteDateColumn({ type: 'timestamp with time zone', nullable: true })
   deletedAt: Date;


   @ManyToOne(() => Cohort, (cohort) => cohort.users, { eager: false, nullable: false })
   @JoinColumn({ name: 'cohortId' })
   cohort: Cohort;


   @ManyToOne(() => Role, (role) => role.users, { eager: false, nullable: false })
   @JoinColumn({ name: 'roleId' })
   role: Role;
}


Enter fullscreen mode Exit fullscreen mode

After that, I ran commands to generate the migration scripts and create tables in the database.

As I didn’t use any custom naming strategy, the generated constraints are cryptic and non-readable, like the ones below.

  • FK_0b8c60cc29663fa5b9fb108edd7 for the roleId of the User table
  • IDX_b852abd9e268a63287bc815aab for the index over the Role table’s name

Customizing TypeORM Naming Strategy

But, as I said earlier, we can overcome it. All we need to do is to extend the DefaultNamingStrategy and implement NamingStrategyInterface. BTW, those two links will navigate to the TypeORM’s GitHub repository.

Let’s see the skeleton class first.



import { DefaultNamingStrategy, NamingStrategyInterface } from 'typeorm';


export class DatabaseNamingStrategy extends DefaultNamingStrategy implements NamingStrategyInterface {}


Enter fullscreen mode Exit fullscreen mode

My goal is to override primaryKeyName, foreignKeyName, uniqueConstraintName, indexName methods one by one.

The simplest one is the primary key naming strategy. Each table can have only one primary key. But, we need to remember that a primary key can be composite. So, my pattern here is, PK_<table-name>_<column-1-name><column-2-name><...>. Here the prefix PK, the table name, and the columns are separated by the underscore syntax.



export class DatabaseNamingStrategy extends DefaultNamingStrategy implements NamingStrategyInterface {
   primaryKeyName(tableOrName: Table | string, columnNames: string[]): string {
       return `PK_${this.getTableName(tableOrName)}_${this.joinColumns(columnNames)}`;
   }


   private joinColumns(columnNames: string[]): string {
       return columnNames.join('_');
   }
}


Enter fullscreen mode Exit fullscreen mode

We can design a unique key naming strategy like the primary keys. Let’s do it quickly.



uniqueConstraintName(tableOrName: Table | string, columnNames: string[]): string {
   return `UQ_${this.getTableName(tableOrName)}_${this.joinColumns(columnNames)}`;
}


Enter fullscreen mode Exit fullscreen mode

Foreign keys naming strategy is a bit tricky as it involves multiple tables. To tackle this, I plan to attach each table’s name before the group of columns' names, separated by the underscore. That means it would be like FK_<referencing-table-name>_<referencing-column-name>_<referenced-table-name>_<referenced-column-name>.

Let’s jump into the codes.



foreignKeyName(
   referencingTableOrName: Table | string,
   referencingColumnNames: string[],
   referencedTablePath?: string,
   referencedColumnNames?: string[],
): string {
   const referencingTableName = this.getTableName(referencingTableOrName);


   const referencingReferencedGroup = referencingColumnNames.map((referencingColumn, index) => {
       return `${referencingTableName}_${referencingColumn}_${referencedTablePath}_${referencedColumnNames[index]}`;
   });


   return `FK_${referencingReferencedGroup.join('_')}`;
}


Enter fullscreen mode Exit fullscreen mode

Finally, it is time to design the index key naming strategy. It is almost the same as the primary key naming strategy. And here it is:



indexName(tableOrName: Table | string, columnNames: string[]): string {
   return `IDX_${this.getTableName(tableOrName)}_${this.joinColumns(columnNames)}`;
}


Enter fullscreen mode Exit fullscreen mode

The most challenging work has been accomplished. All we need to do is to pass the naming strategy to the TypeORM’s data source like the one below.



import { DatabaseNamingStrategy } from '@/common/persistence/DatabaseNamingStrategy';

export default new DataSource({
   // other properties
   namingStrategy: new DatabaseNamingStrategy(),
} as DataSourceOptions);


Enter fullscreen mode Exit fullscreen mode

If we drop the DB (it is a proof of concept project, so we’re free to drop it!) and re-run the migration command, we will see our reflected naming strategy for all DB keys.

Congrats, everyone. We made it!


Partial Index

PostgreSQL has a concept of partial index. And I added a partial index over the User table’s reference column. It is, at the same a unique constraint. The idea is a user can be archived, and the reference of the archived user might be allocated to a new user. That is why I created the partial index.



@Index(['reference'], {
   where: `"deletedAt" IS NULL`,
   unique: true,
})


Enter fullscreen mode Exit fullscreen mode

Below SQL query was generated by the TypeORM.



CREATE UNIQUE INDEX "IDX_User_reference" ON "User" ("reference")
   WHERE "deletedAt" IS NULL;


Enter fullscreen mode Exit fullscreen mode

The problem is IDX_User_reference would give us a false impression that the reference is a global unique constraint, but it is not true. The same reference could belong to both an archived user and an active user simultaneously, depending on the value of the deletedAt column.

In my opinion, attaching the WHERE clause with the index name makes it more readable. Therefore, I modified the indexName like the one below to consider the conditions.



indexName(tableOrName: Table | string, columnNames: string[], where?: string): string {
   let indexName = `IDX_${this.getTableName(tableOrName)}_${this.joinColumns(columnNames)}`;


   if (where) {
       const suffix = this.getPartialIndexNameSuffix(tableOrName, columnNames, where);
       indexName = `${indexName}_${suffix}`;
   }


   return indexName;
}

private getPartialIndexNameSuffix(tableOrName: Table | string, columnNames: string[], where: string): string {
   const whereClauseMap: Record<string, string> = {
       '"deletedAt" IS NULL': `deletedAt_IS_NULL`,
   };


   if (whereClauseMap[where]) {
       return `WHERE_${whereClauseMap[where]}`;
   }


   const generatedIndexName = super.indexName(tableOrName, columnNames, where);
   const { 1: hash } = generatedIndexName.split('IDX_');


   return `WHERE_${hash}`;
}


Enter fullscreen mode Exit fullscreen mode

The generated query is like the one below after applying the new strategy, which is our expected one.



CREATE UNIQUE INDEX "IDX_User_reference_WHERE_deletedAt_IS_NULL" ON "User" ("reference")
   WHERE "deletedAt" IS NULL;



Enter fullscreen mode Exit fullscreen mode

Showtime! Look at the attachment below to view the names of all the constraints.

Database keys

Caption: Database keys


This DB has only one partial index, so my naming strategy method considers only a single condition, and it is "deletedAt" IS NULL. For every other case, it will use the default naming strategy. For example, if there is a table named Product and we want to apply index over the price column only if its price is more than 50, instead of returning a readable name like IDX_Product_price_WHERE_GREATER_THAN_50, it will, unfortunately, return us IDX_Product_price_WHERE_ac51a34467871db246f40ee2be.

The summary here is we need to treat partial index case by case. It is difficult to make a generic solution.


Limitations

The finished DatabaseNamingStrategy class can be found in the GitHub repository.

If you carefully check the codes, you will see I used only the first 63 characters of the generated names. The reason is that PostgreSQL does not allow a constraint with more than 63 characters. The details are in their official documentation.

Therefore, the SQL query below generates the foreign key FK_WebhookConfiguration_integrationConfigurationId_IntegrationC.



ALTER TABLE "WebhookConfiguration"
   ADD CONSTRAINT "FK_WebhookConfiguration_integrationConfigurationId_IntegrationConfiguration_id" FOREIGN KEY ("integrationConfigurationId") REFERENCES "IntegrationConfiguration" ("id") ON DELETE NO ACTION ON UPDATE NO ACTION;


Enter fullscreen mode Exit fullscreen mode

This is also an exception, like the previously mentioned index name on the price column. Therefore, it needs to be handled in a specific way beyond the generic convention I designed.


Before saying Goodbye

You are welcome to clone the POC project and run it locally. I have added unit tests so that it is easier to get the idea by playing with it.

We should write automated integration tests to ensure whenever there is a change in the DB, it follows the custom naming strategy. Also, we can drop the DB or re-create keys after deleting it for a brand-new project, but there is no way to do it in a live project. So, we need to rename the existing constraints to ensure it follows the given strategy. That is why automated tests for all keys of all tables are essential.

Stay in touch! I will publish two more parts to cover those two scenarios.

Thank you for taking the time to read through this, and I hope you found it insightful and enjoyable!

NB: The cover image is taken from one of my favourite books The Manga Guide to Databases.

💖 💪 🙅 🚩
emtiajium
Emtiaj Hasan

Posted on August 6, 2023

Join Our Newsletter. No Spam, Only the good stuff.

Sign up to receive the latest update from our blog.

Related