Adding an Index in Django Without Locking Your Database Table

karthiknarayan

KARTHIK NARAYAN

Posted on July 29, 2024

Adding an Index in Django Without Locking Your Database Table

Adding an index to a table in Django is usually easy peasy. Just use makemigrations and migrate and you're done, right? But what if I told you that doing it the usual way locks your table until the index is done being created? Yikes! That's a problem, especially if your table is huge or your app is live.

The Usual Way (That Locks Your Table)

Normally, you'd do something like this to add an index:

from django.db import migrations, models

class Migration(migrations.Migration):

    dependencies = [
        ('api', '0004_alter_user_price'),
    ]

    operations = [
        migrations.AddIndex(
            model_name='user',
            index=models.Index(fields=['email'], name='api_user_email_a7eefd_idx'),
        ),
    ]
Enter fullscreen mode Exit fullscreen mode

When you run this migration, it generates SQL like this:

BEGIN;
CREATE INDEX "api_user_email_a7eefd_idx" ON "api_user" ("email");
COMMIT;
Enter fullscreen mode Exit fullscreen mode

This locks your table until the index is created. Not cool!

Nightmare Experience

Migrated a Production table in traditional way, which locked a table for 7 minutes 😒.

The Solution: Background Indexing

Lucky for us, modern databases like PostgreSQL, SQL Server, and Oracle can create indexes in the background, so they don't lock the table. But Django doesn't use this feature out of the box. No worries, though, we've got a trick: SeparateDatabaseAndState.

Step-by-Step Guide

  • Create an Traditional Migration File
   python manage.py makemigrations app_name
Enter fullscreen mode Exit fullscreen mode
  • Create an Empty Migration File

Run this command to create an empty migration file:

   python manage.py makemigrations --empty --name custom_migration_name app_name
Enter fullscreen mode Exit fullscreen mode
  • Add SeparateDatabaseAndState

Open the new migration file and add SeparateDatabaseAndState like this:

   from django.db import migrations, models

   class Migration(migrations.Migration):

       dependencies = [
           ('app_name', 'prev_migration_name'),
       ]

       operations = [
           migrations.SeparateDatabaseAndState(
               state_operations=[],
               database_operations=[],
           ),
       ]
Enter fullscreen mode Exit fullscreen mode
  • Copy Migration Operations and SQL From Traditional File

From migration file generated by Django, copy the operations and paste them in state_operations.
Use sqlmigrate command to generate SQL Statement for the traditional migration file.

   python manage.py sqlmigrate app_name migration_name
Enter fullscreen mode Exit fullscreen mode

Update new migration file and delete traditional migration file,

   from django.db import migrations, models

   class Migration(migrations.Migration):

       dependencies = [
           ('api', '0004_alter_user_price'),
       ]

       operations = [
           migrations.SeparateDatabaseAndState(
               state_operations=[
                   migrations.AddIndex(
                       model_name='user',
                       index=models.Index(fields=['email'], name='api_user_email_a7eefd_idx'),
                   ),
               ],
               database_operations=[
                   migrations.RunSQL(
                       sql="""
                       BEGIN;
                       CREATE INDEX "api_user_email_a7eefd_idx" ON "api_user" ("email");
                       COMMIT;
                       """
                   ),
               ],
           ),
       ]
Enter fullscreen mode Exit fullscreen mode
  • Update SQL with background indexing and Add Reverse SQL

Now, let's actually add the background indexing. Update the database_operations like this:

   from django.db import migrations, models

   class Migration(migrations.Migration):

       dependencies = [
           ('api', '0004_alter_user_price'),
       ]

       operations = [
           migrations.SeparateDatabaseAndState(
               state_operations=[
                   migrations.AddIndex(
                       model_name='user',
                       index=models.Index(fields=['email'], name='api_user_email_a7eefd_idx'),
                   ),
               ],
               database_operations=[
                   migrations.RunSQL(
                       sql="""
                       CREATE INDEX CONCURRENTLY "api_user_email_a7eefd_idx" ON "api_user" ("email");
                       """,
                       reverse_sql="DROP INDEX "api_user_email_a7eefd_idx",
                   ),
               ],
           ),
       ]
Enter fullscreen mode Exit fullscreen mode

The magic word here is CONCURRENTLY. It tells PostgreSQL to create the index without locking the table. If you're using SQL Server or Oracle, you'd use:

  • SQL Server: CREATE INDEX index_name ON table_name (column_name) WITH (ONLINE = ON);
  • Oracle: CREATE INDEX index_name ON table_name (column_name) ONLINE;

That's It!

By using SeparateDatabaseAndState, you can add indexes without locking your tables. No more downtime or slow queries during index creation. Your users (and your database) will thank you!


Documentation Reference.

Happy codingπŸ‘¨β€πŸ’»πŸ!

πŸ’– πŸ’ͺ πŸ™… 🚩
karthiknarayan
KARTHIK NARAYAN

Posted on July 29, 2024

Join Our Newsletter. No Spam, Only the good stuff.

Sign up to receive the latest update from our blog.

Related