Tips and tricks for optimizing the performance of Django ORM

mubtadaali

Syed Mubtada Ali

Posted on May 3, 2023

Django is a popular web framework for building complex, data-driven applications, and its Object-Relational Mapping (ORM) system is a crucial component for interacting with databases. However, as your application grows in complexity and scale, it’s essential to ensure that your ORM queries are optimized for performance. In this article, we’ll explore some tips and tricks for optimizing the performance of Django ORM so that your application can handle larger volumes of data and operate more efficiently. These tips will help you get the most out of your database queries and keep your application running smoothly.

only()

without only()

In the above code, only two fields are used. Still, when Django ORM translates the Python code into SQL, it retrieves every field of the model, even though most of that data is never needed. To avoid the unnecessary retrieval, use the .only() method and explicitly name the fields that are required. Keep in mind that the omitted fields are deferred, so reading one of them later triggers an additional query for that object.

with only()

prefetch_related / select_related

When Django evaluates a QuerySet, related objects are not fetched along with it, so the first time you access a relationship field Django queries the database again to load it. Accessing a related field inside a loop therefore results in N+1 queries: one for the queryset and one more for each of its N rows. You can inspect the queries executed on the database with tools like django-debug-toolbar and django-silk.

without select related

In the above code, the first query is executed on the option queryset, and then three more queries are executed for every option: one each for question, division, and concern. To reduce the number of queries, we can add question and division to select_related and concern to prefetch_related.

with select related

This code will execute only 2 queries on the database regardless of the number of options: one for options and the second for concerns. In the previous version, if there were 100 options, it would perform 301 (3N + 1) queries on the database.

One important thing to understand is that the prefetched results are only used when you read the relation through all(). Calling ORM methods such as first(), last(), or order_by() on the related manager generally builds a new queryset, which runs another query and bypasses the prefetched data. If you want the first object of concern, use option.concerns.all()[0] instead of option.concerns.first(). Use Prefetch() only when you need additional filtering or ordering; otherwise, a simple prefetch_related("concern") is enough.

The basic difference between select_related and prefetch_related is that select_related performs a SQL JOIN for each lookup and gets the results back in the same query, extending the SELECT to include the columns of all joined tables. prefetch_related, on the other hand, performs a separate query for each relationship, filters it with a WHERE ... IN clause, and joins the results in Python. select_related works for single-valued relations: ForeignKey (many-to-one) fields defined on the model being queried, and one-to-one relations in either direction. When the foreign key lives on the other table (a reverse foreign key, that is, a one-to-many relation), or for many-to-many relations, use prefetch_related.

Queryset Evaluation and Caching

Take a look at the following simple example to understand ORM evaluation:

Evaluation

It might look like it performed two database queries, but it actually performed only a single query, which was executed on the last line, len(active_staff).

Creating a queryset doesn’t perform any database activity, even when filters are stacked on top of one another. The results are only fetched from the database when an action requires them, such as iteration, len(), list(), bool(), repr(), or slicing with a step. This is called evaluation.

Caching allows you to avoid making multiple database queries when reusing the same queryset. The first time a QuerySet object is evaluated, Django stores the results in that object's internal cache. Whenever you evaluate that same object again, Django uses the cached results instead of sending another query to the database. Note that the cache belongs to the QuerySet object itself: building an identical queryset a second time creates a new object with an empty cache and hits the database again.

without caching

In the above example, the same query is executed twice on the database. To avoid this problem, save the QuerySet and reuse it:

with cache

in_bulk()

This function makes your life easier if you need a dict mapping of a queryset. For a given list of values, it returns a dictionary of objects keyed by primary key, or by another unique field passed through the field_name argument.

Instead of:

without in-bulk

Use this:

with in-bulk

The code is much cleaner, and Django builds the whole mapping for you, normally with a single query.


For those who are passionate about writing neat and organized code and haven’t had the chance to go through “Clean Code” by Robert C. Martin, this article provides a helpful summary of the book’s first half and key takeaways. It’s worth reading if you’re looking to enhance your coding skills and principles.


I hope that you have learned something fresh and insightful. If our interests align, please consider subscribing. You can also reach out to me on LinkedIn.

Thank you for your time and attention.
