Tips and tricks for optimizing the performance of Django ORM
Syed Mubtada Ali
Posted on May 3, 2023
Django is a popular web framework for building complex, data-driven applications, and its Object-Relational Mapping (ORM) system is a crucial component for interacting with databases. However, as your application grows in complexity and scale, it’s essential to ensure that your ORM queries are optimized for performance. In this article, we’ll explore some tips and tricks for optimizing the performance of Django ORM so that your application can handle larger volumes of data and operate more efficiently. These tips will help you get the most out of your database queries and keep your application running smoothly.
only()
Suppose your code reads only two fields from each object. When Django ORM translates the Python code into SQL, it still selects every column of the model, even those you never use. To avoid retrieving unnecessary data, use the .only() method and name the fields you actually need. Accessing a field that was not listed still works, but it triggers an extra query for that object, so list every field the code touches.
prefetch_related / select_related
When Django evaluates a QuerySet, related objects are not included in the query, so when you access them, Django queries the database again to get their values. Accessing a related field in a loop therefore results in N+1 queries: one for the queryset itself and one more for each object. You can check the queries executed on the database using tools like django-debug-toolbar and django-silk.
Consider a loop over an option queryset that reads each option's question, division, and concerns. The first query fetches the options, and then three more queries run for every option: one each for question, division, and concerns. To reduce the number of queries, add question and division to select_related and concerns to prefetch_related.
With these changes, only 2 queries are executed regardless of the number of options: one for the options (joined with their question and division) and one for the concerns. Without them, 100 options would cost 301 (3N + 1) queries.
One important thing to understand is that, after adding prefetch_related, calling queryset methods such as first(), last(), or order_by() on the prefetched relation generally bypasses the prefetched results and runs a new query. If you want the first concern, use option.concerns.all()[0] instead of option.concerns.first(). Use a Prefetch() object only when you need additional filtering or ordering; otherwise, a simple prefetch_related("concerns") is enough.
The basic difference between select_related and prefetch_related is that select_related performs a SQL JOIN for each lookup and returns the related objects in the same query, at the cost of widening the SELECT to include the columns of every joined table. prefetch_related, on the other hand, runs a separate query for each relation, filters it with a WHERE ... IN clause on the keys from the previous query, and joins the results in Python. select_related works for single-valued relations, that is, ForeignKey and OneToOneField, where the foreign key column is in the table being queried (it can also follow a OneToOneField in the reverse direction). When the foreign key lives in the other table (a reverse ForeignKey), or for many-to-many relations, use prefetch_related.
Queryset Evaluation and Caching
To understand ORM evaluation, consider a simple example: a queryset is built with filter(), narrowed with a second filter() and stored as active_staff, and then passed to len(active_staff). It might look like this performs two database queries, but it performs only one, and that query runs on the last step, len(active_staff).
Creating a queryset doesn’t perform any database activity, and neither does chaining filters onto it. The results are fetched from the database only when something asks for them, such as iteration, len(), list(), repr(), bool(), indexing, or slicing with a step. This is called evaluation.
Caching allows you to avoid making multiple database queries when reusing the same queryset. The first time a QuerySet is fully evaluated, Django stores the results in a cache attached to that QuerySet object. When you iterate over that same object again, Django uses the cached results instead of running another query. The cache belongs to the object, not to the query: building a new QuerySet with identical filters hits the database again.
When the same queryset expression is written out twice and each copy is evaluated, the same query is executed twice on the database. To avoid this, assign the QuerySet to a variable and reuse it.
in_bulk()
This function will make your life easier if you need a dict that maps keys to the objects of a queryset. It returns a dictionary of objects keyed by primary key, built from a list of primary key values, or keyed by another unique field that you name with the field_name argument.
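Instead of building the dictionary by hand, for example with a dict comprehension over a filtered queryset, call in_bulk() with the list of keys.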
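The in_bulk() version is cleaner, and for a typical list of IDs the whole lookup is a single query that returns the dictionary ready to use. Note that, unlike most queryset methods, in_bulk() is not lazy: it hits the database as soon as it is called.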
For those who are passionate about writing neat and organized code and haven’t had the chance to go through “Clean Code” by Robert C. Martin, this article provides a helpful summary of the book’s first half and key takeaways. It’s worth reading if you’re looking to enhance your coding skills and principles.
Lessons from “Clean Code”
I hope that you have learned something fresh and insightful. If our interests align, please consider subscribing. You can also reach out to me on LinkedIn.
Thank you for your time and attention.