Advanced Django Queries you should know
Usman Abass
Posted on May 17, 2022
When working on large applications, especially those that involve high database computation like financial technology products with banking or payments generally, the need to write optimised and efficient queries becomes very crucial to the success of the business. This article covers some common advanced queries you should know as a Django developer.
PREREQUISITES
In order to understand this article, you should have some basic to mid-level knowledge of the following (although you can still read if you're just curious about how database queries work):
- Python
- Object Oriented Programming
- Object Relational Mapping (ORM) and Django models
- Querysets
For demonstration sake, the following models will be used:
The models used are Footballer, Team and Competition. A footballer belongs to one team, and a team has many footballers (ManytoOne relationship). Also, a footballer can participate in many competitions and a competition can have many footballers participating in it (ManyToMany relationship).
Note: The relationship can work between Team and Competition. This is just for demonstration sake.
For convenience, the database has been prepopulated with dummy data as shown below:
Now that we've gotten that out of the way, let's move on to learning about some of these queries.
"F" EXPRESSIONS
The query below gets all footballers who have scored at least 20 goals or more in a season.
The number "20" in the above query is a constant. However, what if you want to base your search on another field in the model (or table for simplicity)? For instance, You want to get all players who have assisted more than they've scored in the season. That is where "F" expressions come in. An instance of F() serves as a reference to a field in the model. An example is shown below:
By using "F" expressions, you are able to compare a field in the model with another field.
"Q" FOR COMPLEX QUERIES
The query below gets all footballers that are above 30 and have scored over 20 goals in a season:
By default, keyword arguments in the filter() method are evaluated as "ANDs". So the above query is equivalent to writing the following SQL Query:
" SELECT * FROM Fotballer WHERE age > 30 AND season_goals > 20 ".
But what if you want to perform complex queries with OR instead of AND? Suppose you want to get all players that are above 30 OR have scored over 20 goals in a season. See the query below:
Notice how Mohammed Salah has been added to the queryset result because he is below 30. This is a simple example of, but not limited to what you can use "Q" for. You can also perform complex and queries by using & instead of | that was used above. It is important to note that you can use ~ alongside "Q" to perform negation. See the query below:
The query above selects all players that are NOT up to 30, but hit double figures in either goals scored or assists given. This is the SQL equivalent of:
" SELECT * FROM Footballer WHERE NOT age > 30 AND (season_goals >= 10 OR season_assists > 10) ".
ANNOTATION
Sometimes you want to perform additional calculations or computations on results from a query. Although this can be done after obtaining the results of the query using python, it's best to allow the database to handle the computations for performance reasons. For instance, if want to get the goal contributions (which would be a sum of both the goals scored and assists given) for each footballer, this would be an excellent use case for annotations. See the query below:
As shown above, the goals and assists are summed up in the database and returned as the field name specified (goal_contributions). This is more optimal than fetching all footballers and then having to loop through the queryset to perform the sum.
AGGREGATION
While annotations perform computations on each object in the queryset, aggregations perform computations on the ENTIRE queryset. The result of the aggregate() method is a dictionary with each argument in the method serving as the dictionary's key. For instance, if you want to get a total of all goals scored by all footballers, you use the query below:
Notice that the result of the query is a dictionary with the key being the method's argument. "Sum" in the query is called an aggregation function and there are several of those. You should also observe that annotations are therefore aggregations on EACH object in the queryset.
select_related()
This can be an incredible performance booster when used correctly. The select_related() method returns a queryset that fetches the related model by performing a JOIN on the related field. Consider the two queries below:
As shown, the first query hits the database twice while the second query hits the database just once. This seemingly little difference can be a huge performance booster when dealing with very large datasets.
prefetch_related()
Similar to the select_related() method, the prefetch_related() is also used to mitigate the overhead resulting from joining multiple tables. The prefetch_related() method does a separate lookup for each relationship and performs the JOIN in python. This means that the selected_related() method can only be used for Foreign Key and OneToOne relationships.
The code below demonstrates the use of prefetch_related():
One thing to note about prefetch_related() however, is that performing a subquery such as filter on the resulting queryset would lead to the database being queried again and defeating the entire purpose of prefetch_related(). Read more on this at:
https://docs.djangoproject.com/en/3.2/ref/models/querysets/#prefetch-related
Thank you for reading!!!
Resources:
https://docs.djangoproject.com/en/3.2/ref/models/querysets/
https://docs.djangoproject.com/en/3.2/topics/db/queries/
Posted on May 17, 2022
Join Our Newsletter. No Spam, Only the good stuff.
Sign up to receive the latest update from our blog.
Related
November 30, 2024