Quoc-Hung Hoang
Posted on July 30, 2022
TL; DR
When you reference to a model field for read/write operations, let's use F() expression
- Help refer to model field directly in the database, no need to load it into Python memory -> save queries.
- Can help avoid race condition or dirty read.
- Need to refresh_from_db after query because Python only knows about SQL expression instead of actual result.
Bulk update
Suppose the government in your country raise tax rate by 5% which makes you have to raise your listing product price by 20%. What would your django query look like ?
class Product(models.Model):
name = models.TextField()
price = models.DecimalField()
in_stock = models.IntegerField(
help_text="Number of items available in inventory"
)
A naive implementation of updating multiple products may be like this:
products = Product.objects.all()
for product in products:
product.price *= 1.2
product.save()
In this case, you are doing SELECT price FROM product
then UPDATE product SET price = new_value WHERE condition
each record. It means 2 query (1 for READ and 1 for WRITE) for each object.
Think of it more carefully, we can realize that the new price is relative to the current price no matter what it is. Intuitively, we want to reference to price
field of Product
model when running update process.
And here it comes, F() expression. The Django official doc states:
An F() object represents the value of a model field, transformed value of a model field, or annotated column.
It makes it possible to refer to model field values and perform database operations using them without actually having to pull them out of the database into Python memory.
Let's try the problem with F()
and update()
queryset method
from django.db.models import F
Product.objects.update(price=F("price")*1.2)
Although the above query looks like a normal Python assignment of value to an instance attribute, in fact it is a SQL expression. This expression instruct database to multiply the price field in database by 120 percent.
New price value is based on current price value so we don't need to load it into Python memory. That's why F() comes into play.
Update a single object
Let's say you want to update in_stock
field after every order payment is completed.
A naive implementation may be like this:
def process_payment(product: Product):
with transaction.atomic():
payment = Payment.objects.create(product=product)
product.in_stock = product.in_stock - 1
product.save(update_fields=["in_stock"])
So what's the problem ?
Let's imagine there are multiple users trying to make orders for a product, the scenario looks like this:
Process 1 | Process 2 | in_stock |
---|---|---|
Select in_stock -> 5 |
5 | |
Select in_stock -> 5 |
5 | |
Update in_stock = 5-1 |
4 | |
Update in_stock = 5-1 |
4 |
In this case, two processes are updating product.in_stock
at the same time but in_stock
value just decrease by 1. That's incorrect.
The main issue is that you decrease in_stock
based on what you fetched, what if you give database an instruction to update in_stock
based on what is currently stored ?
def process_payment(product: Product):
with transaction.atomic():
payment = Payment.objects.create(product=product)
product.in_stock = F("in_stock") - 1
product.save(update_fields=["in_stock"]])
The difference between two approach is quite small but let's look at SQL generated by update commands:
The naive approach:
UPDATE product_product
SET in_stock = 4
WHERE id = 262;
This will decrease the quantity = 4 regardless of the current value of in_stock in database
The F() approach:
UPDATE product_product
SET in_stock = in_stock + 1
WHERE id = 262;
The quantity of product with id 262 will reduce by 1 and not set by a fixed value. This is how to use an F expression to solve the race condition problem.
Note
The F() object which is assign to model field persist after saving model instance and will be applied on each save()
so we need to refresh_from_db
to get the updated instace.
Try to read an instance without refreshing from database may lead to unexpected result:
In [12]: product = Product.objects.get(id=262)
In [13]: product.in_stock = F("in_stock") - 1
In [14]: product.save()
In [15]: product.in_stock
Out[15]: <CombinedExpression: F(in_stock) - Value(1)>
In [16]:
Summary
Throughout the article, we pointed out two use cases of F() expression
- Reduce the number of queries some operations require by getting the database, rather than Python, to do work.
- Avoid race condition when two process retrieve and update the same instance.
Posted on July 30, 2022
Join Our Newsletter. No Spam, Only the good stuff.
Sign up to receive the latest update from our blog.