How to optimize your Django application for speed with the ORM

When people use web applications, they expect pages to load quickly and perform well. As a result, speed has become one of the top metrics people use to judge the quality of a web application.

Here are a few figures that show why performance matters:

According to Kissmetrics, 47% of consumers expect a web page to load in 2 seconds or less.
This same study also found that 40% of people abandon websites that take more than 3 seconds to load.
A one-second slowdown in page load time could reportedly cost Amazon $1.6 billion in sales a year.

So how can a Django application load faster? Here are some ways we can optimize …

First of all, we need an example Django model:

from django.db import models


# Category and Tag are assumed to be defined elsewhere in the same app.
class Product(models.Model):
    name = models.CharField(max_length=250)
    description = models.TextField(blank=True)

    # SET_NULL requires a nullable column, so the field allows null/blank.
    category = models.ForeignKey(Category, on_delete=models.SET_NULL, blank=True, null=True)
    tags = models.ManyToManyField(Tag)

    created_at = models.DateTimeField(auto_now_add=True)

    def __str__(self):
        return self.name


class Photo(models.Model):
    image = models.ImageField(upload_to='product')
    product = models.ForeignKey(Product, related_name="photos", on_delete=models.CASCADE)

#1 Optimize SQL query with select_related and prefetch_related

Optimizing SQL queries by fetching related objects up front is one of the most common techniques in Django, and you will see it everywhere. Django offers two options for this: select_related and prefetch_related.

Product.objects.select_related('category')
...

Product.objects.prefetch_related('photos')
...

select_related: select_related works for ForeignKey (many-to-one) and OneToOneField relations. category is a ForeignKey. Imagine you have a queryset of products and need the category of each one, so you loop over the products and access product.category. Without select_related, Django runs an extra query for every product: with 100 products you get 1 query for the products plus 100 for their categories. With select_related, Django fetches the category of every product in the same query by using a SQL JOIN.

prefetch_related: prefetch_related serves the same purpose as select_related, but it also covers the relations select_related cannot follow: ManyToManyField and reverse ForeignKey. Instead of a JOIN, it runs one extra query per relation and matches the results in Python. For example, Photo has a ForeignKey to Product, so Product has a reverse ForeignKey relation to Photo, available through related_name='photos'.

For more details, check the Django documentation on select_related and prefetch_related.
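
To make the difference in query counts concrete, here is a minimal sketch that uses Python's built-in sqlite3 module as a stand-in for the ORM. It is not Django code: the table layout, the helper names (count_queries, naive, joined, prefetched) and the sample data are assumptions made for illustration, and the SQL only approximates what Django would generate.

```python
import sqlite3

# Hypothetical stand-in tables for the Category, Product and Photo models.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE category (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE product (id INTEGER PRIMARY KEY, name TEXT,
                          category_id INTEGER REFERENCES category(id));
    CREATE TABLE photo (id INTEGER PRIMARY KEY, image TEXT,
                        product_id INTEGER REFERENCES product(id));
""")
conn.executemany("INSERT INTO category VALUES (?, ?)", [(1, "Books"), (2, "Toys")])
conn.executemany("INSERT INTO product VALUES (?, ?, ?)",
                 [(i, f"Product {i}", 1 + i % 2) for i in range(1, 101)])
conn.executemany("INSERT INTO photo VALUES (?, ?, ?)",
                 [(i, f"product/{i}.jpg", i) for i in range(1, 101)])
conn.commit()

statements = []
conn.set_trace_callback(statements.append)  # record every SQL statement that runs


def count_queries(fn):
    """Run fn() and return how many SQL statements it executed."""
    statements.clear()
    fn()
    return len(statements)


def naive():
    # Like looping over Product.objects.all() and touching product.category.
    products = conn.execute("SELECT id, name, category_id FROM product").fetchall()
    for _id, _name, category_id in products:
        conn.execute("SELECT name FROM category WHERE id = ?", (category_id,)).fetchone()


def joined():
    # Roughly what select_related('category') does: a single JOIN.
    conn.execute("SELECT p.name, c.name FROM product p "
                 "LEFT JOIN category c ON c.id = p.category_id").fetchall()


def prefetched():
    # Roughly what prefetch_related('photos') does: one extra IN query,
    # then the rows are matched to their products in Python.
    products = conn.execute("SELECT id, name FROM product").fetchall()
    ids = [row[0] for row in products]
    marks = ",".join("?" * len(ids))
    photos = conn.execute(
        f"SELECT product_id, image FROM photo WHERE product_id IN ({marks})", ids
    ).fetchall()
    by_product = {}
    for product_id, image in photos:
        by_product.setdefault(product_id, []).append(image)
    return by_product


print(count_queries(naive), count_queries(joined), count_queries(prefetched))
```

With the 100 sample products, the naive loop runs 101 statements, the JOIN runs 1 and the prefetch version runs 2, which is the same pattern you would expect to see in Django's query log.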

#2 Bulk_create

bulk_create is another common technique. Imagine you have a list of Product instances that are not in the database yet, so you have to save them. You could loop over the products and call save() on each one, but that inserts the products one at a time, with one query per product.

Instead, Django gives you bulk_create. You pass it the list of products that have not been saved yet, and Django inserts them in a single query, or in a few queries if the list is split into batches. The default batch size depends on the database (SQLite, for example, limits how many variables a single query may contain), and you can set it yourself with the batch_size parameter. In my opinion, you can create up to 300 records at once without any problem.

# Ran test create_product_v1 in 7.621 seconds
# Ran test create_product_v2 in 0.981 seconds

# Bad
def create_product_v1(products):
    for product in products:
        product.save()

# Good
def create_product_v2(products):
    Product.objects.bulk_create(products)

The exact numbers will differ from machine to machine, but the gap between the two versions stays large.
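
The batching idea behind the batch_size argument of bulk_create can be sketched outside Django. The example below again uses sqlite3 as a stand-in; the helpers batches and bulk_insert, the table layout and the batch size of 300 are assumptions for illustration, not Django's actual implementation.

```python
import sqlite3
from itertools import islice


def batches(items, batch_size):
    """Yield successive lists of at most batch_size items."""
    it = iter(items)
    while True:
        chunk = list(islice(it, batch_size))
        if not chunk:
            return
        yield chunk


def bulk_insert(conn, names, batch_size=300):
    """Insert names with one multi-row INSERT per batch; return the statement count."""
    executed = 0
    for chunk in batches(names, batch_size):
        placeholders = ", ".join(["(?)"] * len(chunk))
        conn.execute(f"INSERT INTO product (name) VALUES {placeholders}", chunk)
        executed += 1
    return executed


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE product (id INTEGER PRIMARY KEY, name TEXT)")

names = [f"Product {i}" for i in range(1000)]
statement_count = bulk_insert(conn, names)
row_count = conn.execute("SELECT COUNT(*) FROM product").fetchone()[0]
print(statement_count, row_count)
```

Here 1000 rows go in with 4 INSERT statements instead of 1000. In Django the equivalent knob would be Product.objects.bulk_create(products, batch_size=300).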

#3 Using _id in Foreign Key fields

Imagine you have a list of products and you want to quickly find out whether each product has a category. The category field is nullable, so it can be None. You only want to know whether a category exists; you are not going to use the category itself yet. If you check product.category, Django queries the category table and fetches the whole object even though you don't need it at the moment, which is really inefficient. With hundreds of products, that makes hundreds of category queries.

Normally in Django, a field is stored in a column with the same name: in the Product model I define name = models.CharField(...), and that field definition creates a name column in the database table. A ForeignKey field is different: my field is called category, but the column in the product table is category_id. The category attribute is more like a virtual field that loads the related object on demand. This is why instances automatically get an _id attribute for each foreign key, which you can use to quickly check whether an instance has a category or not.

How foreign keys are stored in the database

[Image: how a ForeignKey field is represented in the database]

Check whether a product has a category:

def has_category(product):
    return bool(product.category_id)
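
Why reading category_id is free while reading category is not can be shown with a small pure-Python model. This is a simplified stand-in, not Django's real descriptor: LazyForeignKey, load_category and this toy Product class are hypothetical names, and the "database" is just a list that records each lookup.

```python
class LazyForeignKey:
    """Simplified stand-in for an attribute that loads its related object lazily."""

    def __init__(self, loader):
        self.loader = loader  # hypothetical function: primary key -> related object

    def __set_name__(self, owner, name):
        self.id_attr = name + "_id"              # e.g. "category_id", the stored column
        self.cache_attr = "_" + name + "_cache"  # where the loaded object is kept

    def __get__(self, instance, owner=None):
        if instance is None:
            return self
        related_id = getattr(instance, self.id_attr)
        if related_id is None:
            return None
        if self.cache_attr not in instance.__dict__:
            # This is the step that would cost a database query in Django.
            instance.__dict__[self.cache_attr] = self.loader(related_id)
        return instance.__dict__[self.cache_attr]


queries = []  # records every simulated database lookup


def load_category(pk):
    queries.append(pk)
    return {"id": pk, "name": f"Category {pk}"}


class Product:
    category = LazyForeignKey(load_category)

    def __init__(self, name, category_id=None):
        self.name = name
        self.category_id = category_id


products = [Product("Pen", 1), Product("Box", None), Product("Cup", 2)]

# Checking the stored id triggers no lookups at all.
with_category = [p.name for p in products if p.category_id is not None]
lookups_after_id_check = len(queries)

# Touching the related object triggers one lookup per product that has one.
category_names = [p.category["name"] for p in products if p.category is not None]
lookups_after_access = len(queries)

print(lookups_after_id_check, lookups_after_access)
```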

#4 Save and Update

Normally, when you call product.save(), Django takes all the columns of the product table and updates every one of them. If the model has a lot of fields, it does that every single time, which can be quite inefficient. For this, save() accepts update_fields: you list the fields you changed and Django updates only those. Here is an example:

# Bad
def set_new_category_v1(product, category):
    product.category = category
    product.save()

# Good
def set_new_category_v2(product, category):
    product.category = category
    product.save(update_fields=['category'])

# Good
def set_new_category_v3(product, category):
    # Call QuerySet.update() to avoid triggering
    # save signals. It only runs one SQL UPDATE.
    Product.objects.filter(pk=product.pk).update(category=category)

set_new_category_v1: The first version calls save() with no arguments, so Django does not just save the category. It writes name and every other field back to the database as well.

set_new_category_v2: Version 2 only writes the category. Even if the model has 50 fields, only the category field is saved.

set_new_category_v3: There is a third option. The queryset has an update() method, which you call with the fields you want to change and their new values. That is more efficient than saving because it bypasses save() and all of its signals.
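
At the SQL level, the difference between the versions comes down to how many columns appear in the SET clause. The helper below is a rough sketch of that idea only; build_update and the column names are hypothetical, and the statements Django really generates will differ in detail.

```python
def build_update(table, values, pk, update_fields=None):
    """Return (sql, params) for updating one row.

    values maps column names to their current values; update_fields,
    if given, restricts the SET clause to those columns.
    """
    columns = list(values) if update_fields is None else list(update_fields)
    assignments = ", ".join(f"{col} = ?" for col in columns)
    params = [values[col] for col in columns] + [pk]
    return f"UPDATE {table} SET {assignments} WHERE id = ?", params


row = {"name": "Pen", "description": "Blue ink", "category_id": 3}

# Like product.save(): every column is written back.
full_sql, full_params = build_update("product", row, pk=7)

# Like product.save(update_fields=['category']): only one column is written.
narrow_sql, narrow_params = build_update("product", row, pk=7,
                                         update_fields=["category_id"])

print(full_sql)
print(narrow_sql)
```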

#5 Fetch only what you need

Normally we use Product.objects.all(), which gives us all the fields of each product. But often you don't need all of that data; maybe you just want to print the name of the product. You don't need to know when it was created, which category it has, or anything else. For this, Django gives you the only() method.
Example:

# Ran test print_product_name_v1 in 0.1781524542
# Ran test print_product_name_v2 in 0.1715216745

def print_product_name_v1():
    for product in Product.objects.all():
        print(product.name)


def print_product_name_v2():
    for product in Product.objects.only('name'):
        print(product.name)

Here you have two functions: print_product_name_v1 fetches all the fields of the Product model, while print_product_name_v2 fetches only the name field (plus the primary key).

But Django gives us two more options

# Ran test print_product_name_v3 in 0.07765642537
# Ran test print_product_name_v4 in 0.06642537344

def print_product_name_v3():
    for data in Product.objects.values('name'):
        # values() yields one dictionary per row
        print(data['name'])

def print_product_name_v4():
    for name in Product.objects.values_list('name', flat=True):
        # values_list() yields one tuple per row, but with
        # flat=True it yields the name strings directly
        print(name)

values: you pass in the field names you want, in this case name, and it returns one dictionary per object. The result is not a normal queryset of model instances; it is a values queryset. As you can see, it is significantly faster than the others: print_product_name_v1 and print_product_name_v2 take about 0.17, while v3 takes 0.07 and v4 takes 0.06. So there is a big difference in performance.

values_list: values_list returns a values-list queryset of tuples, where each tuple contains the values of the fields you passed in as parameters. If you only want a single field, you can pass flat=True, and it returns a flat list of values instead of a list of one-item tuples.

In fact, values and values_list work the same way at both the database level and the ORM level; they differ only in the shape of the rows they return.
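
The three result shapes can be compared side by side with a small stand-in. The sketch below uses sqlite3 and plain lists, so it only mirrors the shape of each row; in Django these calls return lazy querysets, and the table and sample data here are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE product (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO product (name) VALUES (?)", [("Pen",), ("Cup",)])

# The same single-column query backs all three shapes.
rows = conn.execute("SELECT name FROM product ORDER BY id").fetchall()

as_values = [{"name": name} for (name,) in rows]  # shape of values('name')
as_values_list = list(rows)                       # shape of values_list('name')
as_flat = [name for (name,) in rows]              # shape of values_list('name', flat=True)

print(as_values)
print(as_values_list)
print(as_flat)
```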

Iterator Method

import csv

with open('product.csv', 'w', newline='') as product_file:
    queryset = Product.objects.values_list(
        'pk', 'name', 'description', 'created_at'
    )
    csv_writer = csv.writer(product_file)
    csv_writer.writerow(['ID', 'Name', 'Description', 'Created At'])
    csv_writer.writerows(queryset.iterator())

Here is an example that uses values_list and iterator to export products. I call values_list with pk, name, description and created_at, create a CSV writer, write the header row, and then write the rows from the queryset's iterator. It fetches only those four fields, and by using the queryset's iterator it avoids loading the entire result set into memory.
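
The streaming behaviour can be approximated without Django by reading a database cursor in chunks. The sketch below is an assumption-laden stand-in: iter_rows is a hypothetical helper, sqlite3 replaces the ORM, and an in-memory buffer replaces the CSV file. Django's iterator() takes a comparable chunk_size argument.

```python
import csv
import io
import sqlite3


def iter_rows(cursor, chunk_size=2000):
    """Yield rows one at a time while fetching them from the cursor in chunks."""
    while True:
        chunk = cursor.fetchmany(chunk_size)
        if not chunk:
            return
        yield from chunk


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE product (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO product (name) VALUES (?)",
                 [(f"Product {i}",) for i in range(1, 5001)])

buffer = io.StringIO()  # stands in for the CSV file on disk
writer = csv.writer(buffer)
writer.writerow(["ID", "Name"])

cursor = conn.execute("SELECT id, name FROM product ORDER BY id")
# Only one chunk of rows is held in Python memory at a time.
writer.writerows(iter_rows(cursor, chunk_size=1000))

lines = buffer.getvalue().splitlines()
print(len(lines), lines[0], lines[-1])
```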

Sajal Mia