February 28, 2024 in Django, Performance, Tutorial by Rakan Farhouda | 3 minutes
Learn how small Django ORM tweaks cut API response times from 4 seconds to 330ms.
As your data grows, so do the queries under the hood. In this case study, I’ll walk through how we identified an N+1 query problem in a Django ViewSet and fixed it in two lines. By the end, you’ll know exactly how to apply select_related and prefetch_related for a speedup of more than 10×.
We had a Django app with roughly 5 000 courses, each linked to an instructor who carries a set of expertise tags.
Listing courses via a simple ViewSet was taking 4 seconds to respond. To uncover why, let’s examine the data flow:
In models.py, our tables looked like this:
from django.db import models

class Course(models.Model):
    name = models.CharField(max_length=255)
    subject = models.CharField(max_length=100)
    level = models.CharField(max_length=50)
    # Each course belongs to exactly one instructor
    instructor = models.ForeignKey('Instructor', on_delete=models.CASCADE)

class Instructor(models.Model):
    name = models.CharField(max_length=100)
    # Each instructor can have many tags, and each tag many instructors
    areas_of_expertise = models.ManyToManyField('Tag')

class Tag(models.Model):
    name = models.CharField(max_length=50)
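Each course points to a single instructor (ForeignKey), and each instructor has many tags (ManyToManyField).
When we fetched courses, Django loaded the course list in 1 query, then each course’s instructor in its own query (about 5 000 in total), then each instructor’s tags in further queries.
That pattern (1 + 5 000 + many) made the endpoint painfully slow.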
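To make the 1 + N pattern concrete, here is a minimal sketch that reproduces it outside Django, using Python's built-in sqlite3 module. The table names, the 50 courses, and the run() helper are assumptions made up for illustration; they are not the app's real schema or how the ORM is implemented, only the same access pattern in miniature.

```python
import sqlite3

# Hypothetical miniature of the schema: 5 instructors, 50 courses.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE instructor (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE course (
        id INTEGER PRIMARY KEY,
        name TEXT,
        instructor_id INTEGER REFERENCES instructor(id)
    );
""")
conn.executemany(
    "INSERT INTO instructor VALUES (?, ?)",
    [(i, f"Instructor {i}") for i in range(1, 6)],
)
conn.executemany(
    "INSERT INTO course VALUES (?, ?, ?)",
    [(i, f"Course {i}", (i % 5) + 1) for i in range(1, 51)],
)
conn.commit()

query_log = []  # every statement issued through run() is recorded here


def run(sql, params=()):
    """Execute one statement, record it, and return all rows."""
    query_log.append(sql)
    return conn.execute(sql, params).fetchall()


def list_courses_lazily():
    """Mimic lazy loading: one query for the courses, then one per course."""
    query_log.clear()
    payload = []
    courses = run("SELECT id, name, instructor_id FROM course ORDER BY id")
    for _course_id, course_name, instructor_id in courses:
        # This lookup runs once per course: the "N" in N+1.
        rows = run("SELECT name FROM instructor WHERE id = ?", (instructor_id,))
        payload.append({"course": course_name, "instructor": rows[0][0]})
    return payload, len(query_log)


payload, query_count = list_courses_lazily()
print(query_count)  # 1 course query + 50 instructor lookups
```

Even with only 5 distinct instructors, the loop still issues one lookup per course, which is why the query count tracks the number of courses rather than the number of instructors.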
To send this information as JSON, we nested serializers in serializers.py:
class TagSerializer(DynamicFieldsSerializer):
    class Meta:
        model = Tag
        fields = ["name"]

class InstructorSerializer(DynamicFieldsSerializer):
    # Uses TagSerializer to include tag data
    areas_of_expertise = TagSerializer(many=True, read_only=True)

    class Meta:
        model = Instructor
        fields = [
            "id",
            "name",
            "areas_of_expertise",
        ]

class CourseSerializer(DynamicFieldsSerializer):
    instructor = InstructorSerializer(read_only=True)

    class Meta:
        model = Course
        fields = "__all__"
TagSerializer returns each tag’s name.
InstructorSerializer includes a list of tags (areas_of_expertise).
CourseSerializer nests InstructorSerializer so the API response for a course has instructor details and their tags.
Our first views.py looked like this:
class CourseViewSet(DynamicFieldsModelViewSet):
    queryset = Course.objects.all()
    serializer_class = CourseSerializer
    pagination_class = None
Because we used Course.objects.all(), Django fetched courses first and deferred related lookups. When serializing each course, Django ran a separate query for its instructor and then extra queries for tags. With 5 000 courses, that meant ~5 000 instructor queries plus additional tag queries, hence the ~4 000 ms response time.
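One way to confirm a diagnosis like this is to look at the SQL Django actually ran. With DEBUG = True, django.db.connection.queries holds a list of dicts with "sql" and "time" keys. The helper below is a rough sketch, not something from our codebase: find_repeated_queries, its threshold parameter, and the sample log are all hypothetical, and the log is hand-written so the snippet runs without a Django project.

```python
import re
from collections import Counter


def find_repeated_queries(queries, threshold=10):
    """Return (shape, count) pairs for SQL shapes seen at least `threshold` times.

    `queries` is expected to have the shape of django.db.connection.queries:
    a list of dicts with "sql" and "time" keys.
    """
    shapes = Counter()
    for entry in queries:
        # Replace quoted strings and bare integers with "?" so statements
        # that differ only by a literal value collapse into one shape.
        shape = re.sub(r"'[^']*'|\b\d+\b", "?", entry["sql"])
        shapes[shape] += 1
    return [(shape, count) for shape, count in shapes.most_common() if count >= threshold]


# Hypothetical log: one list query followed by one instructor lookup per course.
sample_log = [{"sql": 'SELECT * FROM "course"', "time": "0.004"}]
sample_log += [
    {"sql": f'SELECT * FROM "instructor" WHERE "instructor"."id" = {i}', "time": "0.001"}
    for i in range(1, 51)
]

repeated = find_repeated_queries(sample_log)
for shape, count in repeated:
    print(count, shape)
```

In a real project you would pass connection.queries itself after exercising the endpoint. A single shape repeated once per row is the usual signature of an N+1 problem.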
Instead of loading related objects one-by-one, we can tell Django to grab them in bulk:
class CourseViewSet(DynamicFieldsModelViewSet):
    queryset = (
        Course.objects
        .select_related("instructor")
        .prefetch_related("instructor__areas_of_expertise")
    )
    serializer_class = CourseSerializer
    pagination_class = None
select_related("instructor") performs a SQL JOIN to fetch each course’s instructor in the same query.
prefetch_related("instructor__areas_of_expertise") does one extra query to load all tags for every instructor, then stitches them into each instructor object in Python.
With these two methods, our /courses/ endpoint dropped from ~4 000 ms to ~330 ms, an improvement of more than 10×.
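For intuition, the two-query strategy can be sketched in plain SQL with Python's built-in sqlite3 module. This is an approximation of what the ORM does, not its actual implementation: the table names, the sample rows, and the run() helper are invented for illustration.

```python
import sqlite3
from collections import defaultdict

# Hypothetical miniature schema with a many-to-many join table.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tag (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE instructor (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE instructor_tag (instructor_id INTEGER, tag_id INTEGER);
    CREATE TABLE course (id INTEGER PRIMARY KEY, name TEXT, instructor_id INTEGER);
    INSERT INTO tag VALUES (1, 'python'), (2, 'sql');
    INSERT INTO instructor VALUES (1, 'Ada'), (2, 'Linus');
    INSERT INTO instructor_tag VALUES (1, 1), (1, 2), (2, 2);
    INSERT INTO course VALUES (1, 'ORM basics', 1), (2, 'Indexes', 2), (3, 'Joins', 1);
""")

query_count = 0


def run(sql, params=()):
    """Execute one statement, count it, and return all rows."""
    global query_count
    query_count += 1
    return conn.execute(sql, params).fetchall()


# Query 1 (the select_related idea): courses and instructors in one JOIN.
rows = run("""
    SELECT c.name, i.id, i.name
    FROM course AS c
    JOIN instructor AS i ON i.id = c.instructor_id
    ORDER BY c.id
""")

# Query 2 (the prefetch_related idea): all tags for all instructors at once.
instructor_ids = sorted({instructor_id for _, instructor_id, _ in rows})
placeholders = ", ".join("?" for _ in instructor_ids)
tag_rows = run(f"""
    SELECT it.instructor_id, t.name
    FROM instructor_tag AS it
    JOIN tag AS t ON t.id = it.tag_id
    WHERE it.instructor_id IN ({placeholders})
    ORDER BY t.id
""", instructor_ids)

# Stitch the tags onto their instructors in Python rather than in SQL.
tags_by_instructor = defaultdict(list)
for instructor_id, tag_name in tag_rows:
    tags_by_instructor[instructor_id].append(tag_name)

courses = [
    {
        "name": course_name,
        "instructor": {
            "name": instructor_name,
            "areas_of_expertise": tags_by_instructor[instructor_id],
        },
    }
    for course_name, instructor_id, instructor_name in rows
]

print(query_count)  # 2, no matter how many courses there are
```

The number of queries stays at two whether the table holds 3 courses or 5 000; only the size of each result set grows.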
Inspect the queries a request runs before optimizing: django.db.connection.queries helps here (it is only populated when DEBUG is True).
select_related for ForeignKey or OneToOneField: pulls related data via SQL JOIN.
prefetch_related for ManyToMany or reverse relations: does a separate query and merges results in Python.
By understanding how Django constructs query sets and using these two ORM calls, you can keep your API endpoints lightning-fast even as your database scales. 🚀