February 28, 2024 in Django, Performance, Tutorial by Rakan Farhouda · 3 minutes
Learn how small Django ORM tweaks cut API response times from 4 seconds to 330ms.
As your data grows, so do the queries under the hood. In this case study, I’ll walk through how we identified an N+1 query problem in a Django ViewSet and fixed it in two lines. By the end, you’ll know exactly how to apply select_related and prefetch_related for a 10× speedup.
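We had a Django app with roughly 5 000 courses, each assigned to an instructor who has a set of expertise tags.
Listing courses via a simple ViewSet was taking 4 seconds to respond. To uncover why, let’s follow the data flow through the models, the serializers, and the view.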
In models.py, our tables looked like this:
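
    from django.db import models


    class Course(models.Model):
        name = models.CharField(max_length=255)
        subject = models.CharField(max_length=100)
        level = models.CharField(max_length=50)
        instructor = models.ForeignKey('Instructor', on_delete=models.CASCADE)


    class Instructor(models.Model):
        name = models.CharField(max_length=100)
        areas_of_expertise = models.ManyToManyField('Tag')


    class Tag(models.Model):
        name = models.CharField(max_length=50)

Each Course points to one Instructor (a ForeignKey), and each Instructor has many Tags (a ManyToManyField).

When we fetched courses, Django loaded:

- all courses in 1 query;
- each course’s instructor in its own query, about 5 000 in total;
- each instructor’s tags in further queries.
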
That pattern (1 + 5 000 + many) made the endpoint painfully slow.
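To make the pattern concrete, here is a minimal sketch that replays it in raw SQL with Python's built-in sqlite3 module rather than the Django ORM. The table names, the seed data, and the `run` helper are assumptions made for illustration, and the data set is shrunk to 50 courses so the count is easy to check by hand.

```python
import sqlite3

# Hypothetical miniature of the schema above: 50 courses, 5 instructors, 3 tags.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE instructor (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE tag (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE instructor_tags (instructor_id INTEGER, tag_id INTEGER);
    CREATE TABLE course (id INTEGER PRIMARY KEY, name TEXT, instructor_id INTEGER);
""")
conn.executemany("INSERT INTO instructor VALUES (?, ?)",
                 [(i, f"Instructor {i}") for i in range(1, 6)])
conn.executemany("INSERT INTO tag VALUES (?, ?)",
                 [(1, "python"), (2, "sql"), (3, "django")])
conn.executemany("INSERT INTO instructor_tags VALUES (?, ?)",
                 [(i, t) for i in range(1, 6) for t in (i % 3 + 1, (i + 1) % 3 + 1)])
conn.executemany("INSERT INTO course VALUES (?, ?, ?)",
                 [(c, f"Course {c}", c % 5 + 1) for c in range(1, 51)])

query_count = 0


def run(sql, params=()):
    """Execute one SELECT and count it as one database round trip."""
    global query_count
    query_count += 1
    return conn.execute(sql, params).fetchall()


# 1 query for the course list ...
courses = run("SELECT id, name, instructor_id FROM course")

payload = []
for course_id, course_name, instructor_id in courses:
    # ... then 1 query per course for its instructor ...
    instructor = run("SELECT id, name FROM instructor WHERE id = ?", (instructor_id,))[0]
    # ... and 1 more query per course for that instructor's tags.
    tags = run(
        "SELECT t.name FROM tag t "
        "JOIN instructor_tags it ON it.tag_id = t.id "
        "WHERE it.instructor_id = ?",
        (instructor_id,),
    )
    payload.append({
        "id": course_id,
        "name": course_name,
        "instructor": {
            "id": instructor[0],
            "name": instructor[1],
            "areas_of_expertise": [name for (name,) in tags],
        },
    })

print(query_count)  # 1 + 50 + 50 = 101
```

The number of round trips grows linearly with the number of courses, which is why the endpoint degrades as the table grows.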
To send this information as JSON, we nested serializers in serializers.py:

    class TagSerializer(DynamicFieldsSerializer):
        class Meta:
            model = Tag
            fields = ["name"]


    class InstructorSerializer(DynamicFieldsSerializer):
        # Uses TagSerializer to include tag data
        areas_of_expertise = TagSerializer(many=True, read_only=True)

        class Meta:
            model = Instructor
            fields = [
                "id",
                "name",
                "areas_of_expertise",
            ]


    class CourseSerializer(DynamicFieldsSerializer):
        instructor = InstructorSerializer(read_only=True)

        class Meta:
            model = Course
            fields = "__all__"

- TagSerializer returns each tag’s name.
- InstructorSerializer includes a list of tag names (areas_of_expertise).
- CourseSerializer nests InstructorSerializer so the API response for a course has instructor details and their tags.

Our first views.py looked like this:

    class CourseViewSet(DynamicFieldsModelViewSet):
        queryset = Course.objects.all()
        serializer_class = CourseSerializer
        pagination_class = None

Because we used Course.objects.all(), Django fetched the courses first and loaded related objects lazily. While serializing each course, Django ran a separate query for its instructor and then more queries for that instructor’s tags. With 5 000 courses, that meant ~5 000 instructor queries plus the tag queries, hence the ~4 000 ms response time.
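One way to confirm a count like this is to measure how many entries a block of code adds to the connection's query log. The helper below is a minimal sketch, not part of the original project: `count_queries` and `FakeConnection` are hypothetical names, and the fake stands in for `django.db.connection` so the example runs without Django. It relies only on the connection exposing a `queries` list, which Django populates when DEBUG is True.

```python
from contextlib import contextmanager


@contextmanager
def count_queries(connection):
    """Yield a dict whose "count" is set, on exit, to the number of queries logged inside the block."""
    before = len(connection.queries)
    stats = {"count": None}
    try:
        yield stats
    finally:
        stats["count"] = len(connection.queries) - before


class FakeConnection:
    """Stand-in for django.db.connection so this sketch runs without Django."""

    def __init__(self):
        self.queries = []

    def execute(self, sql):
        # Django logs each query as a dict with "sql" and "time" keys.
        self.queries.append({"sql": sql, "time": "0.000"})


fake = FakeConnection()
with count_queries(fake) as stats:
    fake.execute("SELECT ... FROM course")
    for _ in range(3):
        fake.execute("SELECT ... FROM instructor WHERE id = %s")

print(stats["count"])  # 4

# In a Django shell with DEBUG = True, the same helper would be used as:
#     from django.db import connection
#     with count_queries(connection) as stats:
#         CourseSerializer(Course.objects.all(), many=True).data
```

As far as I know Django also caps the length of this log, so for very large runs the difference can undercount; testing against a smaller slice of the data avoids that.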
Instead of loading related objects one-by-one, we can tell Django to grab them in bulk:

    class CourseViewSet(DynamicFieldsModelViewSet):
        queryset = (
            Course.objects
            .select_related("instructor")
            .prefetch_related("instructor__areas_of_expertise")
        )
        serializer_class = CourseSerializer
        pagination_class = None

- select_related("instructor") performs a SQL JOIN to fetch each course’s instructor in the same query.
- prefetch_related("instructor__areas_of_expertise") does one extra query to load all tags for every instructor, then stitches them into each instructor object in Python.

With these two methods, our /courses/ endpoint dropped from ~4 000 ms to ~330 ms, more than a 10× improvement.
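The sketch below shows roughly what those two calls amount to at the SQL level, again using Python's built-in sqlite3 module instead of the ORM. The table names, seed data, and `run` helper are assumptions for illustration, and the exact SQL Django generates will differ. What matters is the shape: one JOIN for the instructor, one IN query for the tags, and a dictionary lookup to stitch the tags back on in Python.

```python
import sqlite3
from collections import defaultdict

# Hypothetical miniature of the schema: 50 courses, 5 instructors, 3 tags.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE instructor (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE tag (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE instructor_tags (instructor_id INTEGER, tag_id INTEGER);
    CREATE TABLE course (id INTEGER PRIMARY KEY, name TEXT, instructor_id INTEGER);
""")
conn.executemany("INSERT INTO instructor VALUES (?, ?)",
                 [(i, f"Instructor {i}") for i in range(1, 6)])
conn.executemany("INSERT INTO tag VALUES (?, ?)",
                 [(1, "python"), (2, "sql"), (3, "django")])
conn.executemany("INSERT INTO instructor_tags VALUES (?, ?)",
                 [(i, t) for i in range(1, 6) for t in (i % 3 + 1, (i + 1) % 3 + 1)])
conn.executemany("INSERT INTO course VALUES (?, ?, ?)",
                 [(c, f"Course {c}", c % 5 + 1) for c in range(1, 51)])

query_count = 0


def run(sql, params=()):
    """Execute one SELECT and count it as one database round trip."""
    global query_count
    query_count += 1
    return conn.execute(sql, params).fetchall()


# Query 1, in the spirit of select_related("instructor"):
# courses and their instructors arrive together through a JOIN.
rows = run("""
    SELECT c.id, c.name, i.id, i.name
    FROM course c
    JOIN instructor i ON i.id = c.instructor_id
""")

# Query 2, in the spirit of prefetch_related("instructor__areas_of_expertise"):
# one query loads the tags of every instructor seen in query 1.
instructor_ids = sorted({row[2] for row in rows})
placeholders = ",".join("?" * len(instructor_ids))
tag_rows = run(
    f"""
    SELECT it.instructor_id, t.name
    FROM tag t
    JOIN instructor_tags it ON it.tag_id = t.id
    WHERE it.instructor_id IN ({placeholders})
    """,
    instructor_ids,
)

# Stitch the tags onto their instructors in Python.
tags_by_instructor = defaultdict(list)
for instructor_id, tag_name in tag_rows:
    tags_by_instructor[instructor_id].append(tag_name)

payload = [
    {
        "id": course_id,
        "name": course_name,
        "instructor": {
            "id": instructor_id,
            "name": instructor_name,
            "areas_of_expertise": tags_by_instructor[instructor_id],
        },
    }
    for course_id, course_name, instructor_id, instructor_name in rows
]

print(query_count)  # 2
```

The query count stays at 2 no matter how many courses the table holds, which is where the speedup comes from.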
To recap:

- Count the queries behind an endpoint before optimizing it; django.db.connection.queries helps here (it is only populated when DEBUG is True).
- select_related for ForeignKey or OneToOneField: pulls related data via SQL JOIN.
- prefetch_related for ManyToMany or reverse relations: does a separate query and merges results in Python.

By understanding how Django constructs querysets and using these two ORM calls, you can keep your API endpoints lightning-fast even as your database scales. 🚀