Django Save vs. Create
Some benefits of deferred persistence
Save
When writing some code using Django's ORM, I got asked if I had a personal vendetta against objects.create() and kept using .save() instead. After a quick laugh, I thought about some reasons why I do this and wanted to see where people stood on this. Here's been my experience and my pros/cons:
Create
def create_post(title, description, comments):
post = Post.objects.create(
title=title,
description=description
)
for comment in comments:
Comment.objects.create(post=post, **comment)
return post
So this is a very simple example I see in code review often. At first glance to an untrained eye, this is fine. Tests
are passing, life goes on. But what actually happens during create()
?
So, I dive into the django source to get the scoop, in this case by breakpoint() and stepping inside the function from a test. And I see the following:
def create(self, **kwargs):
"""
Create a new object with the given kwargs,
save it to the database,
and return the created object.
"""
obj = self.model(**kwargs)
self._for_write = True
obj.save(force_insert=True, using=self.db)
return obj
Wait a minute... is this just .save()? But instead of deferred by choice, it eagerly persists and creates? Why wouldn't we just use .save to begin with?
It was at this point my breakpoint entered the for loop, and I was bewildered as the debugger kept breaking for each loop. Cue a lightbulb moment that we were in fact iterating and calling a sql query per iteration. So, that meant n back and forths between the database and our web server process.
So heres how it looks with save:
Save
def create_post(title, description, comments):
post = Post.objects.create(title=title, description=description)
comms_to_save = []
for comment in comments:
comm = Comment(post=post, **comment)
comm.full_clean()
comms_to_save.append(comm)
Comment.objects.bulk_update(comms_to_save)
return post
So by using the object creation process, we can defer insertion till the end once we've cleaned the data, and in one bulk operation. Depending on how large comments is, you should check out the bulk_update args and kwargs a bit closer.
So in this case, using .save and object creation first actually future proofs the code a bit in case we add some looping later. Furthermore, objects.create just does our save anyways, except eagerly.
I'd also recommend heartily covering your tests, and sprinkling in some self.assertNumQueries over the database calls. Your future self will thank you when the inevitable performance regression shows up.