By default, Django adds an auto-incrementing primary key to each model when we do not specify a custom primary key field:
id = models.AutoField(primary_key=True)
This is convenient most of the time: the database generates the ID and increments it automatically, so we do not need to set the value ourselves or worry about colliding IDs.
Internal vs. External Primary Key
But sometimes we would like to have another unique but not auto-incrementing value as the identifier, serving as a “pseudo primary key” and hiding the actual ID in the database. This is favorable especially in SaaS applications:
- Using a non-auto-incrementing ID can disguise the actual total count of records in the table, such as the number of users, products, etc.
- The custom primary key can have any format or data type, not just plain positive integers. It can be a text field with any alphanumeric characters, a serial number in a generally acknowledged format, a hashed or obfuscated result from other fields, etc.
- It is also editable after the default ID is generated on creation. For instance, we can change the ID format later to support more complicated use cases.
- Customers can also customize the ID, such as changing a unique username to a more user-friendly name, without affecting the original primary key.
So we can have a public-facing primary key in any format we like while keeping the default database primary key for internal use. In this way, no database relationship is affected by the external key, and no migrations are needed when the external key changes.
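To see why relationships are unaffected, here is a minimal sketch using Python's built-in sqlite3 module rather than the Django ORM. The table names (app_user, post) and the sample values are made up for illustration. The foreign key points at the internal id, so renaming the public username leaves the related row untouched.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE app_user (
        id INTEGER PRIMARY KEY AUTOINCREMENT,  -- internal key
        username TEXT NOT NULL UNIQUE          -- public-facing key
    );
    CREATE TABLE post (
        id INTEGER PRIMARY KEY AUTOINCREMENT,
        user_id INTEGER NOT NULL REFERENCES app_user(id),
        title TEXT NOT NULL
    );
""")

# The related row stores the internal id, never the username.
user_id = conn.execute(
    "INSERT INTO app_user (username) VALUES (?)", ("sherlock",)
).lastrowid
conn.execute(
    "INSERT INTO post (user_id, title) VALUES (?, ?)",
    (user_id, "A Study in Scarlet"),
)

# Rename the public key; the post table is not touched at all.
conn.execute("UPDATE app_user SET username = ? WHERE id = ?", ("holmes", user_id))

rows = conn.execute(
    "SELECT u.username, p.title FROM post p JOIN app_user u ON u.id = p.user_id"
).fetchall()
print(rows)  # [('holmes', 'A Study in Scarlet')]
```

The join still resolves after the rename because only the app_user row changed.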
Implementation with Django model
A Django model with the custom primary key will look like this:
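import uuid

from django.db import models


class User(models.Model):
    id = models.UUIDField(
        primary_key=True,
        default=uuid.uuid4,
    )
    username = models.TextField(
        unique=True,
    )

    def save(self, *args, **kwargs):
        # First save: insert the row so the primary key is assigned.
        super().save(*args, **kwargs)
        if not self.username:
            # make_default_username is a helper defined elsewhere.
            self.username = make_default_username(self.pk)
            # Second save: update only the username column.
            super().save(update_fields=['username'])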
The id field is the default primary key of the table (by setting primary_key=True). This value is auto-generated and does not change once the record is created. You may omit this field from the model definition, and Django will fall back to the default AutoField as the primary key.
The username field is our custom “pseudo primary key”. It is just a free-text field, so any username format is allowed. The only constraint is unique=True, and that is all we need: a unique identifier for looking up a record. We can change it at any time without affecting related models.
The final piece of the puzzle is the save method. We customize it so that a default username is assigned as a placeholder when the record is created, if the user has not provided a custom name beforehand.
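The model calls make_default_username but its body is not shown here. A minimal sketch, assuming the placeholder is simply derived from the primary key (the "user-" prefix is a made-up convention, not part of the original code):

```python
import uuid


def make_default_username(pk) -> str:
    """Hypothetical helper: build a placeholder username from the primary key.

    Works for integer keys (AutoField) and UUID keys alike, because the key
    is only formatted into a string.
    """
    return f"user-{pk}"


print(make_default_username(1))             # user-1
print(make_default_username(uuid.uuid4()))  # "user-" followed by a random UUID
```

Since the primary key is unique, the placeholder is unique too, unless another user has already picked a name of the same shape. A real implementation would need to handle that collision.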
To break down the save method:
- It first saves the instance normally and lets the database fill in the primary key id for us. This step is necessary if we are using AutoField: Django does not know the current value of the auto-incrementing counter, so it has to run the INSERT INTO statement and read back the actual id returned by the database. (Note that having N records in the table does not guarantee that the next valid ID is N+1. Inserting a guessed ID that collides with an existing one results in an integrity error.)
- Next, it checks whether our pseudo primary key username is set, i.e. whether the user provided a custom name on creation. If not, it generates a default one with make_default_username. This makes sure we do not leave the user record with a blank username. Since username has to be unique, a leftover blank value would prevent us from creating another user record without a username.
- Lastly, it saves the model instance again with the new username, updating only the username field. Calling self.save() here would re-enter our custom logic, and would recurse without end if the username were still empty. Calling super().save(*args, **kwargs) again with the original arguments is not safe either: User.objects.create() passes force_insert=True, so a second call would attempt another INSERT with the same primary key id and raise IntegrityError. Passing update_fields forces an UPDATE of just that column.
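The note in the first step, that N records do not imply a next ID of N+1, can be checked with a small sketch. It uses Python's built-in sqlite3 module and SQLite's AUTOINCREMENT, with a made-up app_user table. Other databases use sequences or similar mechanisms, and the details differ.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE app_user (id INTEGER PRIMARY KEY AUTOINCREMENT, username TEXT)"
)

# Insert three rows; the database assigns the ids, and we read them back.
ids = [
    conn.execute("INSERT INTO app_user (username) VALUES (?)", (name,)).lastrowid
    for name in ("a", "b", "c")
]
print(ids)  # [1, 2, 3]

# Delete the newest row, then count what is left.
conn.execute("DELETE FROM app_user WHERE id = ?", (ids[-1],))
row_count = conn.execute("SELECT COUNT(*) FROM app_user").fetchone()[0]

# The next id continues from the counter, not from the row count.
next_id = conn.execute(
    "INSERT INTO app_user (username) VALUES (?)", ("d",)
).lastrowid
print(row_count, next_id)  # 2 4
```

With two rows in the table, a guess of 3 for the next ID would be wrong here. The only reliable way to learn the ID is to insert the row and read the value back.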
That’s it! Now we can enjoy the benefit of using this custom primary key:
# assume id is the default AutoField instead of UUIDField

# create user without an initial username
user = User.objects.create()
print(user.id)        # 1
print(user.username)  # a default username generated

# create user with an initial username
user = User.objects.create(username='sherlock')
print(user.id)        # 2
print(user.username)  # 'sherlock'

# we can expose the username to the public and hide the real id
# e.g. the view function of GET /user/:username
def user_detail_view(request, username=None):
    user = User.objects.get(username=username)
    # ...
Migration from existing models
So what if we need to add this “pseudo primary key” to an existing model that already has data, or the key is currently just a model @property that we want to turn into a database field? Let’s consider a Product model:
class Product(models.Model):
    # id is the Django default AutoField
    # on_delete is required since Django 2.0; CASCADE is an assumption here
    category = models.ForeignKey(Category, on_delete=models.CASCADE)
    name = models.TextField()
    launch_date = models.DateTimeField()

    @property
    def code(self):
        return f'{self.category.name}-{self.name}'
- Add a new field: _code = models.TextField(blank=True)
- Make the .code property “multiplex”: return self._code or f'{self.category.name}-{self.name}'. This way, all code that uses product.code is unaffected.
- Customize .save() to set _code as described above. Now new products will have _code filled in, and the .code property handles the lookup for us.
- Write a migration script to back-fill _code for existing products.
- Set _code to blank=False after the migration, since all products have _code now.
- Rename _code to code and remove the .code property.
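The fallback property and the back-fill step can be sketched without the ORM. The example below uses a plain dataclass as a stand-in for the model, with category_name in place of the category relation. The backfill function is a hypothetical stand-in for the migration script, which in a real project would loop over the queryset and save each row.

```python
from dataclasses import dataclass


@dataclass
class Product:
    category_name: str
    name: str
    _code: str = ""  # new field, blank for rows created before the change

    @property
    def code(self) -> str:
        # Prefer the stored value; fall back to the old computed code.
        return self._code or f"{self.category_name}-{self.name}"


def backfill(products) -> int:
    """Fill in _code for products that lack it; return how many were updated."""
    updated = 0
    for product in products:
        if not product._code:
            product._code = product.code
            updated += 1
    return updated


old = Product("toys", "robot")              # existing row, no stored code
new = Product("toys", "car", _code="CAR-1")  # new row, code already stored
print(old.code, new.code)    # toys-robot CAR-1
print(backfill([old, new]))  # 1
print(old._code)             # toys-robot
```

Before and after the back-fill, product.code returns the same value for the old row, which is why callers do not notice the change.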
And we are done! All code that uses product.code is unaffected, and we can now filter products by code 🎉