Custom Django model field based on default Primary Key

Marco
4 min read · Aug 28, 2020

Königssee, Bavaria, Germany — Dec 2019

By default, Django adds an auto-incrementing primary key to each model when we do not specify a custom primary key field.

id = models.AutoField(primary_key=True)

This is convenient most of the time: the database generates the ID and increments it automatically, so we do not need to set the value ourselves or worry about colliding IDs.

Internal vs. External Primary Key

But sometimes we would like another value that is unique but not auto-incrementing to act as the identifier, serving as a “pseudo primary key” and hiding the actual ID in the database. This is especially useful in SaaS applications:

  • Using a non-auto-incrementing ID can disguise the actual total count of records in the table, such as the number of users, products, etc.
  • The custom primary key can have any format or data type, not just plain positive integers. It can be a text field with any alphanumeric characters, a serial number in a generally acknowledged format, a hashed or obfuscated result from other fields, etc.
  • It is also editable after the default ID is generated on creation. For instance, we can change the ID format later to support more complicated use cases.
  • Customers can also customize the ID, such as changing the unique username to a more user-friendly name, without affecting the original primary key.

So we can have a public-facing primary key in any format we like while keeping the default database primary key for internal use. In this way, database relationships are not affected by the external key, and no migrations are needed when the external key changes.
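To make the "hashed or obfuscated" option concrete, here is a minimal sketch that derives a public identifier from the internal integer ID with a keyed hash. The function name `public_id_for`, the key, and the 10-character length are assumptions for illustration, not part of the original design.

```python
import hashlib

# Hypothetical secret; in a real project this would come from settings, not source code.
SECRET_KEY = b"change-me"


def public_id_for(internal_id, key=SECRET_KEY):
    """Derive a short, non-sequential identifier from an internal integer ID."""
    digest = hashlib.blake2b(
        str(internal_id).encode("ascii"),
        key=key,
        digest_size=5,  # 5 bytes -> 10 hexadecimal characters
    )
    return digest.hexdigest()


if __name__ == "__main__":
    for internal_id in (1, 2, 3):
        # Consecutive internal IDs map to unrelated-looking public IDs.
        print(internal_id, public_id_for(internal_id))
```

A truncated hash can collide, so the column that stores such a value would still need a unique constraint.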

Implementation with Django model

A Django model with the custom primary key will look like this:

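import uuid

from django.db import models


class User(models.Model):
    # Internal primary key: generated once, never exposed or changed.
    id = models.UUIDField(
        primary_key=True,
        default=uuid.uuid4,
    )
    # External "pseudo primary key": unique, but free to change.
    username = models.TextField(
        unique=True,
    )

    def save(self, *args, **kwargs):
        # First save, so the primary key is guaranteed to be set.
        super().save(*args, **kwargs)
        if not self.username:
            # make_default_username is a helper we define elsewhere.
            self.username = make_default_username(self.pk)
            # Update only the username column of the row we just wrote.
            super().save(update_fields=['username'])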

The id field is the primary key of the table (set with primary_key=True). Its value is generated automatically and does not change once the record is created. You may omit this field from the model definition, in which case Django falls back to the default AutoField as the primary key.

The username field is our custom “pseudo primary key”. It is a free-text field, so any username format is allowed. The only constraint is unique=True, which is all we need for a unique identifier to look up a record. We can change it at any time without affecting any related model.

The final piece of the puzzle is the save method. We customize it so that a default username is assigned as a placeholder when the record is created, if the user has not provided a customized name beforehand.
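The model calls make_default_username, which this article does not define. One possible sketch, assuming the placeholder only needs to be unique and recognizable, derives it from the primary key. The `user-` prefix and the UUID handling are assumptions.

```python
import uuid


def make_default_username(pk):
    """Build a placeholder username from a primary key (hypothetical helper)."""
    if isinstance(pk, uuid.UUID):
        # UUID keys: use the 32-character hex form so the name has no dashes.
        return f"user-{pk.hex}"
    # Integer keys (AutoField): the key is already unique, so embed it directly.
    return f"user-{pk}"


if __name__ == "__main__":
    print(make_default_username(42))  # user-42
    print(make_default_username(uuid.uuid4()))
```

Because the primary key is unique, so is the placeholder, unless someone has already claimed the same name by hand (for example by choosing `user-42`). In that case the second save would raise IntegrityError, which this sketch does not handle.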

To break down the save method:

  1. It first saves the instance normally and lets the database fill in the primary key id for us. This step is necessary if we are using AutoField. Django does not know the current value of the auto-incrementing counter, so it has to run the INSERT INTO statement and read back the actual id returned by the database. (With the UUIDField above, the default already supplies the value before the first save.)
    (Note that having N records in the table does not guarantee that the next valid ID is N+1. Inserting a guessed ID that collides with an existing one results in an integrity error.)
  2. Next, it checks whether our pseudo primary key username is set (that is, whether the user provided a customized name on creation). If not, it generates a default one with make_default_username.
    This makes sure we don’t leave the user record with a blank username. Because username must be unique, a record left with a blank username would block the creation of any other user who also starts without one.
  3. Lastly, it saves the model instance again with the new username, but only updates the username field.
    We cannot call self.save() here, as it would lead to an infinite loop. We should not repeat super().save(*args, **kwargs) either: when the instance is created through User.objects.create(), those arguments include force_insert=True, so the second call would try to insert another row with the same primary key id and raise IntegrityError. Passing update_fields makes Django issue an UPDATE for the username column only.
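The note in step 1 about guessing IDs can be checked outside Django. This is a small sketch using Python's built-in sqlite3 module and an in-memory table (the table name `app_user` is made up). It is not what Django executes verbatim, but it shows why the ID has to be read back after the INSERT rather than guessed from the row count.

```python
import sqlite3


def next_id_after_delete():
    """Return (row_count, new_id) after deleting the newest row and inserting again."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE app_user ("
        "id INTEGER PRIMARY KEY AUTOINCREMENT, "
        "username TEXT NOT NULL)"
    )
    # Insert three rows; the database assigns ids 1, 2 and 3.
    for name in ("a", "b", "c"):
        conn.execute("INSERT INTO app_user (username) VALUES (?)", (name,))
    # Remove the newest row, leaving two rows behind.
    conn.execute("DELETE FROM app_user WHERE id = 3")
    row_count = conn.execute("SELECT COUNT(*) FROM app_user").fetchone()[0]
    # The new id is only known once the INSERT has run.
    cursor = conn.execute("INSERT INTO app_user (username) VALUES (?)", ("d",))
    new_id = cursor.lastrowid
    conn.close()
    return row_count, new_id


if __name__ == "__main__":
    count, new_id = next_id_after_delete()
    print(count, new_id)  # 2 4: two rows existed, yet the new id is 4, not 3
```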

That’s it! Now we can enjoy the benefit of using this custom primary key:

# assume id is the default AutoField instead of UUIDField

# create user without an initial username
user = User.objects.create()
print(user.id) # 1
print(user.username) # a default username generated

# create user with an initial username
user = User.objects.create(username='sherlock')
print(user.id) # 2
print(user.username) # 'sherlock'

# we can expose the username to the public and hide the real id
# e.g. the view function of GET /user/:username
def user_detail_view(request, username=None):
    user = User.objects.get(username=username)
    # ...

Migration from existing models

So what if we need to add this “pseudo primary key” to an existing model that already has data, or the key is currently just a model @property that we want to turn into a database field? Let’s consider a Product model:

class Product(models.Model):
    # id is the Django default AutoField
    # on_delete is required since Django 2.0; CASCADE is an assumption here
    category = models.ForeignKey(Category, on_delete=models.CASCADE)
    name = models.TextField()
    launch_date = models.DateTimeField()

    @property
    def code(self):
        return f'{self.category.name}-{self.name}'

We can turn code into a database field in six steps:

  1. Add a new field _code = models.TextField(blank=True).
  2. Make the .code property “multiplex” between the new field and the old formula:
    return self._code or f'{self.category.name}-{self.name}'
    This way, all code that uses product.code is unaffected.
  3. Customize .save() to set _code as described above. New products now have _code filled in, and the .code property handles the lookup for us.
  4. Write a data migration to back-fill _code for existing products.
  5. Set blank=False on _code after the migration, since all products now have a _code.
  6. Rename _code to code and remove the .code property.
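The back-fill in step 4 could be written as a Django data migration. The function below is a sketch of the callable one would pass to migrations.RunPython in a migration's operations. The app label `shop` is a placeholder. The function repeats the formula from the old .code property, because properties defined on the model class are not available on the historical models that migrations receive.

```python
def backfill_product_codes(apps, schema_editor):
    """Fill in _code for products created before the field existed."""
    # "shop" is a hypothetical app label; use the app that owns Product.
    Product = apps.get_model("shop", "Product")
    # Only touch rows that still have the blank default.
    products = Product.objects.filter(_code="").select_related("category")
    for product in products:
        # Same formula as the original .code property.
        product._code = f"{product.category.name}-{product.name}"
        product.save(update_fields=["_code"])
```

In the migration file this would be wired up as migrations.RunPython(backfill_product_codes, migrations.RunPython.noop), so the migration can also be reversed without touching the data.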

And we are done! All code that uses product.code is unaffected, and you can now filter products by code 🎉
