Using nondeterministic collations and LIKE in Django

Published 15 August, 2024.

In Django and PostgreSQL, the recommended way to create case-insensitive text field is to use a case-insensitive collation. This sounds tricky, though it’s really quite simple.

First, create a collation inside a migration:

"""Create a case-insensitive collation."""

from django.contrib.postgres.operations import CreateCollation
from django.db import migrations


class Migration(migrations.Migration):
    """Create a case-insensitive collation."""

    operations = (
        CreateCollation(
            "case_insensitive",
            provider="icu",
            locale="und-u-ks-level2",
            deterministic=False,
        ),
    )

Then you’re free to use that collation whenever you create a text field on a model:

class Dog(models.Model):
    name = models.TextField(
        db_collation="case_insensitive",
        help_text="Who's a good boy?",
    )

See, easy!

However, this collation will cause problems if you ever want to use __contains to filter a queryset:

>>> Dog.objects.filter(name__contains="Craig")

NotSupportedError: nondeterministic collations are not supported for LIKE

You can work around this by annotating your queryset with a deterministic, case-sensitive version of the field, and swapping __contains for __icontains:

>>> from django.db.models.functions import Collate
>>> Dog.objects.annotate(
        deterministic_name=Collate("name", "C")
    ).filter(
        deterministic_name__icontains="Craig"
    )

<QuerySet []>

Note that I’m using a collation named “C” here. I must admit that I’m no expert in collations, but “C” seems to be widely considered a default collation.

Also note that the queryset is empty, because Craig would be a weird name for a dog.