From: | Will Mortensen <will(at)extrahop(dot)com> |
---|---|
To: | pgsql-docs(at)lists(dot)postgresql(dot)org, Jeremy Schneider <schneider(at)ardentperf(dot)com> |
Cc: | Daniel Verite <daniel(at)manitou-mail(dot)org>, Jeff Davis <pgsql(at)j-davis(dot)com> |
Subject: | Documenting more pitfalls of non-default collations? |
Date: | 2024-06-11 06:55:39 |
Message-ID: | CAMpnoC6sJ9=F6OiXJ4kKyrHUzwbWXjUKC3t-rL4FrDyhyiYf5A@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-docs |
Hi there,
I mentioned to Jeremy at pgConf.dev that using non-default collations
in some SQL idioms can produce undesired results, and he asked me to
send an email. An example idiom is the way Django implements
case-insensitive comparisons using "upper(x) = upper(y)" [1][2][3] ,
which returns false if x = y but they have different collations that
produce different uppercase.
For example, assuming the default collation performs standard Unicode
case mapping:
# select upper('i') = upper('i' collate "tr-x-icu");
f
# select upper('é') = upper('é' collate "C");
f
Or with collations in DDL instead:
# create table t (
tr text collate "tr-x-icu",
c text collate "C"
);
CREATE TABLE
# insert into t values ('i', 'é');
INSERT 0 1
# select count(*) from t where upper('i') = upper(tr);
0
# select count(*) from t where upper('é') = upper(c);
0
This is expected, given a careful reading of the collation docs, but
it's not really highlighted in any of the examples--in each example
that doesn't produce an error, all of the collation-sensitive
functions/operators end up applying the same collation. Maybe there
should be an example that applies different collations in different
subexpressions, and/or a warning against constructions like "upper(x)
= upper(y)"?
[1] https://github.com/django/django/blame/stable/5.1.x/django/db/backends/postgresql/operations.py#L175
[2] https://github.com/django/django/blame/stable/5.1.x/django/db/backends/postgresql/base.py#L155
[3] https://code.djangoproject.com/ticket/32485
From | Date | Subject | |
---|---|---|---|
Next Message | Amit Kapila | 2024-06-11 08:41:06 | Re: Ambiguous description on new columns |
Previous Message | Erwin Brandstetter | 2024-06-10 13:35:03 | Re: Clarify the ordering guarantees in combining queries (or lack thereof) |