Re: Why do indexes and sorts use the database collation?

From: Jeff Davis <pgsql(at)j-davis(dot)com>
To: Matthias van de Meent <boekewurm+postgres(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Peter Eisentraut <peter(at)eisentraut(dot)org>, Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Why do indexes and sorts use the database collation?
Date: 2023-11-15 00:13:49
Message-ID: 486f0bb29684e31cb4496a6b9b73cc8ab69e2cc2.camel@j-davis.com
Lists: pgsql-hackers

On Wed, 2023-11-15 at 00:52 +0100, Matthias van de Meent wrote:
> That doesn't really answer the question for me. Why would you have a
> primary key that has different collation rules (which include equality
> rules)

The equality rules for all deterministic collations are the same: if
the bytes are identical, the values are considered equal; and if the
bytes are not identical, the values are considered unequal.

That's the basis for this entire thread. The "C" collation provides the
same equality semantics as every other deterministic collation, but
with better performance and lower risk (as long as you don't actually
need range scans or pathkeys from the index).

See varstr_cmp() or varstrfastcmp_locale(). Those functions first check
for identical bytes and return 0 if so. If the bytes differ, they pass
the strings to the collation provider; if the provider returns 0, a
final memcmp() breaks the tie. You can also see this in hashtext(),
which for deterministic collations just calls hash_any() on the bytes.
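
To make the comparison order above concrete, here is a minimal
standalone sketch. It is not the PostgreSQL source: deterministic_cmp()
is a hypothetical name, libc's strcoll() stands in for the collation
provider, and strcmp() on NUL-terminated strings stands in for the
final memcmp() tie-break.

```c
#include <assert.h>
#include <string.h>

/*
 * Hypothetical helper, not PostgreSQL code. Compares two NUL-terminated
 * strings the way a deterministic collation is described above:
 * identical bytes are equal, and nothing else is.
 */
static int
deterministic_cmp(const char *a, const char *b)
{
    size_t      alen;
    size_t      blen;
    int         result;

    assert(a != NULL && b != NULL);

    alen = strlen(a);
    blen = strlen(b);

    /* Fast path: identical bytes always compare equal. */
    if (alen == blen && memcmp(a, b, alen) == 0)
        return 0;

    /* Bytes differ: ask the collation provider (assumed here: libc). */
    result = strcoll(a, b);

    /*
     * The provider may report 0 for strings whose bytes differ.
     * Break the tie bytewise so that only identical bytes are equal.
     */
    if (result == 0)
        result = strcmp(a, b);

    return result;
}
```

Whatever locale is active for strcoll(), this returns 0 only when the
two inputs are byte-for-byte identical, which is the property that lets
"C" serve for equality lookups.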

None of this works for non-deterministic collations (e.g.,
case-insensitive ones), but that would be easy to block where necessary.

Regards,
Jeff Davis
