Unicode full case mapping: PG_UNICODE_FAST, and standard-compliant UCS_BASIC

From: Jeff Davis <pgsql(at)j-davis(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Unicode full case mapping: PG_UNICODE_FAST, and standard-compliant UCS_BASIC
Date: 2024-12-11 23:52:44
Message-ID: ddfd67928818f138f51635712529bc5e1d25e4e7.camel@j-davis.com
Lists: pgsql-hackers

Right now, the UCS_BASIC collation simply uses the "C" locale.

The standard defines the behavior of LOWER() and UPPER() in terms of
Unicode. The specific requirement that "ß" shall uppercase to "SS" also
seems to imply Full Case Mapping.
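
To illustrate the distinction, here is a minimal Python sketch. It is not
taken from the patches; the function names are hypothetical. It relies on
Python's str.upper(), which applies full case mapping, and it only
approximates simple case mapping by refusing any mapping that changes the
number of code points.

```python
def c_locale_upper(text: str) -> str:
    """Approximate "C" locale behavior: only ASCII a-z is uppercased."""
    return "".join(chr(ord(ch) - 32) if "a" <= ch <= "z" else ch for ch in text)


def approx_simple_upper(text: str) -> str:
    """Rough stand-in for simple case mapping (one code point in, one out).

    Assumption: a character whose full mapping expands to several code
    points is left unchanged. Real simple mappings differ for some
    characters, so treat this as an approximation only.
    """
    out = []
    for ch in text:
        mapped = ch.upper()
        out.append(mapped if len(mapped) == 1 else ch)
    return "".join(out)


def full_upper(text: str) -> str:
    """Full case mapping, as provided by Python's str.upper()."""
    return text.upper()


sample = "café straße"
c_result = c_locale_upper(sample)            # "CAFé STRAßE"
simple_result = approx_simple_upper(sample)  # "CAFÉ STRAßE"
full_result = full_upper(sample)             # "CAFÉ STRASSE"
```

Only the full mapping turns "ß" into "SS", which also changes the length of
the string.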

Attached is a series of patches to implement full case mapping as the
locale PG_UNICODE_FAST.

The last patch in the series also changes UCS_BASIC to use that locale,
bringing it into compliance with the standard. However, this will break
existing users of the UCS_BASIC collation, so we may not want this at
all. If we do want it, we should be clear that affected expression
indexes using UCS_BASIC should be REINDEXed, or that users should
change the collation to "C" to get the previous behavior.
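
As a toy model of why a REINDEX would be needed (plain Python, not how
Postgres stores indexes; all names are hypothetical): an expression index
keeps the keys computed when the rows were indexed, so if UPPER() starts
returning different results, lookups computed with the new behavior can
miss rows.

```python
def c_locale_upper(text: str) -> str:
    """Approximate the previous behavior: only ASCII a-z is uppercased."""
    return "".join(chr(ord(ch) - 32) if "a" <= ch <= "z" else ch for ch in text)


def build_index(rows, upper_fn):
    """Map each computed key to the list of row ids that produced it."""
    index = {}
    for row_id, value in enumerate(rows):
        index.setdefault(upper_fn(value), []).append(row_id)
    return index


rows = ["straße", "Strasse", "café"]

# Index built while UPPER() behaved like the "C" locale.
old_index = build_index(rows, c_locale_upper)

# After the change, the search key is computed with full case mapping.
probe = "straße".upper()

stale_hits = old_index.get(probe, [])  # misses row 0, stored under "STRAßE"

# Rebuilding with the new behavior (the analogue of REINDEX) fixes it.
new_index = build_index(rows, str.upper)
fresh_hits = new_index.get(probe, [])
```

In this model the stale index finds only the row that was already spelled
"Strasse", while the rebuilt index finds both rows.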

While Postgres uses the collation to define the behavior of LOWER() and
UPPER(), the standard doesn't mention collation for those functions, so
according to the standard their behavior is independent of the collation.
It wouldn't make sense to force the standard behavior on all the
collations, but it could make sense for the standard-defined UCS_BASIC
collation.

Regards,
Jeff Davis

Attachment                                                       Content-Type  Size
v1-0001-Refactor-case-mapping-into-provider-specific-file.patch  text/x-patch  37.9 KB
v1-0002-Support-Unicode-full-case-mapping-and-conversion.patch   text/x-patch  559.2 KB
v1-0003-Support-PG_UNICODE_FAST-locale-in-the-builtin-col.patch  text/x-patch  18.7 KB
v1-0004-Change-UCS_BASIC-to-use-the-builtin-PG_UNICODE_FA.patch  text/x-patch  1.7 KB
