Re: First draft of PG 17 release notes

From: "Daniel Verite" <daniel(at)manitou-mail(dot)org>
To: "Bruce Momjian" <bruce(at)momjian(dot)us>
Cc: "PostgreSQL-development" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: First draft of PG 17 release notes
Date: 2024-05-17 13:42:44
Message-ID: 13447ff6-15fd-4137-8339-f4fddda7eb11@manitou-mail.org
Lists: pgsql-hackers

Bruce Momjian wrote:

> I have committed the first draft of the PG 17 release notes; you can
> see the results here:
>
> https://momjian.us/pgsql_docs/release-17.html

About the changes in collations:

<quote>
Create a "builtin" collation provider similar to libc's C locale
(Jeff Davis)

It uses a "C" locale which is identical but independent of
libc, but it allows the use of non-"C" collations like "en_US"
and "C.UTF-8" with the "C" locale, which libc does not. MORE?
</quote>

The new builtin provider has two collations:
* ucs_basic, which is 100% identical to "C". It was introduced
several versions ago, and the only v17 novelty is that its
pg_collation.collprovider changes from 'c' to 'b'.

* pg_c_utf8, which sorts like "C" but is otherwise Unicode-aware,
which makes it quite different from "C".
It also differs from the other UTF-8 collations that were
available before v17 in that it does not depend on an external
library, so it is free from the collation risks of OS upgrades.

The part that is concretely of interest to users is the introduction
of pg_c_utf8. As described in [1]:

<quote>
pg_c_utf8

This collation sorts by Unicode code point values rather than
natural language order. For the functions lower, initcap, and
upper, it uses Unicode simple case mapping. For pattern
matching (including regular expressions), it uses the POSIX
Compatible variant of Unicode Compatibility Properties. Behavior
is efficient and stable within a Postgres major version. This
collation is only available for encoding UTF8.
</quote>
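
To make the first sentence of that description concrete, here is a
minimal sketch in Python rather than SQL. It is only an illustration and
does not call Postgres at all: it relies on the fact that Python's
built-in string comparison orders by Unicode code point, and the word
list is made up for the example.

```python
# Illustration only: this does not use Postgres or pg_c_utf8.
# Python compares str values by Unicode code point, which is the
# ordering rule the quoted description gives for pg_c_utf8.
words = ["zebra", "Zoe", "apple", "Éclair", "eclair", "Apple"]  # made-up sample

# Code point order: uppercase ASCII < lowercase ASCII < non-ASCII letters.
by_code_point = sorted(words)

# For UTF-8, comparing the encoded bytes gives the same order as
# comparing code points.
by_utf8_bytes = sorted(words, key=lambda w: w.encode("utf-8"))

print(by_code_point)
# ['Apple', 'Zoe', 'apple', 'eclair', 'zebra', 'Éclair']
```

A natural-language collation would typically place "Éclair" next to
"eclair" and "apple" next to "Apple"; code point order does not.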

I'd suggest that the relnote entry be more like a condensed
version of that description, without mentioning en_US or C.UTF-8,
whose existence and semantics are OS-dependent, unlike pg_c_utf8.

[1] https://www.postgresql.org/docs/devel/collation.html

Best regards,
--
Daniel Vérité
https://postgresql.verite.pro/
Twitter: @DanielVerite
