From: | "Daniel Verite" <daniel(at)manitou-mail(dot)org> |
---|---|
To: | "Bruce Momjian" <bruce(at)momjian(dot)us> |
Cc: | "PostgreSQL-development" <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: First draft of PG 17 release notes |
Date: | 2024-05-17 13:42:44 |
Message-ID: | 13447ff6-15fd-4137-8339-f4fddda7eb11@manitou-mail.org |
Lists: | pgsql-hackers |
Bruce Momjian wrote:
> I have committed the first draft of the PG 17 release notes; you can
> see the results here:
>
> https://momjian.us/pgsql_docs/release-17.html
About the changes in collations:
<quote>
Create a "builtin" collation provider similar to libc's C locale
(Jeff Davis)
It uses a "C" locale which is identical but independent of
libc, but it allows the use of non-"C" collations like "en_US"
and "C.UTF-8" with the "C" locale, which libc does not. MORE?
</quote>
The new builtin provider has two collations:
* ucs_basic which is 100% identical to "C". It was introduced
several versions ago and the v17 novelty is simply to change
its pg_collation.collprovider from 'c' to 'b'.
* pg_c_utf8 which sorts like "C" but is Unicode-aware for
the rest, which makes it quite different from "C".
It also differs from the other UTF-8 collations available before
v17 in that it does not depend on an external library, so it is
not exposed to the risk of collation changes on OS upgrades.
The part that is concretely of interest to users is the introduction
of pg_c_utf8. As described in [1]:
<quote>
pg_c_utf8
This collation sorts by Unicode code point values rather than
natural language order. For the functions lower, initcap, and
upper, it uses Unicode simple case mapping. For pattern
matching (including regular expressions), it uses the POSIX
Compatible variant of Unicode Compatibility Properties. Behavior
is efficient and stable within a Postgres major version. This
collation is only available for encoding UTF8.
</quote>
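To make the "sorts by Unicode code point values" part concrete, here is a minimal sketch in Python. It is not PostgreSQL code and says nothing about how pg_c_utf8 is implemented; the function names are hypothetical and exist only for this illustration. It relies on one well-known property: comparing UTF-8 encoded strings byte by byte gives the same order as comparing their code points.

```python
def code_point_sort(words):
    """Sort strings by their sequence of Unicode code points.

    Hypothetical helper for illustration only: it mimics the ordering
    rule described above, not PostgreSQL's implementation.
    """
    return sorted(words, key=lambda w: [ord(ch) for ch in w])


def utf8_byte_sort(words):
    """Sort strings by their UTF-8 encoded bytes.

    Bytewise comparison of UTF-8 preserves code point order, so this
    is expected to give the same result as code_point_sort().
    """
    return sorted(words, key=lambda w: w.encode("utf-8"))


if __name__ == "__main__":
    sample = ["a", "B", "é", "Z"]
    # Uppercase ASCII sorts before lowercase ASCII, and accented
    # letters sort after all ASCII letters, unlike in a natural
    # language order such as en_US.
    print(code_point_sort(sample))
    print(utf8_byte_sort(sample))
```

With this ordering "B" and "Z" come before "a", and "é" comes last, which is what distinguishes a code point sort from a linguistic one.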
I'd suggest that the relnote entry should be more like a condensed
version of that description, without mentioning en_US or C.UTF-8,
whose existence and semantics are OS-dependent, unlike pg_c_utf8.
[1] https://www.postgresql.org/docs/devel/collation.html
Best regards,
--
Daniel Vérité
https://postgresql.verite.pro/
Twitter: @DanielVerite