Re: Update Unicode data to Unicode 16.0.0

From: Jeremy Schneider <schneider(at)ardentperf(dot)com>
To: Jeff Davis <pgsql(at)j-davis(dot)com>
Cc: Nathan Bossart <nathandbossart(at)gmail(dot)com>, Peter Eisentraut <peter(at)eisentraut(dot)org>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Update Unicode data to Unicode 16.0.0
Date: 2025-03-15 06:54:41
Message-ID: 20250314235441.4413aedf@ardentperf.com
Lists: pgsql-hackers

On Fri, 07 Mar 2025 13:11:18 -0800
Jeff Davis <pgsql(at)j-davis(dot)com> wrote:

> On Wed, 2025-03-05 at 20:43 -0600, Nathan Bossart wrote:
> > I see.  Do we provide any suggested next steps for users to assess
> > the potentially-affected relations?
>
> I don't know exactly where we should document it, but I've attached a
> SQL file that demonstrates what can happen for a PG17->PG18 upgrade,
> assuming that we've updated Unicode to 16.0.0 in PG18.
>
> The change in Unicode that I'm focusing on is the addition of U+A7DC,
> which is unassigned in Unicode 15.1 and assigned in Unicode 16, which
> lowercases to U+019B. The examples assume that the user is using
> unassigned code points in PG17/Unicode15.1 and the PG_C_UTF8
> collation.
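
For readers who want to see the quoted U+A7DC change outside PostgreSQL, here is a minimal Python sketch. It is not the SQL file attached upthread and does not exercise PG_C_UTF8; it only inspects whatever Unicode tables the local Python interpreter ships with. The helper names (describe_case_mapping, unicode_major) are made up for this illustration.

```python
import unicodedata

CODE_POINT = 0xA7DC  # per the quoted message: unassigned in 15.1, assigned in 16.0


def describe_case_mapping(cp: int) -> dict:
    """Report how the interpreter's bundled Unicode data treats one code point."""
    ch = chr(cp)
    lowered = ch.lower()
    return {
        # Version of the Unicode database compiled into this interpreter.
        "unicode_version": unicodedata.unidata_version,
        # "Cn" means unassigned; "Lu" means uppercase letter.
        "category": unicodedata.category(ch),
        # Lowercase mapping, formatted as U+XXXX for each resulting code point.
        "lower": " ".join(f"U+{ord(c):04X}" for c in lowered),
        # True only when lowercasing actually changes the character.
        "changes_on_lower": lowered != ch,
    }


def unicode_major(version: str) -> int:
    """Extract the major version number from a string such as '15.1.0'."""
    return int(version.split(".")[0])


if __name__ == "__main__":
    print(describe_case_mapping(CODE_POINT))
```

On an interpreter built with Unicode data older than 16.0 the code point should be reported as unassigned and left unchanged by lowercasing; with 16.0 or later it should map to U+019B. The same before/after difference is what could make stored lower() results disagree across a Unicode update.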

It seems the consensus is to update Unicode in core... FWIW, I'm still
in favor of leaving it alone because ICU is there for when I need
up-to-date Unicode versions.

From my perspective, the whole point of the builtin collation was to
provide one option that avoids the problems that come with updating
ICU or glibc.

So I guess the main point of the builtin provider is just that it's
faster than ICU?

-Jeremy
