Re: Pre-proposal: unicode normalized text

From: Jeff Davis <pgsql(at)j-davis(dot)com>
To: Daniel Verite <daniel(at)manitou-mail(dot)org>
Cc: Peter Eisentraut <peter(at)eisentraut(dot)org>, Robert Haas <robertmhaas(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Pre-proposal: unicode normalized text
Date: 2023-10-17 16:32:18
Message-ID: dfeff43884f7c3da50e32fc93cb2383255aa2e18.camel@j-davis.com
Lists: pgsql-hackers

On Tue, 2023-10-17 at 17:07 +0200, Daniel Verite wrote:
> There's a problem in the fact that the set of assigned code points is
> expanding with every Unicode release, which happens about every year.
>
> If we had this option in Postgres 11 released in 2018 it would use
> Unicode 11, and in 2023 this feature would reject thousands of code
> points that have been assigned since then.

That wouldn't be good for everyone, but might it be good for some
users?

We already expose normalization functions. If users are depending on
normalization, and they have unassigned code points in their system,
that will break when we update Unicode. If they restrict themselves to
assigned code points, normalization is guaranteed to be
forward-compatible.
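
To make that concrete, here is a rough sketch in Python rather than
Postgres code. It assumes that "unassigned" means general category Cn
in whatever Unicode version the interpreter ships with
(unicodedata.unidata_version). The function names are hypothetical and
only illustrate the idea of rejecting such input before normalizing.

```python
import unicodedata


def unassigned_code_points(text: str) -> list[str]:
    """Return the characters that this Unicode version treats as unassigned.

    Assumption: general category "Cn" is used as the test. That category
    also covers noncharacters such as U+FFFF, not only reserved code points.
    """
    return [ch for ch in text if unicodedata.category(ch) == "Cn"]


def normalize_assigned_only(text: str, form: str = "NFC") -> str:
    """Normalize text, refusing input that contains unassigned code points."""
    bad = unassigned_code_points(text)
    if bad:
        listed = ", ".join(f"U+{ord(ch):04X}" for ch in bad)
        raise ValueError(f"unassigned code points: {listed}")
    return unicodedata.normalize(form, text)
```

The trade-off is the one Daniel describes: text containing code points
assigned after the bundled Unicode version would be rejected until the
tables are updated.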

Regards,
Jeff Davis
