From: | Jeff Davis <pgsql(at)j-davis(dot)com> |
---|---|
To: | Peter Eisentraut <peter(at)eisentraut(dot)org>, Robert Haas <robertmhaas(at)gmail(dot)com> |
Cc: | pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: Pre-proposal: unicode normalized text |
Date: | 2023-10-11 07:53:39 |
Message-ID: | 2e4a7fe660757ac2f0885e3a571279e690963c5c.camel@j-davis.com |
Lists: | pgsql-hackers |
On Wed, 2023-10-11 at 08:51 +0200, Peter Eisentraut wrote:
> I don't see how this would really work in practice. Whether your data
> has unassigned code points or not, when the collations are updated to
> the next Unicode version, the collations will have a new version number,
> and so you need to run the refresh procedure in any case.
Even with a version number, we don't provide a great refresh procedure
or document how it should be done. In practice, avoiding unassigned
code points might mitigate some kinds of problems, especially for glibc
which has a very coarse version number.
In any case, a CHECK constraint that rejects unassigned code points is
useful for staying forward-compatible with normalization, and it might
also just be a good sanity check.
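To make the idea concrete, here is a minimal sketch of the test such a constraint would perform, written in Python rather than as PostgreSQL code. The function name `has_unassigned` is hypothetical, and the check relies on the Unicode database bundled with the Python interpreter, which may be a different Unicode version than the one a given PostgreSQL build or collation provider uses. Note also that the general category `Cn` covers noncharacters as well as reserved code points.

```python
import unicodedata


def has_unassigned(text: str) -> bool:
    """Return True if any code point in text has general category Cn.

    Cn means "not assigned" in the Unicode version bundled with this
    Python build (see unicodedata.unidata_version). This is an
    illustrative stand-in for the check discussed above, not an
    existing PostgreSQL function.
    """
    return any(unicodedata.category(ch) == "Cn" for ch in text)


if __name__ == "__main__":
    print("Unicode version:", unicodedata.unidata_version)
    print(has_unassigned("abc"))        # plain ASCII letters
    print(has_unassigned("a\u0378b"))   # U+0378 is a reserved code point
```

A value that passes this test under one Unicode version also passes under later versions, because assigned code points are never unassigned; that is the forward-compatibility property being relied on.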
Regards,
Jeff Davis