
Re: Pre-proposal: unicode normalized text

From:	Peter Eisentraut <peter(at)eisentraut(dot)org>
To:	Robert Haas <robertmhaas(at)gmail(dot)com>
Cc:	Jeff Davis <pgsql(at)j-davis(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject:	Re: Pre-proposal: unicode normalized text
Date:	2023-10-11 06:51:27
Message-ID:	3af7e977-660d-4161-85fe-d5f4a205aa3e@eisentraut.org
Lists:	pgsql-hackers

On 10.10.23 16:02, Robert Haas wrote:
> On Tue, Oct 10, 2023 at 2:44 AM Peter Eisentraut <peter(at)eisentraut(dot)org> wrote:
>> Can you restate what this is supposed to be for? This thread appears to
>> have morphed from "let's normalize everything" to "let's check for
>> unassigned code points", but I'm not sure what we are aiming for now.
>
> Jeff can say what he wants it for, but one obvious application would
> be to have the ability to add a CHECK constraint that forbids
> inserting unassigned code points into your database, which would be
> useful if you're worried about forward-compatibility with collation
> definitions that might be extended to cover those code points in the
> future.

I don't see how this would really work in practice. Whether your data
has unassigned code points or not, when the collations are updated to
the next Unicode version, the collations will have a new version number,
and so you need to run the refresh procedure in any case.
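
For readers unfamiliar with what "checking for unassigned code points" amounts to, here is a minimal sketch in Python. It is not taken from the thread or from any PostgreSQL patch. The helper name `has_unassigned` is hypothetical, and the check relies on the Unicode database bundled with the Python build (`unicodedata.unidata_version`), so the answer for a given string can change when that database is upgraded, much as a collation's version number changes.

```python
import unicodedata


def has_unassigned(text: str) -> bool:
    """Return True if any code point in `text` has general category Cn.

    Cn covers unassigned code points (and noncharacters) according to the
    Unicode version shipped with this Python build; a newer Unicode version
    may assign some of them. Hypothetical helper, for illustration only.
    """
    return any(unicodedata.category(ch) == "Cn" for ch in text)


if __name__ == "__main__":
    # The result depends on this version, so report it alongside the check.
    print("Unicode data version:", unicodedata.unidata_version)
    print(has_unassigned("h\u00e9llo"))   # only assigned letters
    print(has_unassigned("a\u0378"))      # U+0378 is a reserved gap in the Greek block
```

A CHECK constraint built on a test like this would only reject such strings at insert time; it would not change the fact that a collation update carries a new version number.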

In response to

Re: Pre-proposal: unicode normalized text at 2023-10-10 14:02:30 from Robert Haas

Responses

Re: Pre-proposal: unicode normalized text at 2023-10-11 07:53:39 from Jeff Davis
