Re: [18] Policy on IMMUTABLE functions and Unicode updates

From: Peter Eisentraut <peter(at)eisentraut(dot)org>
To: Jeff Davis <pgsql(at)j-davis(dot)com>, Laurenz Albe <laurenz(dot)albe(at)cybertec(dot)at>, pgsql-hackers(at)postgresql(dot)org
Cc: Daniel Verite <daniel(at)manitou-mail(dot)org>, Noah Misch <noah(at)leadboat(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Jeremy Schneider <schneider(at)ardentperf(dot)com>
Subject: Re: [18] Policy on IMMUTABLE functions and Unicode updates
Date: 2024-07-22 14:26:37
Message-ID: bbc6813e-b82e-4315-8a95-cc9b1f63a118@eisentraut.org
Lists: pgsql-hackers

On 19.07.24 21:41, Jeff Davis wrote:
> On Fri, 2024-07-19 at 21:06 +0200, Laurenz Albe wrote:
>> Perhaps I should moderate my statement: if a change affects only a newly
>> introduced code point (which is unlikely to be used in a database), and we
>> think that the change is very important, we could consider applying it.
>> But that should be carefully considered; I am against blindly following the
>> changes in Unicode.
>
> That sounds reasonable.
>
> I propose that, going forward, we take more care with Unicode updates:
> assess the impact, provide time for comments, and consider possible
> mitigations. In other words, it would be reviewed like any other
> change.

I disagree with that. We should put ourselves in a position to adopt
new Unicode versions without fear, similar to updates to time zones,
snowball, etc.

We can't be discussing the merits of the Unicode update every year.
That would be madness. How would we weigh each change against the
others? Some new character is introduced because it's the new currency
symbol of some country; seems important. Some mobile phone platforms
jumped the gun and were already using the character for that purpose
before it was assigned; now the character is in databases, but some
function results will change with the upgrade. How do we proceed?

Moreover, if we were to decide to not take a particular Unicode update,
that would then stop that process forever, because whatever the issue
was wouldn't go away with the next Unicode version.

Unless I missed something here, all the problem examples involve
unassigned code points that were later assigned. (Assigned code points
already have compatibility mechanisms, such as collation versions.) So
I would focus on that issue. We already have a mechanism to disallow
unassigned code points. So there is a tradeoff that users can make:
Disallow unassigned code points and avoid upgrade issues resulting from
them. Maybe that just needs to be documented more prominently.
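
To illustrate the tradeoff, here is a minimal sketch of rejecting
unassigned code points at input time. It is not PostgreSQL's mechanism:
it uses Python's standard unicodedata module as a stand-in, and the
helper names (has_unassigned, require_assigned) are hypothetical. The
answer depends on the Unicode version bundled with the Python build,
which is the same version sensitivity discussed above.

```python
import unicodedata


def has_unassigned(text: str) -> bool:
    """Return True if any code point in text has general category Cn.

    Cn covers unassigned (reserved) code points and noncharacters,
    as judged by the Unicode version bundled with this Python build.
    """
    return any(unicodedata.category(ch) == "Cn" for ch in text)


def require_assigned(text: str) -> str:
    """Hypothetical input check: reject text containing Cn code points."""
    if has_unassigned(text):
        raise ValueError(
            "text contains code points unassigned in Unicode "
            + unicodedata.unidata_version
        )
    return text


if __name__ == "__main__":
    print(unicodedata.unidata_version)       # Unicode version in use
    print(has_unassigned("price: 10 \u20ac"))  # euro sign is assigned
    print(has_unassigned("\U00050000"))      # plane 5 is currently unassigned
```

Data that passes such a check contains only code points whose
properties were already defined when it was stored, so a later Unicode
update cannot newly assign any of them.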
