Re: Built-in CTYPE provider

From: Noah Misch <noah(at)leadboat(dot)com>
To: pgsql-hackers(at)postgresql(dot)org, Jeff Davis <pgsql(at)j-davis(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Robert Haas <robertmhaas(at)gmail(dot)com>, Daniel Verite <daniel(at)manitou-mail(dot)org>, Peter Eisentraut <peter(at)eisentraut(dot)org>, Jeremy Schneider <schneider(at)ardentperf(dot)com>
Subject: Re: Built-in CTYPE provider
Date: 2024-07-26 11:29:58
Message-ID: 20240726112958.06.nmisch@google.com
Lists: pgsql-hackers

On Wed, Jul 24, 2024 at 08:19:13AM -0700, Noah Misch wrote:
> On Wed, Jul 17, 2024 at 03:03:26PM -0700, Noah Misch wrote:
> > vote one way or the other on the question in
> > https://postgr.es/m/20240706195129.fd@rfd.leadboat.com?
>
> I'll keep the voting open for another 24 hours from now
> or 36 hours after the last vote, whichever comes last.

I count 4.5 or 5 votes for "okay" and 2 votes for "not okay". I've moved the
open item to "Non-bugs".

On Wed, Jul 17, 2024 at 11:06:43PM -0700, Jeff Davis wrote:
> You haven't established that any problem actually exists in version 17,
> and your arguments have been a moving target throughout this subthread.

I can understand that experience of yours. It wasn't my intent to make a
moving target. To be candid, I entered the thread with no doubts that you'd
agree with the problem. When you and Tom instead shared a different view, I
switched to pursuing the votes to recognize the problem. (Voting then held
that pg_c_utf8 is okay as-is.)

On Thu, Jul 18, 2024 at 09:52:44AM -0700, Jeff Davis wrote:
> On Thu, 2024-07-18 at 07:00 -0700, Noah Misch wrote:
> > Given all the messages on this thread, if the feature remains in PostgreSQL, I
> > advise you to be ready to tolerate PostgreSQL "routinely updating the built-in
> > provider to adopt any changes that Unicode makes".
>
> You mean messages from me, like:
>
> * "I have no intention force a Unicode update" [1]
> * "While nothing needs to be changed for 17, I agree that we may need
> to be careful in future releases not to break things." [2]
> * "...you are right that we may need to freeze Unicode updates or be
> more precise about versioning..." [2]
> * "If you are proposing that Unicode updates should not be performed
> if they affect the results of any IMMUTABLE function...I am neither
> endorsing nor opposing..." [3]
>
> ?

Those, plus all the other messages.

On Fri, Jul 19, 2024 at 08:50:41AM -0700, Jeff Davis wrote:
> Consider:
>
> a. Some user creates an expression index on NORMALIZE(); vs.
> b. Some user chooses the builtin "C.UTF-8" locale and creates a partial
> index with a predicate like "string ~ '[[:alpha:]]'" (or an expression
> index on LOWER())
>
> Both cases create a risk if we update Unicode in some future version.
> Why are you unconcerned about case (a), but highly concerned about case
> (b)?

I am not unconcerned about (a), but the v17 beta process gave an opportunity
to do something about (b) that it didn't give for (a). Also, I have never
handled a user report involving NORMALIZE(). I have handled user reports
around regexp index inconsistency, e.g. the one at
https://www.youtube.com/watch?v=kNH94tmpUus&t=1490s
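
As a rough illustration of the risk in case (b), here is a small Python sketch. It is not PostgreSQL code. It assumes CPython's bundled Unicode 3.2.0 tables (unicodedata.ucd_3_2_0) as a stand-in for "the Unicode version the index was built with", treats general category L* as a simplified approximation of [[:alpha:]], and uses a hypothetical three-row table. The set named partial_index plays the role of a partial index whose predicate was evaluated under the older tables.

```python
import unicodedata

# Assumption: Unicode 3.2.0 stands in for the "old" version, and the
# tables of the running Python stand in for the "new" version.
old = unicodedata.ucd_3_2_0
new = unicodedata


def is_alpha(ch, ucd):
    # Simplified stand-in for [[:alpha:]]: general category starts with "L".
    return ucd.category(ch).startswith("L")


def matches(s, ucd):
    # Rough analogue of the predicate: string ~ '[[:alpha:]]'
    return any(is_alpha(ch, ucd) for ch in s)


# Hypothetical table: row id -> string. U+1E9E (LATIN CAPITAL LETTER
# SHARP S) was assigned in Unicode 5.1, so it is unassigned in 3.2.0.
rows = {1: "abc", 2: "\u1e9e", 3: "123"}

# Rows the partial index contains, as decided under the old tables.
partial_index = {rid for rid, s in rows.items() if matches(s, old)}

# Rows a query expects to find when the predicate uses the new tables.
expected = {rid for rid, s in rows.items() if matches(s, new)}

# Rows that satisfy the predicate now but are absent from the index.
missing = expected - partial_index
print("indexed:", sorted(partial_index))
print("missing from index:", sorted(missing))
```

Any row in "missing" is one that an index scan would silently skip until the index is rebuilt, which is the kind of inconsistency the user reports involved.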
