Re: Built-in CTYPE provider

From: Peter Eisentraut <peter(at)eisentraut(dot)org>
To: Jeff Davis <pgsql(at)j-davis(dot)com>, Daniel Verite <daniel(at)manitou-mail(dot)org>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Jeremy Schneider <schneider(at)ardentperf(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Built-in CTYPE provider
Date: 2024-03-26 07:04:28
Message-ID: a8804ef9-fda6-4660-9f98-ecd1315f958c@eisentraut.org
Lists: pgsql-hackers

On 21.03.24 01:13, Jeff Davis wrote:
> The v26 patch was not quite complete, so I didn't commit it yet.
> Attached v27-0001 and 0002.
>
> 0002 is necessary because otherwise lc_collate_is_c() short-circuits
> the version check in pg_newlocale_from_collation(). With 0002, the code
> is simpler and all paths go through pg_newlocale_from_collation(), and
> the version check happens even when lc_collate_is_c().
>
> But perhaps there was a reason the code was the way it was, so
> submitting for review in case I missed something.
>
>> 0005 and 0006 don't contain any test cases.  So I guess they are
>> really only usable via 0007.  Is that understanding correct?
> 0005 is not a functional change, it's just a refactoring to use a
> callback, which is preparation for 0007.
>
>> Are there any test cases that illustrate the word boundary changes in
>> patch 0005?  It might be useful to test those against Oracle as well.
> The tests include initcap('123abc') which is '123abc' in the PG_C_UTF8
> collation vs '123Abc' in PG_UNICODE_FAST.
>
> The reason for the latter behavior is that the Unicode Default Case
> Conversion algorithm for toTitlecase() advances to the next Cased
> character before mapping to titlecase, and digits are not Cased. ICU
> has a configurable adjustment, and defaults in a way that produces
> '123abc'.
>
> New rebased series attached.

The patch set v27 is ok with me, modulo (a) discussion about initcap
semantics, and (b) what collation to assign to ucs_basic, which can be
revisited later.
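
For reference while discussing (a), the two initcap behaviors quoted above
can be sketched roughly as follows. This is only an illustration, not the
code in the patch: the function names are made up, word boundaries are
simplified to "runs of alphanumeric characters", and the Unicode "Cased"
property is approximated by the general categories Lu, Ll and Lt.

```python
import unicodedata


def is_cased(ch: str) -> bool:
    # Approximation of the Unicode "Cased" property (assumption: the
    # general categories Lu/Ll/Lt are close enough for this example).
    return unicodedata.category(ch) in ("Lu", "Ll", "Lt")


def initcap_first_char(text: str) -> str:
    # Hypothetical helper: map the first character of each word to upper
    # case and the rest to lower case. A leading digit has no upper-case
    # form, so nothing in '123abc' changes.
    out = []
    at_word_start = True
    for ch in text:
        if ch.isalnum():
            out.append(ch.upper() if at_word_start else ch.lower())
            at_word_start = False
        else:
            out.append(ch)
            at_word_start = True
    return "".join(out)


def initcap_next_cased(text: str) -> str:
    # Hypothetical helper: within each word, skip ahead to the first
    # Cased character, titlecase that one, and lowercase what follows.
    out = []
    need_title = True
    for ch in text:
        if not ch.isalnum():
            out.append(ch)
            need_title = True
        elif need_title and is_cased(ch):
            out.append(ch.title())
            need_title = False
        elif need_title:
            out.append(ch)  # uncased (e.g. a digit): keep looking
        else:
            out.append(ch.lower())
    return "".join(out)


print(initcap_first_char("123abc"))   # first-character behavior
print(initcap_next_cased("123abc"))   # advance-to-next-Cased behavior
```

The first function leaves '123abc' unchanged, like the PG_C_UTF8 result
quoted above, and the second gives '123Abc', like the PG_UNICODE_FAST
result. The two agree on words that start with a letter.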
