| From: | Peter Eisentraut <peter(at)eisentraut(dot)org> |
|---|---|
| To: | Jeff Davis <pgsql(at)j-davis(dot)com>, Daniel Verite <daniel(at)manitou-mail(dot)org> |
| Cc: | Robert Haas <robertmhaas(at)gmail(dot)com>, Jeremy Schneider <schneider(at)ardentperf(dot)com>, pgsql-hackers(at)postgresql(dot)org |
| Subject: | Re: Built-in CTYPE provider |
| Date: | 2024-03-26 07:04:28 |
| Message-ID: | a8804ef9-fda6-4660-9f98-ecd1315f958c@eisentraut.org |
| Lists: | pgsql-hackers |
On 21.03.24 01:13, Jeff Davis wrote:
> The v26 patch was not quite complete, so I didn't commit it yet.
> Attached v27-0001 and 0002.
>
> 0002 is necessary because otherwise lc_collate_is_c() short-circuits
> the version check in pg_newlocale_from_collation(). With 0002, the code
> is simpler and all paths go through pg_newlocale_from_collation(), and
> the version check happens even when lc_collate_is_c().
>
> But perhaps there was a reason the code was the way it was, so
> submitting for review in case I missed something.
>
>> 0005 and 0006 don't contain any test cases. So I guess they are really
>> only usable via 0007. Is that understanding correct?
> 0005 is not a functional change, it's just a refactoring to use a
> callback, which is preparation for 0007.
>
>> Are there any test cases that illustrate the word boundary changes in
>> patch 0005? It might be useful to test those against Oracle as well.
> The tests include initcap('123abc') which is '123abc' in the PG_C_UTF8
> collation vs '123Abc' in PG_UNICODE_FAST.
>
> The reason for the latter behavior is that the Unicode Default Case
> Conversion algorithm for toTitlecase() advances to the next Cased
> character before mapping to titlecase, and digits are not Cased. ICU
> has a configurable adjustment, and defaults in a way that produces
> '123abc'.
>
> New rebased series attached.
The patch set v27 is ok with me, modulo (a) discussion about initcap
semantics, and (b) what collation to assign to ucs_basic, which can be
revisited later.
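
For readers following the initcap discussion, the difference quoted above can be sketched in a few lines of Python. This is only an illustration, not the patch's code: the function names are hypothetical, "word" is assumed to mean a run of alphanumeric characters, and is_cased() is a rough approximation of the Unicode Cased property.

```python
import unicodedata


def is_cased(ch):
    # Rough approximation of the Unicode "Cased" property (an assumption,
    # not the exact derived property): upper, lower, or titlecase letter.
    return ch.isupper() or ch.islower() or unicodedata.category(ch) == "Lt"


def initcap_first_char(s):
    # Map the first alphanumeric character of each word to upper case and
    # the rest to lower case. A leading digit "uses up" the word start.
    out = []
    at_word_start = True
    for ch in s:
        if ch.isalnum():
            out.append(ch.upper() if at_word_start else ch.lower())
            at_word_start = False
        else:
            out.append(ch)
            at_word_start = True
    return "".join(out)


def initcap_first_cased(s):
    # Within each word, advance to the first Cased character before
    # mapping to titlecase; uncased characters such as digits are skipped.
    out = []
    seeking_cased = True
    for ch in s:
        if not ch.isalnum():
            out.append(ch)
            seeking_cased = True
        elif seeking_cased and is_cased(ch):
            out.append(ch.title())
            seeking_cased = False
        elif seeking_cased:
            out.append(ch)  # uncased (e.g. a digit): keep looking
        else:
            out.append(ch.lower())
    return "".join(out)
```

Under these assumptions, initcap_first_char('123abc') leaves the string as '123abc', while initcap_first_cased('123abc') gives '123Abc', matching the two results quoted for PG_C_UTF8 and PG_UNICODE_FAST.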