Re: Built-in CTYPE provider

From: Peter Eisentraut <peter(at)eisentraut(dot)org>
To: Jeff Davis <pgsql(at)j-davis(dot)com>, Daniel Verite <daniel(at)manitou-mail(dot)org>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Jeremy Schneider <schneider(at)ardentperf(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Built-in CTYPE provider
Date: 2024-03-14 08:54:35
Message-ID: 4135cf11-206d-40ed-96c0-9363c1232379@eisentraut.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 14.03.24 09:08, Jeff Davis wrote:
> On Wed, 2024-03-13 at 00:44 -0700, Jeff Davis wrote:
>> New series attached. I plan to commit 0001 very soon.
>
> Committed the basic builtin provider, supporting only the "C" locale.

As you were committing this, I had another review of
v23-0001-Introduce-collation-provider-builtin.patch in progress. Some
of the things I found you have already addressed in what you committed.
Please check the remaining comments.

* doc/src/sgml/charset.sgml

I don't understand the purpose of this sentence:

"When using this locale, the behavior may depend on the database encoding."

* doc/src/sgml/ref/create_database.sgml

The new parameter builtin_locale is not documented.

* src/backend/commands/collationcmds.c

I think DefineCollation() should set collencoding = -1 for the
COLLPROVIDER_BUILTIN case. -1 stands for any encoding. Or at least
explain why not?

* src/backend/utils/adt/pg_locale.c

This part is a bit confusing:

+ cache_entry->collate_is_c = true;
+ cache_entry->ctype_is_c = (strcmp(colllocale, "C") == 0);

Is collate always C but ctype only sometimes? Does this anticipate
future patches in this series? Maybe in this patch it should always
be true?

* src/bin/initdb/initdb.c

+ printf(_(" --builtin-locale=LOCALE set builtin locale name
for new databases\n"));

Put in a line break so that the right "column" lines up.

This output should line up better:

The database cluster will be initialized with this locale configuration:
default collation provider: icu
default collation locale: en
LC_COLLATE: C
LC_CTYPE: C
...

Also, why are there two spaces after "provider: "?

Also we call these locale provider on input, why are they collation
providers on output? What is a "collation locale"?

* src/bin/pg_upgrade/t/002_pg_upgrade.pl

+if ($oldnode->pg_version >= '17devel')

This is weird. >= is a numeric comparison, so providing a string with
non-digits is misleading at best.

* src/test/icu/t/010_database.pl

-# Test that LOCALE works for ICU locales if LC_COLLATE and LC_CTYPE
-# are specified

Why remove this test?

+my ($ret, $stdout, $stderr) = $node1->psql('postgres',
+ q{CREATE DATABASE dbicu LOCALE_PROVIDER builtin LOCALE 'C' TEMPLATE
dbicu}
+);

Change the name of the new database to be different from the name of
the template database.

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Alexander Korotkov 2024-03-14 09:31:58 Re: POC, WIP: OR-clause support for indexes
Previous Message Daniel Gustafsson 2024-03-14 08:32:55 Re: Fix the synopsis of pg_md5_hash