Re: ICU for global collation

From: "Finnerty, Jim" <jfinnert(at)amazon(dot)com>
To: Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Daniel Verite <daniel(at)manitou-mail(dot)org>
Subject: Re: ICU for global collation
Date: 2022-01-17 19:07:38
Message-ID: 3FA888A6-D818-4B11-8961-7498B61905C5@amazon.com
Lists: pgsql-hackers

On 10.01.22 12:49, Daniel Verite wrote:

> I think some users would want their db-wide ICU collation to be
> case/accent-insensitive.
...
> IIRC, that was the context for some questions where people were
> enquiring about db-wide ICU collations.

+1. DEFAULT_COLLATION_OID is the cluster-level default collation, a.k.a. the "global collation". That is distinct from the "db-wide" database-level default collation, which controls the default collation of the collatable types within that database.

> With the current patch, it's not possible, AFAICS, because the user
> can't tell that the collation is non-deterministic. Presumably this
> would require another option to CREATE DATABASE and another
> column to store that bit of information.

On 1/11/22, 6:24 AM, "Peter Eisentraut" <peter(dot)eisentraut(at)enterprisedb(dot)com> wrote:

> Adding this would be easy, but since pattern matching currently does not
> support nondeterministic collations, if you make a global collation
> nondeterministic, a lot of system views, psql, pg_dump queries etc.
> would break, so it's not practical. I view this as an orthogonal
> project. Once we can support this without breaking system views etc.,
> then it's easy to enable with a new column in pg_database.

So this patch only enables the default cluster collation (DEFAULT_COLLATION_OID) to be a deterministic ICU collation, but doesn't extend the metadata in a way that would enable database collations to be ICU collations?

If we wait for the pattern matching problem to be solved in general before creating the metadata to support ICU collations everywhere, it will be harder for members of the community to help solve that problem.

What additional metadata changes would be required to allow an ICU collation to be specified at either the cluster level or the database level, even if new checks have to be added for now to reject a nondeterministic collation at the cluster level?
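
To make the question concrete, here is a rough sketch of the kind of check being asked about. It is not taken from the patch and does not reflect real catalog columns: the `DefaultCollation` record, the `check_default_collation` function, the level names, and the error wording are all hypothetical. It only models the idea that the metadata could carry a "deterministic" bit at both levels while a check rejects nondeterministic collations where they are not yet safe.

```python
from dataclasses import dataclass
from enum import Enum


class Provider(Enum):
    LIBC = "libc"
    ICU = "icu"


@dataclass(frozen=True)
class DefaultCollation:
    """Hypothetical metadata for a default collation (not real catalog columns)."""
    provider: Provider
    locale: str
    deterministic: bool = True


# Assumption: only the cluster level is blocked, following the wording of the
# question above. Whether the database level also needs blocking is left open.
NONDETERMINISTIC_BLOCKED_LEVELS = {"cluster"}


def check_default_collation(coll: DefaultCollation, level: str) -> None:
    """Raise ValueError if the collation is not acceptable at the given level."""
    if level not in ("cluster", "database"):
        raise ValueError(f"unknown level: {level!r}")
    # Assumption: nondeterministic behaviour is only available through ICU.
    if coll.provider is Provider.LIBC and not coll.deterministic:
        raise ValueError("nondeterministic collations require the ICU provider")
    if not coll.deterministic and level in NONDETERMINISTIC_BLOCKED_LEVELS:
        raise ValueError(
            f"nondeterministic collation is not allowed at the {level} level for now"
        )


if __name__ == "__main__":
    # A deterministic ICU collation is accepted at the cluster level.
    check_default_collation(DefaultCollation(Provider.ICU, "en-US"), "cluster")
    # A case-insensitive (nondeterministic) ICU collation is rejected there.
    try:
        check_default_collation(
            DefaultCollation(Provider.ICU, "und-u-ks-level2", deterministic=False),
            "cluster",
        )
    except ValueError as exc:
        print(exc)
```

With a layout like this, lifting the restriction later would mean removing a level from the blocked set rather than changing the metadata again.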
