Re: BUG #17571: PostgreSQL 15b2 ICU collation defined at db level sorts differently in collate clause vs implicit

From: Julien Rouhaud <rjuju123(at)gmail(dot)com>
To: Daniel Halsey <Daniel(dot)Halsey(at)acadis(dot)com>
Cc: "pgsql-bugs(at)lists(dot)postgresql(dot)org" <pgsql-bugs(at)lists(dot)postgresql(dot)org>
Subject: Re: BUG #17571: PostgreSQL 15b2 ICU collation defined at db level sorts differently in collate clause vs implicit
Date: 2022-08-04 17:26:09
Message-ID: 20220804172609.fqaqfh6yj7vql7ow@jrouhaud
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs

On Thu, Aug 04, 2022 at 03:59:12PM +0000, Daniel Halsey wrote:
>
> This will be frustrating to work around, since it'll require injecting
> COLLATE sub-clauses for all order by clauses (or like clauses, if we
> re-define our columns to use a non-deterministic collation).
> Is support for non-deterministic collation at the db level on the roadmap?

No one is working on that at the moment as far as I know.

The problem is that we would need to support LIKE and regex for non
deterministic collation first, as those are use in system views.

> Is there a query to determine what the actual/effective collation settings
> are (per the underlying provider) for a given db (since
> pg_database.daticulocale x pg_catalog.pg_collation.collname doesn't
> necessarily return what's going to be used)?

I think there's no misunderstanding here, pg_database.daticulocale isn't
supposed to refer to a collation name in the database, it's the actual locale
string passed to ICU, ie. the same as pg_collation.colliculocale.

So you have to check what ICU will exactly be doing for a given locale string,
which is a bit troublesome as it tends to accept anything and fallback to its
"root" locale.

You can also use Daniel Vérité's excellent icu_ext extension at
https://github.com/dverite/icu_ext. This provides a lot of useful function,
including icu_collation_attributes(), which for the "und-sorttest-x-icu" locale
string returns:

=# SELECT * FROM icu_collation_attributes('und-sorttest-x-icu');
attribute | value
-------------+----------------------------------------------
displayname | Unknown language (SORTTEST, Private-Use=icu)
kn | false
kb | false
kk | false
ka | noignore
ks | level3
kf | false
kc | false
kv | punct
version | 153.112
(10 rows)

In response to

Browse pgsql-bugs by date

  From Date Subject
Next Message PG Bug reporting form 2022-08-04 17:32:28 BUG #17572: Different behaviour in different versions of postgresql details as in email
Previous Message Tom Lane 2022-08-04 17:17:48 Re: BUG #17558: 15beta2: Endless loop with UNIQUE NULLS NOT DISTINCT and INSERT ... ON CONFLICT