From: | Jeff Davis <pgsql(at)j-davis(dot)com> |
---|---|
To: | "Jonathan S(dot) Katz" <jkatz(at)postgresql(dot)org> |
Cc: | Peter Eisentraut <peter(at)eisentraut(dot)org>, pgsql-committers(at)lists(dot)postgresql(dot)org |
Subject: | Re: pgsql: Add standard collation UNICODE |
Date: | 2023-03-10 21:24:27 |
Message-ID: | a806bf1edc08639bc58c6a5ade725049a4f61398.camel@j-davis.com |
Lists: | pgsql-committers |
On Fri, 2023-03-10 at 11:52 -0800, Jonathan S. Katz wrote:
>
> The capitals should be first.
That is not true in many natural-language locales, whether provided by
libc or ICU. The following all return true for me (collencoding UTF-8):
select 'abc' collate "en_US" < 'ABC' collate "en_US";
select 'abc' collate "fr_FR" < 'ABC' collate "fr_FR";
select 'abc' collate "de_DE" < 'ABC' collate "de_DE";
select 'abc' collate "de_AT" < 'ABC' collate "de_AT";
select 'abc' collate "es_ES" < 'ABC' collate "es_ES";
select 'abc' collate "en-US-x-icu" < 'ABC' collate "en-US-x-icu";
select 'abc' collate "fr-CA-x-icu" < 'ABC' collate "fr-CA-x-icu";
select 'abc' collate "ja-JP-x-icu" < 'ABC' collate "ja-JP-x-icu";
select 'abc' collate "tr-TR-x-icu" < 'ABC' collate "tr-TR-x-icu";
There are some cases that return false, as well:
select 'abc' collate "ja_JP" < 'ABC' collate "ja_JP";
select 'abc' collate "fr_CA" < 'ABC' collate "fr_CA";
select 'abc' collate "en-US-u-va-posix-x-icu" <
'ABC' collate "en-US-u-va-posix-x-icu";
The cases where it's false appear to be more common in libc locales,
but most libc locales that I tested still sort lowercase first.
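To see the same effect on a sort rather than a single comparison, something like the following could be used. It is only a sketch: the sample strings (including 'Abc') are made up for illustration, and it assumes the two ICU collations named above exist in the database, which requires a build with ICU support.

```sql
-- Sort a few sample strings under the plain en-US ICU collation.
SELECT s
FROM (VALUES ('abc'), ('ABC'), ('Abc')) AS t(s)
ORDER BY s COLLATE "en-US-x-icu";

-- Same strings under the POSIX variant of the en-US ICU collation.
SELECT s
FROM (VALUES ('abc'), ('ABC'), ('Abc')) AS t(s)
ORDER BY s COLLATE "en-US-u-va-posix-x-icu";
```

Given the comparisons above, I would expect the first query to place 'abc' before 'ABC' and the second to place 'ABC' before 'abc', though the result may depend on the ICU version installed.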
Regards,
Jeff Davis