From: | "Daniel Verite" <daniel(at)manitou-mail(dot)org> |
---|---|
To: | "Arnaud Perrier" <arnaud(dot)perrier(at)gmail(dot)com> |
Cc: | pgsql-bugs(at)lists(dot)postgresql(dot)org |
Subject: | Re: BUG #16570: Collation not working |
Date: | 2020-08-05 14:49:49 |
Message-ID: | 118c09c5-3334-4570-8644-7c5300fbc0f2@manitou-mail.org |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-bugs |
Arnaud Perrier wrote:
> 5/ SELECT * FROM TBL ORDER BY TEXT1 COLLATE digitslast, TEXT2 COLLATE
> digitslast;
> 5/ KO = TEXT1 = digits after letters + TEXT2 = digits before letters
I'm seeing wrong results too, and I don't really understand what's going
on with this collation.
Here's a simpler test with the current dev branch (14devel)
and ICU 60.2-3ubuntu.
=# CREATE COLLATION digitslast (provider = icu, locale =
'en(at)colReorder=latn-digit');
=# select * from (values('1'),('a')) as list(x) order by x collate
digitslast;
x
---
a
1
The digit is sorted after the letter, OK.
But then why would an inequality test report the opposite?
=# select 'a' < '1' collate digitslast;
?column?
----------
f
I'm wondering whether there is a discrepancy between the binary
sort keys and results of direct comparisons by ucol_strcoll()
for this collation.
Maybe that could explain why the second level of ORDER BY
produces results that are inconsistent with the first level.
With icu_ext we can visualize the sort keys and see that key('a')
if lower than key('1').
=# select icu_sort_key('a' collate digitslast) as sort_key_a,
icu_sort_key('1' collate digitslast) as sort_key_1
sort_key_a | sort_key_1
--------------+--------------
\x0f01050105 | \x4901050105
But then a direct comparison says the opposite, just like the
'<' operator as above.
=# select icu_compare('a', '1', 'en(at)colReorder=latn-digit');
icu_compare
-------------
1
icu_compare() is a simple interface on top of ucol_strcollUTF8()
or ucol_strcoll(), and 1 as a result maps to UCOL_GREATER, so unless
I'm missing something, this is looking more like an ICU bug than a
Postgres bug.
Best regards,
--
Daniel Vérité
PostgreSQL-powered mailer: https://www.manitou-mail.org
Twitter: @DanielVerite
From | Date | Subject | |
---|---|---|---|
Next Message | Jan Kort | 2020-08-05 16:39:20 | Re: Index not used without explicit typecast |
Previous Message | Tom Lane | 2020-08-05 14:32:22 | Re: Index not used without explicit typecast |