Quick Links

Re: general purpose array_sort

From:	Michael Paquier <michael(at)paquier(dot)xyz>
To:	Junwang Zhao <zhjwpku(at)gmail(dot)com>
Cc:	jian he <jian(dot)universality(at)gmail(dot)com>, Aleksander Alekseev <aleksander(at)timescale(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "David G(dot) Johnston" <david(dot)g(dot)johnston(at)gmail(dot)com>, Amit Langote <amitlangote09(at)gmail(dot)com>, "andreas(at)proxel(dot)se" <andreas(at)proxel(dot)se>, Robert Haas <robertmhaas(at)gmail(dot)com>
Subject:	Re: general purpose array_sort
Date:	2024-11-08 00:51:57
Message-ID:	Zy1grbOLp6c02-1b@paquier.xyz
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Thread:
Lists:	pgsql-hackers

On Thu, Nov 07, 2024 at 10:06:04PM +0800, Junwang Zhao wrote:
> I'm afraid this can not be achieved in my current implementation, a simple
> case is:
>
> SELECT array_sort('{foo,bar,null,CCC,Abc,bbc}'::text[]);
> {Abc,bar,bbc,CCC,foo,NULL}
> SELECT array_sort('{foo,bar,null,CCC,Abc,bbc}'::text[] COLLATE "C");
> {Abc,CCC,bar,bbc,foo,NULL}
>
> SELECT array_sort(a COLLATE "C") FROM (VALUES
> ('{foo,bar,null,CCC,Abc,bbc}'::text[] COLLATE "C"),
> ('{foo,bar,null,CCC,Abc,bbc}'::text[])) v(a);
> {Abc,CCC,bar,bbc,foo,NULL}
> {Abc,CCC,bar,bbc,foo,NULL}
>
> Maybe add some documents to specify this?

So, if I use that:
CREATE COLLATION case_sensitive (provider = icu, locale = '');
=# SELECT array_sort('{Abc,CCC,bar,bbc,foo,NULL}'::text[]
COLLATE "case_sensitive");
array_sort
----------------------------
{Abc,bar,bbc,CCC,foo,NULL}
(1 row)
=# SELECT array_sort('{Abc,CCC,bar,bbc,foo,NULL}'::text[]
COLLATE "C");
array_sort
----------------------------
{Abc,CCC,bar,bbc,foo,NULL}
(1 row)

What takes priority is the collation defined with the array_sort,
which is fine:
=# SELECT array_sort(a collate "case_sensitive") FROM
(VALUES ('{foo,bar,null,CCC,Abc,bbc}'::text[]),
('{foo,bar,null,CCC,Abc,bbc}'::text[] COLLATE "C" )) v(a);
array_sort
----------------------------
{Abc,bar,bbc,CCC,foo,NULL}
{Abc,bar,bbc,CCC,foo,NULL}
(2 rows)
=# SELECT array_sort(a collate "C") FROM
(VALUES ('{foo,bar,null,CCC,Abc,bbc}'::text[]),
('{foo,bar,null,CCC,Abc,bbc}'::text[] COLLATE "case_sensitive" )) v(a);
array_sort
----------------------------
{Abc,CCC,bar,bbc,foo,NULL}
{Abc,CCC,bar,bbc,foo,NULL}
(2 rows)

The case where the collation is defined in the set of values is a bit
more troubling to me, as it depends on what the values want to be
applied, still that's OK because the collation applied is the one
coming from the set of values:
=# SELECT array_sort(a) FROM
(VALUES ('{foo,bar,null,CCC,Abc,bbc}'::text[]),
('{foo,bar,null,CCC,Abc,bbc}'::text[] COLLATE "case_sensitive" )) v(a);
array_sort
----------------------------
{Abc,bar,bbc,CCC,foo,NULL}
{Abc,bar,bbc,CCC,foo,NULL}
(2 rows)
=# SELECT array_sort(a) FROM
(VALUES ('{foo,bar,null,CCC,Abc,bbc}'::text[]),
('{foo,bar,null,CCC,Abc,bbc}'::text[] COLLATE "C" )) v(a);
array_sort
----------------------------
{Abc,CCC,bar,bbc,foo,NULL}
{Abc,CCC,bar,bbc,foo,NULL}
(2 rows)

I am wondering if there are more fancy cases where the saved cache
could force a state that would lead to puzzling results, say with
different collations that should be applied. I'd recommend to
research that more, to reflect that in the docs and to add tests that
show what we should expect in these cases within 0001 because this
new function is mimicking in the context of a function execution
multiple query clauses where restrictions are applied when analyzing
the query, close to the parser.

For example, UNION and UNION ALL require a common collation when
processing a set of expressions related to them, which would be OK.
Perhaps I lack some imagination to be able to break things.
--
Michael

In response to

Re: general purpose array_sort at 2024-11-07 14:06:04 from Junwang Zhao

Browse pgsql-hackers by date

	From	Date	Subject
Next Message	Andy Fan	2024-11-08 01:10:11	Re: Deleting older versions in unique indexes to avoid page splits
Previous Message	Peter Geoghegan	2024-11-08 00:46:53	Re: Deleting older versions in unique indexes to avoid page splits