Re: general purpose array_sort

From: Michael Paquier <michael(at)paquier(dot)xyz>
To: jian he <jian(dot)universality(at)gmail(dot)com>
Cc: Junwang Zhao <zhjwpku(at)gmail(dot)com>, Aleksander Alekseev <aleksander(at)timescale(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "David G(dot) Johnston" <david(dot)g(dot)johnston(at)gmail(dot)com>, Amit Langote <amitlangote09(at)gmail(dot)com>, "andreas(at)proxel(dot)se" <andreas(at)proxel(dot)se>, Robert Haas <robertmhaas(at)gmail(dot)com>
Subject: Re: general purpose array_sort
Date: 2024-11-05 01:12:58
Message-ID: ZylxGpROC3ILz1VX@paquier.xyz
Lists: pgsql-hackers

On Mon, Nov 04, 2024 at 03:16:35PM +0800, jian he wrote:
> drop table if exists t;
> CREATE TABLE t (a int[]);
> insert into t values ('{1,3}'),('{1,2,3}'),('{11}');
> insert into t values ('{{1,12}}'), ('{{4,3}}');
> SELECT array_sort(a) from t;
>
> In the above case,
> tuplesort_begin_datum needs the int type information and int[] type information.
> otherwise the cached TypeCacheEntry is being used to sort multi-dimensional arrays,
> which will make the result false.

All these behaviors need more extensive testing.

This raises an additional question about the caching. Would the
sorting behave correctly when a single array_sort() context is fed
array values that carry different COLLATE clauses? Or would
merge_collation_state() be smart enough to make sure that collation
conflicts never happen to begin with? I am wondering if we should
worry about multiple VALUES, CTEs, or PL functions where array_sort()
could have its cache fed values that lead to unpredictable results
for some inputs. This should perhaps have more testing, stressing the
interactions between the sorting of multiple values and the caching
in the context of a single array_sort() call.
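
To make the kind of test I have in mind more concrete, here is a rough
sketch. It is not taken from the patch. It assumes the
array_sort(anyarray) signature proposed in this thread and an ICU-enabled
build where the "en-x-icu" collation exists. The table name t_coll and
the function sort_it() are made up for illustration. No expected output
is given, as establishing it is the point of the test.

```sql
-- Sketch only: assumes array_sort(anyarray) from the patch under
-- discussion and the "en-x-icu" collation (ICU-enabled build).
-- t_coll and sort_it() are hypothetical names.

-- Same data stored under two different column collations.
DROP TABLE IF EXISTS t_coll;
CREATE TABLE t_coll (
    a text[] COLLATE "C",
    b text[] COLLATE "en-x-icu"
);
INSERT INTO t_coll VALUES ('{b,A,a,B}', '{b,A,a,B}');

-- Two call sites in one query, each with its own input collation.
SELECT array_sort(a) AS sorted_c,
       array_sort(b) AS sorted_icu
FROM t_coll;

-- One call site fed rows with conflicting explicit collations.
-- Whether this is rejected during collation assignment or reaches
-- array_sort() is what the test would establish.
SELECT array_sort(v)
FROM (VALUES (ARRAY['b','A','a'] COLLATE "C"),
             (ARRAY['b','A','a'] COLLATE "en-x-icu")) AS s(v);

-- PL function wrapper called with different input collations in the
-- same query, to check that nothing cached for the first call leaks
-- into the second.
CREATE OR REPLACE FUNCTION sort_it(v text[]) RETURNS text[]
LANGUAGE plpgsql AS $$
BEGIN
    RETURN array_sort(v);
END
$$;

SELECT sort_it(ARRAY['b','A','a'] COLLATE "C"),
       sort_it(ARRAY['b','A','a'] COLLATE "en-x-icu");
```

The same queries could be repeated through a CTE to cover that path as
well.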
--
Michael
