From: | Junwang Zhao <zhjwpku(at)gmail(dot)com> |
---|---|
To: | Michael Paquier <michael(at)paquier(dot)xyz> |
Cc: | jian he <jian(dot)universality(at)gmail(dot)com>, Aleksander Alekseev <aleksander(at)timescale(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "David G(dot) Johnston" <david(dot)g(dot)johnston(at)gmail(dot)com>, Amit Langote <amitlangote09(at)gmail(dot)com>, "andreas(at)proxel(dot)se" <andreas(at)proxel(dot)se>, Robert Haas <robertmhaas(at)gmail(dot)com> |
Subject: | Re: general purpose array_sort |
Date: | 2024-11-07 14:06:04 |
Message-ID: | CAEG8a3+QA+QVnhnATPiSH=QBui+Y5Z-j0NmecSL6yrN8E1dXzA@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Tue, Nov 5, 2024 at 9:13 AM Michael Paquier <michael(at)paquier(dot)xyz> wrote:
>
> On Mon, Nov 04, 2024 at 03:16:35PM +0800, jian he wrote:
> > drop table if exists t;
> > CREATE TABLE t (a int[]);
> > insert into t values ('{1,3}'),('{1,2,3}'),('{11}');
> > insert into t values ('{{1,12}}'), ('{{4,3}}');
> > SELECT array_sort(a) from t;
> >
> > In the above case,
> > tuplesort_begin_datum needs the int type information and int[] type information.
> > otherwise the cached TypeCacheEntry is being used to sort mult-dimension array,
> > which will make the result false.
>
> All these behaviors need more extensive testing.
>
> This brings me an extra question around the caching. Would the
> sorting be able to behave correctly when feeding to a single
> array_sort() context array values that have multiple COLLATE clauses?
> Or merge_collation_state() would be smart enough to make sure that
> collation conflicts never happen to begin with? I am wondering if we
> should worry about multiple VALUES, CTEs, or PL functions where
> array_sort() could be fed into its cache values that lead to
> unpredictible results for some values. This stuff should perhaps have
> more testing around such behaviors, stressing what kind of
> interactions we have between the sorting of multiple values and the
> caching, in the context of a single array_sort() call.
I'm afraid this can not be achieved in my current implementation, a simple
case is:
SELECT array_sort('{foo,bar,null,CCC,Abc,bbc}'::text[]);
{Abc,bar,bbc,CCC,foo,NULL}
SELECT array_sort('{foo,bar,null,CCC,Abc,bbc}'::text[] COLLATE "C");
{Abc,CCC,bar,bbc,foo,NULL}
SELECT array_sort(a COLLATE "C") FROM (VALUES
('{foo,bar,null,CCC,Abc,bbc}'::text[] COLLATE "C"),
('{foo,bar,null,CCC,Abc,bbc}'::text[])) v(a);
{Abc,CCC,bar,bbc,foo,NULL}
{Abc,CCC,bar,bbc,foo,NULL}
Maybe add some documents to specify this?
> --
> Michael
--
Regards
Junwang Zhao
From | Date | Subject | |
---|---|---|---|
Next Message | Peter Geoghegan | 2024-11-07 14:21:09 | Re: Deleting older versions in unique indexes to avoid page splits |
Previous Message | Junwang Zhao | 2024-11-07 13:56:39 | Re: general purpose array_sort |