Re: general purpose array_sort

From: jian he <jian(dot)universality(at)gmail(dot)com>
To: Michael Paquier <michael(at)paquier(dot)xyz>
Cc: Junwang Zhao <zhjwpku(at)gmail(dot)com>, Aleksander Alekseev <aleksander(at)timescale(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "David G(dot) Johnston" <david(dot)g(dot)johnston(at)gmail(dot)com>, Amit Langote <amitlangote09(at)gmail(dot)com>, "andreas(at)proxel(dot)se" <andreas(at)proxel(dot)se>, Robert Haas <robertmhaas(at)gmail(dot)com>
Subject: Re: general purpose array_sort
Date: 2024-11-04 07:16:35
Message-ID: CACJufxGjhNmXvJubV4BWgskjYDBfd5dv-cjQhuy0drDtK_uHdg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, Nov 4, 2024 at 1:46 PM Michael Paquier <michael(at)paquier(dot)xyz> wrote:
>
>
> + typentry = lookup_type_cache(elmtyp, TYPECACHE_LT_OPR);
> + if (!OidIsValid(typentry->lt_opr))
> + ereport(ERROR,
> + (errcode(ERRCODE_UNDEFINED_FUNCTION),
> + errmsg("could not identify ordering operator for type %s",
> + format_type_be(elmtyp))));
>
> The patch introduces two error paths based on the fact that ordering
> operators could not be found depending on a data type that lacks the
> ordering operator and the array ordering operator part. It is right
> to issue an error if these are lacking, like the various stats paths.
> Should we have some regression tests with specific data types for
> these errors, though? The stats paths don't care much about these
> error cases, but it does not mean that we should not care about them.
> In short, let's have negative test coverage if we can.
>
select distinct oprleft::regtype from pg_operator where oprname = '='
and oprleft = oprright
except all
select distinct oprleft::regtype from pg_operator where oprname = '<'
and oprleft = oprright;
returns

hstore
cid
aclitem
xid
line

simple tests case using xid data type would be

SELECT array_sort('{{1,2,3}}'::xid[]);

> +typedef struct ArraySortCachedInfo
> +{
> + TypeCacheEntry *typentry;
> + TypeCacheEntry *array_typentry;
> +} ArraySortCachedInfo;
>
> Let's put that at the top of the file, with a comment about how it
> links to array_sort() for the caching with fn_extra. Let's also
> document the meaning of the fields.
>
> FWIW, I am confused by this implementation, where you have to allocate
> the two TypeCacheEntry because of the fact that you have to deal with
> the 1-dimension case and the multi-dimension case. In the context of
> a single function call, why do you need both typentry and
> array_typentry, actually? Wouldn't it be enough to use one typentry
> that points to the typcache, meaning that you don't really need to use
> the extra business with fn_mcxt, no? If you require both (because I
> may be wrong), perhaps you should have a regression test that's able
> to break when removing array_typentry, changing the code to only rely
> on typentry. Note: I have just removed array_typentry in a quick
> test, current coverage was happy about it. Feel free to prove me
> wrong.
>

drop table if exists t;
CREATE TABLE t (a int[]);
insert into t values ('{1,3}'),('{1,2,3}'),('{11}');
insert into t values ('{{1,12}}'), ('{{4,3}}');
SELECT array_sort(a) from t;

In the above case,
tuplesort_begin_datum needs the int type information and int[] type information.
otherwise the cached TypeCacheEntry is being used to sort mult-dimension array,
which will make the result false.

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Michael Paquier 2024-11-04 07:25:00 Re: Clear padding in PgStat_HashKey keys
Previous Message Michael Paquier 2024-11-04 06:13:48 Re: Allowing pg_recvlogical to create temporary replication slots