From: | jian he <jian(dot)universality(at)gmail(dot)com> |
---|---|
To: | Michael Paquier <michael(at)paquier(dot)xyz> |
Cc: | Junwang Zhao <zhjwpku(at)gmail(dot)com>, Aleksander Alekseev <aleksander(at)timescale(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "David G(dot) Johnston" <david(dot)g(dot)johnston(at)gmail(dot)com>, Amit Langote <amitlangote09(at)gmail(dot)com>, "andreas(at)proxel(dot)se" <andreas(at)proxel(dot)se>, Robert Haas <robertmhaas(at)gmail(dot)com> |
Subject: | Re: general purpose array_sort |
Date: | 2024-11-15 09:05:00 |
Message-ID: | CACJufxGGka_u_pyB=AntZOQTLEfq7VnRG31C4Nk8r7WMEoS6CA@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Fri, Nov 8, 2024 at 8:52 AM Michael Paquier <michael(at)paquier(dot)xyz> wrote:
>
> I am wondering if there are more fancy cases where the saved cache
> could force a state that would lead to puzzling results, say with
> different collations that should be applied. I'd recommend to
> research that more, to reflect that in the docs and to add tests that
> show what we should expect in these cases within 0001 because this
> new function is mimicking in the context of a function execution
> multiple query clauses where restrictions are applied when analyzing
> the query, close to the parser.
>
> For example, UNION and UNION ALL require a common collation when
> processing a set of expressions related to them, which would be OK.
> Perhaps I lack some imagination to be able to break things.
> --
We had 3 error occurrences of
ERROR: could not determine which collation to use for string comparison
in collate.linux.utf8.out.
one is UNION ALL, another two is do comparison with two text arguments.
here array_sort only takes one argument, there is not that much place
to go wrong?
potential misbehavior would be only about UNION ALL?
UNION ALL for two tables, for collation, we can both implicit; both
explicit' one implicit,one explicit.
if both explicit, then it will error out quite easily.
if one side is explicit, another side explicitly, then we use
explicitly, which is what we expected.
the trick is that both are implicit.
drop table t1,t2;
create table t1(a int, b text[] COLLATE "C");
create table t2(a int, b text[] COLLATE case_sensitive);
insert into t1 values (1, '{foo,bar,null,CCC,Abc,bbc}'::text[]);
insert into t2 values (2, '{foo,bar,null,CCC,Abc,bbc}'::text[]);
create domain dtxt as text[] collate case_insensitive;
CREATE OR REPLACE FUNCTION mytxt_coll(x text[]) RETURNS dtxt LANGUAGE
plpgsql AS $$
declare
xx text[] COLLATE case_insensitive;
begin
xx := x;
return xx collate case_insensitive;
end
$$;
--these three fail.
select array_sort(b) from (select b from t1 union all select b from t2) sub;
select array_sort(b) from (select mytxt_coll(b) as b from t2 union all
select b from t1 ) sub;
select array_sort(b) from (select b from
mytxt_coll('{foo,bar,null,CCC,Abc,bbc}'::text[] collate
case_insensitive) f(b) union all select b from t1) sub;
-----
select array_sort(b) from (select b from t1 union all select b from
mytxt_coll('{foo,bar,null,CCC,Abc,bbc}'::text[]) f(b)) sub;
select array_sort(b) from (select b from
mytxt_coll('{foo,bar,null,CCC,Abc,bbc}'::text[]) f(b) union all select
b from t1 ) sub;
these two query outputs are the same, which is what we expected per
quote from manual:
<<>>
otherwise, all input expressions must have the same implicit collation
derivation or the default collation.
If any non-default collation is present, that is the result of the
collation combination.
Otherwise, the result is the default collation.
<<>>
https://www.postgresql.org/docs/current/collation.html#COLLATION-CONCEPTS
also we have varstr_sortsupport->check_collation_set to make sure we
have a single valid collation for array_sort.
overall, I think the current implementation works fine.
From | Date | Subject | |
---|---|---|---|
Next Message | Bertrand Drouvot | 2024-11-15 09:18:50 | Re: Potential ABI breakage in upcoming minor releases |
Previous Message | Aleksander Alekseev | 2024-11-15 08:56:45 | Re: Potential ABI breakage in upcoming minor releases |