From: | David Rowley <dgrowleyml(at)gmail(dot)com> |
---|---|
To: | Andrei Lepikhov <lepihov(at)gmail(dot)com> |
Cc: | PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Partition pruning on parameters grouped into an array does not prune properly |
Date: | 2025-03-27 00:58:56 |
Message-ID: | CAApHDvoWxu0xO-XXRkrUfzqevm5yryNFSBWoUUMsV5Rwa6fKtQ@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Thu, 27 Mar 2025 at 04:19, Andrei Lepikhov <lepihov(at)gmail(dot)com> wrote:
> But if we partition on HASH(x,y) it is not working (see
> incorrect-pruning-example.sql):
>
> PREPARE test2 (int,int) AS
> SELECT 1 FROM array_prune
> WHERE id1 = ANY(ARRAY[$1]) AND id2 = ANY(ARRAY[$2]);
> EXPLAIN (COSTS OFF) EXECUTE test2(1,-1);
>
> Append
> -> Seq Scan on array_prune_t0 array_prune_1
> Filter: ((id1 = ANY (ARRAY[$1])) AND (id2 = ANY (ARRAY[$2])))
> -> Seq Scan on array_prune_t1 array_prune_2
> Filter: ((id1 = ANY (ARRAY[$1])) AND (id2 = ANY (ARRAY[$2])))
It is a bug. This is down to how match_clause_to_partition_key()
handles ScalarArrayOpExpr. To save some complexity in the handling of
ScalarArrayOpExpr, these get transformed into OpExprs, one for each
item in the ScalarArrayOpExpr. Look for the call to make_opclause()
in match_clause_to_partition_key(). Just a few lines down, you see
that we recursively call gen_partprune_steps_internal() to pass down
the OpExprs that we just generated. The problem is that the recursive
call only contains the OpExprs generated for one of the
ScalarArrayOpExpr, gen_prune_steps_from_opexps() requires equality
quals (or at least an key IS NULL qual) for all partitioned keys for
hash partitioning, otherwise it'll bail out on the following:
if (part_scheme->strategy == PARTITION_STRATEGY_HASH &&
clauselist == NIL && !bms_is_member(i, nullkeys))
return NIL;
I wonder if we need to redesign this to not do that recursive
processing and instead have it so match_clause_to_partition_key() can
generate multiple PartClauseInfos. If we've matched to the
ScalarArrayOpExpr then I think each generated PartClauseInfo should
have the same PartClauseMatchStatus. That would also get rid of the
(kinda silly) overhead we have of having to match the
ScalarArrayOpExpr to the partition key, then generating OpExprs and
having to match those again, even though we know they will match.
I suspect the fix for this might be a bit invasive to backpatch. Maybe
it's something we can give a bit more clear thought to after the
freeze is over.
David
From | Date | Subject | |
---|---|---|---|
Next Message | Andres Freund | 2025-03-27 01:07:40 | Re: AIO v2.5 |
Previous Message | Melanie Plageman | 2025-03-27 00:48:21 | Re: read stream on amcheck |