Re: Eager aggregation, take 3

From: Richard Guo <guofenglinux(at)gmail(dot)com>
To: Andy Fan <zhihuifan1213(at)163(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: Eager aggregation, take 3
Date: 2024-06-13 08:07:51
Message-ID: CAMbWs49U8Sddx_fGszPdvA3jp_nheynxaqm5Y4NqMV21VBYAuQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, May 20, 2024 at 4:12 PM Richard Guo <guofenglinux(at)gmail(dot)com> wrote:
> Another rebase is needed after d1d286d83c. Also I realized that the
> partially_grouped_rel generated by eager aggregation might be dummy,
> such as in query:
>
> select count(t2.c) from t t1 join t t2 on t1.b = t2.b where false group by t1.a;
>
> If somehow we choose this dummy path with a Finalize Agg Path on top of
> it as the final cheapest path (a very rare case), we would encounter the
> "Aggref found in non-Agg plan node" error. The v7 patch fixes this
> issue.

I spent some time testing this patchset and found a few more issues.

One issue is that partially-grouped partial paths may have already been
generated in the process of building up the grouped join relations by
eager aggregation, in which case the partially_grouped_rel would contain
valid partial paths by the time we reach create_partial_grouping_paths.
If we subsequently find that parallelism is not possible for
partially_grouped_rel, we need to drop these partial paths; otherwise we
risk encountering Assert(subpath->parallel_safe) when creating gather /
gather merge path. This issue can be reproduced with the query below on
v7 patch.

create function parallel_restricted_func(a int) returns int as
$$ begin return a; end; $$ parallel restricted language plpgsql;
create table t (a int, b int, c int) with (parallel_workers = 2);
set enable_eager_aggregate to on;

explain (costs off)
select parallel_restricted_func(1) * count(t2.c)
from t t1, t t2 where t1.b = t2.b group by t2.c;

Another issue I found is that when we check to see whether a given Var
appears only within Aggrefs, we need to account for havingQual in
addition to targetlist; otherwise there's a risk of omitting this Var
from the targetlist of the partial Agg node, leading to 'ERROR: variable
not found in subplan target list'. This error can be reproduced with
the query below on v7.

create table t (a int primary key, b int, c int);
set enable_eager_aggregate to on;

explain (costs off)
select count(*) from t t1, t t2 group by t1.a having min(t1.b) < t1.b;
ERROR: variable not found in subplan target list

A third issue I found is that with v7 we might push the Partial Agg to
the nullable side of an outer join, which is not correct. This happens
because when determining whether a Partial Agg can be pushed down to a
relation, the v7 patchset indeed checks if the aggregate expressions can
be evaluated at this relation level. However, it overlooks checking the
grouping expressions. The grouping expressions can originate from two
sources: the original GROUP BY clauses, or constructed from join
conditions. In either case, we must verify that the grouping
expressions cannot be nulled by outer joins that are above the current
relation, otherwise the Partial Agg cannot be pushed down to this rel.

Hence here is the v8 patchset, with fixes for all the above issues.

Thanks
Richard

Attachment Content-Type Size
v8-0001-Introduce-RelInfoList-structure.patch application/octet-stream 14.3 KB
v8-0002-Introduce-RelAggInfo-structure-to-store-info-for-grouped-paths.patch application/octet-stream 7.8 KB
v8-0003-Set-up-for-eager-aggregation-by-collecting-needed-infos.patch application/octet-stream 14.3 KB
v8-0004-Implement-functions-that-create-RelAggInfos-if-applicable.patch application/octet-stream 29.7 KB
v8-0005-Implement-functions-that-generate-paths-for-grouped-relations.patch application/octet-stream 13.1 KB
v8-0006-Build-grouped-relations-out-of-base-relations.patch application/octet-stream 9.0 KB
v8-0007-Build-grouped-relations-out-of-join-relations.patch application/octet-stream 27.0 KB
v8-0008-Add-test-cases.patch application/octet-stream 71.5 KB
v8-0009-Add-README.patch application/octet-stream 4.8 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Ashutosh Sharma 2024-06-13 08:09:01 Re: Addressing SECURITY DEFINER Function Vulnerabilities in PostgreSQL Extensions
Previous Message Amit Langote 2024-06-13 08:04:29 Re: Incorrect matching of sql/json PASSING variable names