Re: Wrong results with grouping sets

From: Richard Guo <guofenglinux(at)gmail(dot)com>
To: Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Wrong results with grouping sets
Date: 2024-07-06 01:26:49
Message-ID: CAMbWs49dAwHo4i03B94T0p9N-JL8pG0m8m1-DViT3MHW_1uoQg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, Jul 4, 2024 at 6:02 PM Ashutosh Bapat
<ashutosh(dot)bapat(dot)oss(at)gmail(dot)com> wrote:
> On Mon, Jul 1, 2024 at 1:59 PM Richard Guo <guofenglinux(at)gmail(dot)com> wrote:
> > Here is an updated version of this patchset. I've run pgindent for it,
> > and also tweaked the commit messages a bit.
> >
> > In principle, 0001 can be backpatched to all supported versions to fix
> > the cases where there are subqueries in the grouping expressions; 0002
> > can be backpatched to 16 where we have the nullingrels stuff. But both
> > patches seem to be quite invasive. I'm not sure if we want to backpatch
> > them to stable branches. Any thoughts about backpatching?
>
> I don't have any specific thoughts on backpatching, but I have started
> reviewing the patches.

Thanks for reviewing this patchset!

> The first patch in the set adds a new RTEKind for GROUP. From prologue
> of RangeTblEntry structure I can not understand what an RTE represents
> especially when the RTE represents something other than a FROM clause
> item.
> ```
> * This is because we only need the RTE to deal with SQL features
> * like outer joins and join-output-column aliasing.) Other special
> * RTE types also exist, as indicated by RTEKind.
> ```
> I can not use this description to decide whether a GROUP BY construct
> should have an RTE for itself or not. It looks like the patch adds a
> new RTE (kind) here so that its rtindex can be used to differentiate
> between a and b from VALUES clause and those from the GroupingSet
> result in the query mentioned in the first email in this thread. But I
> don't see any discussion of other alternatives. For example, how about
> infrastructure in EC to tell which stages this EC is valid for/upto? I
> see Tom suggesting use of RTE instead of changing EC but I don't see
> why that's better. We do mark a RestrictInfo with relids above which
> it can be computed. Similarly we assess validity of EC by stages or
> relations being computed. That might open some opportunities for using
> broken ECs? We are almost reimplementing parts of the GROUPING set
> feature, so may be it's worth spending time thinking about it.

The reason why we need a new RTE for the grouping step is to address
cases where there are subqueries in the grouping expressions. In such
cases, each of these subqueries in the targetlist and HAVING clause is
expanded into distinct SubPlan nodes. Only one of these SubPlan nodes
would be converted to reference to the grouping key column output by
the Agg node; others would have to get evaluated afresh, and might not
go to NULL when they are supposed to. I do not think this can be
addressed by changing ECs.

> Assuming new RTEkind is the right approach, I am wondering whether
> there are other things that should have been represented by RTE for
> the same reason. For example, a HAVING clause changes the
> characteristics of results by introducing new constraints on the
> aggregated results. Should that have an RTE by itself? Will the
> RTEKind introduced by this patch cover HAVING clause as well?

AFAIU, HAVING clauses are just quals applied to the grouped rows after
groups and aggregates are computed. I cannot see why and how to add a
new RTE for HAVING.

> ```
> explain select sum(a), sum(b), stddev(a + b) from (values (1, 1), (2,
> 2)) as t(a, b) group by a, b having sum(a) = sum(b) order by 1, 2;
> QUERY PLAN
> -------------------------------------------------------------------------
> Sort (cost=0.10..0.10 rows=1 width=56)
> Sort Key: (sum("*VALUES*".column1)), (sum("*VALUES*".column2))
> -> HashAggregate (cost=0.06..0.09 rows=1 width=56)
> Group Key: "*VALUES*".column1, "*VALUES*".column2
> Filter: (sum("*VALUES*".column1) = sum("*VALUES*".column2))
> -> Values Scan on "*VALUES*" (cost=0.00..0.03 rows=2 width=8)
> (6 rows)
> ```
> Sort Key can be just (sum("*VALUES*".column1)) instead of both
> (sum("*VALUES*".column1)), (sum("*VALUES*".column2)) because of HAVING
> clause?

This looks like an optimization that can be achieved by hacking around
ECs. I'm not sure. But I think adding new RTEs does not help here.

> Some code level random comments
> 1.
> ```
> if (rte->rtekind == RTE_GROUP)
> {
> es->rtable_size--;
> break;
> ```
> because of the variable name, it would be interpreted as the size of
> es->rtable and will be expected to be the same as
> list_length(es->rtable) which it is not. The comment at the member
> declaration won't help much for a quick reader. All that variable is
> doing is to tell us whether to use alias as prefix or not;
> `useprefix = es->rtable_size > 1;` OR useprefix = (es->rtable_size > 1
> || es->verbose);.
> Instead of rtable_size, we could let the new member track the fact
> whether there are multiple aliases in the query (requiring multiple
> prefixes) instead of size of rtable. However, the fact that the GROUP
> RTE requires special handling indicates that the new RTEKind doesn't
> quite fit the rest of the set. No other RTE, even if outside FROM
> clause, required this special handling.

AFAIU we want to print prefixes on Vars when there are more than one
RTE entries to indicate which column is from which RTE entry. If
there is only one RTE (and not verbose), we try to avoid the prefixes.
This patch adds a new dummy RTE, resulting in plans that previously
had one RTE now having two and starting to print prefixes. This has
caused a lot of plan diffs in regression tests. That's why this patch
has to hack explain.c and ruleutils.c to make the varprefix stuff work
as it did before.

But I do not think this is alone for the new RTE. Consider

explain (costs off)
select sum(a) from (select * from t) having sum(a) = 1;
QUERY PLAN
--------------------------
Aggregate
Filter: (sum(t.a) = 1)
-> Seq Scan on t
(3 rows)

BTW, not related to the discussion here, I noticed an inconsistency
regarding the varprefix in the qual and targetlist. Look at:

explain (verbose, costs off)
select sum(a) from t having sum(a) = 1;
QUERY PLAN
----------------------------
Aggregate
Output: sum(a)
Filter: (sum(t.a) = 1)
-> Seq Scan on public.t
Output: a
(5 rows)

In the 'Filter' we add the prefix while in the 'Output' we do not.
Does anyone think this is something worth investigating?

> 2. expandRecordVariable: The comment below the change should be
> updated to explain why an output of GROUPing can not have RECORD or at
> least mention GROUPING there.

Thanks. Will add some comments here.

> 3. I see code like below in get_eclass_for_sort_expr() and
> mark_rels_nulled_by_join()
> ```
> /* ignore GROUP RTE */
> if (i == root->group_rtindex)
> continue;
> ```
> I assume that rel for this RTE index would be NULL, so "if" block just
> below this code would get executed. I think we should just change
> Assert() in that code block rather than adding a new "if" block to
> avoid confusion.

Actually I initially coded it as you suggested, and then moved the
check for the RTE_GROUP RTE out of the 'if' block later, in order to
maintain separate logic for GROUP RTE and outer joins. I'm not quite
sure which is better.

> 4. Looking at parse_clause.c most (if not all) addRangeTableEntry*
> function calls are from transform* functions. On those lines, I
> expected addRangeTableEntryForGroup() to be called from
> transformGroupClause(). Why are we calling
> addRangeTableEntryForGroup() from parseCheckAggregates()?

I think this is the most handy place to add the RTE_GROUP RTE, as the
join_flattened grouping expressions are available here.

> 5. In the parseCheckAggregates, we are replacing expressions from
> targetlist and havingQual with Vars pointing to GROUP RTE. But we are
> not doing that to sortClause, the remaining SQL construct. That's
> because sortClause is just a list of entries pointing back to
> targetlist. So there's nothing to change there. Am I right?

Well, it's not about that. Actually groupClause is also 'a list of
entries pointing back to targetlist'. The primary reason is that the
grouping step may result in some grouping expressions being set to
NULL, whereas the sorting step does not have this behavior.

> 6. I think ParseState::p_grouping_nsitem should be collocated with
> other ParseNamespaceItem members or lists in ParseState. I think it
> serves a similar purpose as them. Similarly PlannerInfo::group_rtindex
> should be placed next to outer_join_rels?

I agree that ParseState.p_grouping_nsitem should be moved to a more
proper place, and we should mention it in the comment for ParseState
too. But I'm not sure about the root.group_rtindex. I will give it
another thought later.

> 7. Do we need RangeTblEntry::groupexprs as a separate member? They are
> the same as GROUP BY or GROUPING SET expressions. So the list can be
> constructed from groupClause whenever required. Do we need to maintain
> the list separately? I am comparing with other RTEs, say Subquery RTE.
> We don't copy all the targetlist expressions from subquery to
> subquery's RTE. I noticed that groupexprs are being treated on lines
> similar to joinaliasvars. But they are somewhat different. The latter
> is a unified representation of columns of joining relations different
> from those columns and hence needs a new representation. That doesn't
> seem to be the case with RangeTblEntry::groupexpr.

We need to preprocess the grouping expressions first and then
substitute them back into the targetlist and havingQual. I don't
think this can be achieved without keeping groupexprs as a separate
member.

> 8. The change in process_sublinks_mutator() appears to be related to
> the fact that GROUPING() may have subqueries which were not being
> handled earlier. That change seems to be independent of the bug being
> fixed here. Am I right? If yes, having those changes in a separate
> patch will help.

No, I don't think so. Without this patch we should never see a
SubPlan/AlternativeSubPlan expression in process_sublinks_mutator,
because this is where SubPlans are created.

Thanks
Richard

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message David Rowley 2024-07-06 02:05:46 Re: Use generation memory context for tuplestore.c
Previous Message Peter Geoghegan 2024-07-06 00:57:58 Re: Adding skip scan (including MDAM style range skip scan) to nbtree