Re: Eager aggregation, take 3

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Richard Guo <guofenglinux(at)gmail(dot)com>
Cc: Tender Wang <tndrwang(at)gmail(dot)com>, Paul George <p(dot)a(dot)george19(at)gmail(dot)com>, Andy Fan <zhihuifan1213(at)163(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: Eager aggregation, take 3
Date: 2024-10-29 20:05:50
Message-ID: CA+Tgmob0q7bRbsFTVDMjxHE6zA4uDQLQa-s0CtwUw49V53UL_A@mail.gmail.com

On Tue, Sep 24, 2024 at 11:20 PM Richard Guo <guofenglinux(at)gmail(dot)com> wrote:
> The reason is that it is very tricky to set the size estimates for a
> grouped join relation. For a non-grouped join relation, we know that
> all its paths have the same rowcount estimate (well, in theory). But
> this is not true for a grouped join relation. Suppose we have a
> grouped join relation for t1/t2 join. There might be two paths for
> it:

What exactly do you mean by "well, in theory" here? My understanding
of how things work today is that every relation is supposed to produce
a specific set of rows and every unparameterized path must produce
that set of rows. The order of the rows may vary but the set of rows
may not. With your proposed design here, that's no longer true.
Instead, what's promised is that the row sets will become equivalent
after a later FinalizeAggregate step. In a sense, this is like
parameterization or partial paths. Suppose I have:

SELECT * FROM foo, bar WHERE foo.x = bar.x;

While every unparameterized path for bar has the same row count,
there's also the possibility of performing an index scan on bar.x
parameterized by foo.x, and that path will have a far lower row count
than the unparameterized paths. Instead of producing all the same rows
as every other path, the parameterized path promises only that if run
repeatedly, with all relevant values of foo.x, you'll eventually get
all the same rows you would have gotten from the unparameterized path.
Because of this difference, parameterized paths need special handling
in many different parts of the code.
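
Concretely, in a nested loop the parameterized scan is effectively
re-executed for each outer row as something like (with $1 standing in
for the current value of foo.x):

    SELECT * FROM bar WHERE bar.x = $1;

Only the union of all of those executions corresponds to the row set
that an unparameterized path for bar would produce.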

And the same thing is true of partial paths. They also do not promise
to generate all the same rows -- instead, they promise that when run
simultaneously across multiple workers, the total set of rows returned
across all invocations will be equal to what a normal path would have
produced. Here again, there's a need for special handling because
these paths behave differently than standard paths.
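
Conceptually, with two workers, a partial path for bar behaves
something like:

    SELECT * FROM bar WHERE <worker 0's share of the blocks>
    UNION ALL
    SELECT * FROM bar WHERE <worker 1's share of the blocks>

Neither branch by itself is the relation; only the UNION ALL of
everything the workers return is.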

I think what you're doing here is roughly analogous to these two
cases, and closest to the parameterized path case. Instead
of having a path for a relation which is parameterized by some input
parameter, you have a path for a relation, say bar, which is partially
aggregated by some grouping column. But there's no guarantee of how
much partial aggregation has been done. In your example, PartialAgg(t1
JOIN t2) is "more aggregated" than t1 JOIN PartialAgg(t2), so the row
counts are different. This makes me quite nervous. You can't compare a
parameterized path to an unparameterized path, but you can compare it
to another parameterized path if the parameterizations are the same.
You can't compare a partial path to a non-partial path, but you can
compare partial paths to each other. But with this work,
unparameterized, non-partial paths in the same RelOptInfo don't seem
like they are truly comparable. Maybe that's OK, but I'm not sure that
it isn't going to break other things.
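
To make that concrete, suppose (with a made-up schema, just for
illustration) the query is:

    SELECT t1.a, sum(t2.b)
      FROM t1 JOIN t2 ON t1.x = t2.x
     GROUP BY t1.a;

Then, if I'm understanding the design correctly, the two paths for
the grouped join relation correspond roughly to these hand-written
rewrites, each of which still needs a FinalizeAggregate on top:

    -- t1 JOIN PartialAgg(t2): one row per (t1 row, matching t2.x group)
    SELECT t1.a, p.partial_b
      FROM t1 JOIN (SELECT x, sum(b) AS partial_b
                      FROM t2 GROUP BY x) p ON t1.x = p.x;

    -- PartialAgg(t1 JOIN t2): one row per group of the join output
    SELECT t1.a, sum(t2.b) AS partial_b
      FROM t1 JOIN t2 ON t1.x = t2.x
     GROUP BY t1.a;

Those two intermediate row counts can obviously be very different.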

You might for example imagine a design where PartialAgg(t1 JOIN t2)
and t1 JOIN PartialAgg(t2) get separate RelOptInfos. After all, there
are probably multiple ways to generate paths for each of those things,
and paths in each category can be compared to each other apples to
apples. What's less clear is whether it's fair to compare across the
two categories, and how many assumptions will be broken by doing so.
I'm not sure that it's right to have separate RelOptInfos; we
definitely don't want to create more RelOptInfos than necessary. At
the same time, if we mix together all of those paths into a single
RelOptInfo, we need to be confident that we're neither going to break
anything nor introduce too many special cases into hot code paths. For
instance, set_joinpath_size() represents an unwelcome complexity
increase that could impact performance generally, even apart from the
cases this patch intends to handle.

It's tempting to wonder if there's some way that we can avoid
generating paths for both PartialAgg(t1 JOIN t2) and t1 JOIN
PartialAgg(t2). Either the former has lower cardinality, or the latter
does. It seems likely that the lower-cardinality set is the winning
strategy. Even if the path has higher cost to generate, we save work
at every subsequent join level and at the final aggregation step. Are
there counterexamples where it's better to use a path from the
higher-cardinality set?

By the way, the work of figuring out what target list should be
produced by partial grouping is done by init_grouping_targets(), but
the comments seem to take it for granted that I know what result we're
trying to produce, and I don't. I think some more high-level
explanation of the goals of this code would be useful. It seems to me
that if I'm looking at a path for an ungrouped relation and it
produces a certain target list, then every column of that target list
is needed somewhere. If those columns are group keys, cool: we pass
those through. If they're inputs to the aggregates, cool: we feed them
to the aggregates. But if they are neither, then what?

In the patch,
you either group on those columns or add them to the
possibly_dependent list depending on the result of
is_var_needed_by_join(). I can believe that there are some cases where
we can group on such columns and others where we can't, but I find it
difficult to believe that this test reliably distinguishes between
those two cases. If it does, I don't understand why it does. Don't I
need to know something about how those columns are used in the upper
joins? Like, if those columns are connected by a chain of
binary-equality operators back to the user's choice of grouping
columns, that sounds good, but this test doesn't distinguish between
that case and an upper join on the < operator.
create_grouping_expr_infos() does reason based on whether there's an
equal-image operator available, but AIUI that's only reasoning about
the group columns the user mentioned, not the sort of implicit
grouping columns that init_grouping_targets() is creating.
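
As a concrete (made-up) example of the sort of thing I'm worried
about:

    SELECT t1.a, sum(t3.c)
      FROM t1
      JOIN t2 ON t1.x = t2.x
      JOIN t3 ON t2.y = t3.y
     GROUP BY t1.a;

Here t2.y is neither a grouping column the user asked for nor an
aggregate input; it's only needed by the upper join against t3, so it
falls into the code path discussed above. Whether it's safe to group
on it seems like it ought to depend on how that upper join uses it --
plain equality looks very different from a condition like
t2.y < t3.y -- but I don't see where that distinction gets made.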

I spent a lot of time thinking today about what makes it safe to push
down grouping or not. I'm not sure that I have a solid answer to that
question even yet. But it seems to me that there are at least two
cases that clearly won't fly. One is the case where the partial
aggregation would merge rows that need to be included in the final
aggregation with rows that should be filtered out. If the
partially-grouped relation simply has a filter qual, that's fine,
because it will be evaluated before the aggregation. But there might
be a qual that has to be evaluated later, either because (a) it
involves another rel, like this_rel.x + that_rel.y > 10 or (b) it
appears in the ON clause of an outer join and thus needs to be
deferred to the level of the OJ, e.g. A LEFT JOIN B ON a.x = b.x AND
b.y = 42. I wonder if you can comment on how these cases are handled.
Perhaps this coding around functional dependencies has something to do
with it, but it isn't clear to me.
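
To spell that out with made-up queries: case (a) is something like

    SELECT a.id, sum(b.v)
      FROM a JOIN b ON a.x = b.x
     WHERE a.y + b.y > 10
     GROUP BY a.id;

where partially aggregating b on b.x before the join would merge b
rows with different values of b.y, after which the qual can no longer
be applied row by row. Case (b) is something like

    SELECT a.id, count(b.x)
      FROM a LEFT JOIN b ON a.x = b.x AND b.y = 42
     GROUP BY a.id;

where b.y = 42 has to be checked at the join, so a partial count of b
grouped only on b.x would mix qualifying and non-qualifying rows.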

Thanks,

--
Robert Haas
EDB: http://www.enterprisedb.com
