From: | David Rowley <dgrowleyml(at)gmail(dot)com> |
---|---|
To: | Richard Guo <guofenglinux(at)gmail(dot)com> |
Cc: | Justin Pryzby <pryzby(at)telsasoft(dot)com>, pgsql-hackers(at)lists(dot)postgresql(dot)org |
Subject: | Re: pg16: XX000: could not find pathkey item to sort |
Date: | 2023-10-05 06:26:24 |
Message-ID: | CAApHDvqoMf_kxQKON+FLQh8DbMKe62R=OdrZoenb6xDYWmQVUQ@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Tue, 3 Oct 2023 at 20:16, David Rowley <dgrowleyml(at)gmail(dot)com> wrote:
> I wonder if the attached patch is too much of a special case fix. I
> guess from the lack of complaints previously that there are no other
> cases where we could possibly have pathkeys that belong to columns
> that are aggregated. I've not gone to much effort to see if I can
> craft a case that hits this without the ORDER BY/DISTINCT aggregate
> optimisation, however.
I spent more time on this today. I'd been wondering if there was any
reason why create_agg_path() would receive a subpath with pathkeys
that were anything but the PlannerInfo's group_pathkeys. I mean, how
could we do Group Aggregate if it wasn't? I wondered if grouping sets
might change that, but it seems the group_pathkeys will be set to the
initial grouping set.
Given that, it would seem it's safe to just trim off any pathkey that
was added to the group_pathkeys by
adjust_group_pathkeys_for_groupagg().
PlannerInfo.num_groupby_pathkeys marks the number of pathkeys that
existed in group_pathkeys before adjust_group_pathkeys_for_groupagg()
made any additions, so we can just trim the list length back to that.
I've done this in the attached patch. I also considered if it was
worth adding a regression test for this and I concluded that there are
better ways to test for this and considered if we should add some code
to createplan.c to check that all Path pathkeys have corresponding
items in the PathTarget. I've included an additional patch which adds
some code in USE_ASSERT_CHECKING builds to verify this. Without the
fix it's simple enough to trigger this with a query such as:
select two,count(distinct four) from tenk1 group by two order by two;
Without the fix the additional asserts cause the regression tests to
fail, but with the fix everything passes.
Justin's case is quite an obscure way to hit this as it requires
partitionwise aggregation plus a single partition so that the Append
is removed due to only having a single subplan in setrefs.c. If there
had been 2 partitions, then the AppendPath wouldn't have inherited the
subpath's pathkeys per code at the end of create_append_path().
So in short, I propose the attached fix without any regression tests
because I feel that any regression test would just mark that there was
a big in create_agg_path() and not really help with ensuring we don't
end up with some similar problem in the future.
I have some concerns that the assert_pathkeys_in_target() function
might be a little heavyweight for USE_ASSERT_CHECKING builds. So I'm
not proposing to commit that without further discussion.
Does anyone feel differently?
If not, I plan to push the attached
strip_aggregate_pathkeys_from_aggpaths_v2.patch early next week.
David
Attachment | Content-Type | Size |
---|---|---|
strip_aggregated_pathkeys_from_aggpaths_v2.patch | application/octet-stream | 1.3 KB |
assert_pathkeys_in_target.patch | application/octet-stream | 1.4 KB |
From | Date | Subject | |
---|---|---|---|
Next Message | Drouvot, Bertrand | 2023-10-05 06:46:48 | Re: Add a new BGWORKER_BYPASS_ROLELOGINCHECK flag |
Previous Message | Gurjeet Singh | 2023-10-05 05:41:15 | Re: [PoC/RFC] Multiple passwords, interval expirations |