From: | Richard Guo <guofenglinux(at)gmail(dot)com> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Propagate pathkeys from CTEs up to the outer query |
Date: | 2024-03-27 06:52:58 |
Message-ID: | CAMbWs49qUFhW1=Mm9XSVeZn5M7aUgGJPqWGQbf9nw6PnoObx9w@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Wed, Mar 27, 2024 at 1:20 AM Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Richard Guo <guofenglinux(at)gmail(dot)com> writes:
> > I agree with your points. Previously I was thinking that CTEs were the
> > only scenario where we needed to remember the best path and only
> > required the best path's pathkeys. However, considering potential
> > future use cases as you mentioned, I concur that having a per-subplan
> > list of paths would be more future-proof. Please see attached v4 patch.
>
> Hm, well, you didn't actually fill in the paths for the other
> subqueries. I agree that it's not worth doing so in
> SS_make_initplan_from_plan, but a comment explaining that decision
> seems in order. Also, there's nothing stopping us from saving the
> path for subplans made in build_subplan, except adding a parameter
> to pass them down. So I did that, and made a couple other cosmetic
> changes, and pushed it.
Thanks for the adjustments and pushing!
> That's not the fault of anything we did here; the IndexOnlyScan path
> in the subquery is in fact not marked with any pathkeys, even though
> clearly its result is sorted. I believe that's an intentional
> decision from way way back, that pathkeys only correspond to orderings
> that are of interest in the current query level. "select unique1 from
> tenk1 b order by unique1" has an interest in ordering by unique1,
> but "select unique1 from tenk1 b" does not, so it's choosing that
> path strictly according to cost. Not generating pathkeys in such a
> query saves a few cycles and ensures that we won't improperly prefer
> a path on the basis of pathkeys if it hasn't got a cost advantage.
> So I'm quite hesitant to muck with that old decision, especially in
> the waning days of a development cycle, but the results do feel a
> little strange here.
Yeah, I also noticed this while writing the test case. That's why I
added 'order by unique1' explicitly in the CTE subquery. This also
happens to subquery RTEs, such as
explain (costs off)
select * from (select unique1 from tenk1 offset 0) order by unique1;
QUERY PLAN
----------------------------------------------------
Sort
Sort Key: tenk1.unique1
-> Index Only Scan using tenk1_unique1 on tenk1
(3 rows)
I agree that mucking with the old decision might not be a good idea. In
addition, for a MATERIALIZED CTE, generating pathkeys according to the
outer query's ordering requirements breaks the idea of optimization
fence: the outer query should not affect the plan chosen for the CTE
query.
Thanks
Richard
From | Date | Subject | |
---|---|---|---|
Next Message | Bharath Rupireddy | 2024-03-27 06:58:06 | Re: pgsql: Track last_inactive_time in pg_replication_slots. |
Previous Message | John Naylor | 2024-03-27 06:44:10 | Re: Change GUC hashtable to use simplehash? |