Re: should we have a fast-path planning for OLTP starjoins?

From: Tomas Vondra <tomas(at)vondra(dot)me>
To: Richard Guo <guofenglinux(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Jeff Davis <pgsql(at)j-davis(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: should we have a fast-path planning for OLTP starjoins?
Date: 2025-02-10 01:32:47
Message-ID: efe843dc-70ce-4938-9b47-eb5cc40a2442@vondra.me
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 2/5/25 09:27, Richard Guo wrote:
> On Wed, Feb 5, 2025 at 5:55 AM Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> Right now, if we have four tables to join, we have a joinlist
>> (A B C D). (Really they're integer relids, but let's use names here.)
>> If we decide to force C to be joined last, it should be sufficient to
>> convert this to ((A B D) C). If C and D both look like candidates for
>> this treatment, we can make it be (((A B) C) D) or (((A B) D) C).
>> This is pretty much the same thing that happens if you set
>> join_collapse_limit to 1 and use JOIN syntax to force a join order.
>> In fact, IIRC we start out with nested joinlists and there is some
>> code that normally flattens them until it decides it'd be creating
>> too large a sub-problem. I'm suggesting selectively reversing the
>> flattening.
>
> This should not be too difficult to implement. Outer joins seem to
> add some complexity, though. We need to ensure that the resulting
> joins in each sub-list are legal given the query's join order
> constraints. For example, if we make the joinlist be (((A B) C) D),
> we need to ensure that the A/B join and the (A/B)/C join are legal.
>

I've not done anything like this with joins before, but I AFAICS the
interesting stuff happens in deconstruct_recurse(), especially close to
the end where we check join_collapse_limit and do

joinlist = list_make2(leftpart, rightpart);

So I guess one way to "reverse the flattening" could be to do something
in deconstruct_recourse(). But I don't think that'd work all that well,
because of the recursion. We don't want to add a "pipeline break" into
the join list, we want to move the "dimension" to the end - even if only
within the group defined by join_collapse_limit.

E.g. imagine we have a join of 8 relations, with F (fact), dimensions D1
and D2, and then some artibrary tables T1, T2, T3, T4, T5. And let's say
deconstruct_recurse() sees them in this particular order

[F, T1, T2, D1, D2, T3, T4, T5]

AFAICS doing something in deconstruct_recurse() would likely split the
joinlist into four parts

[F, T1, T2] [D1] [D2] [T3, T4, T5]

which does treat the D1,D2 as if join_collapse_limit=1, but it also
splits the remaining relations into two groups, when we'd probably want
something more like this:

[F, T1, T2, T3, T4, T5] [D1] [D2]

Which should be legal, because a requirement is that D1/D2 don't have
any other join restrictions (I guess this could be relaxed a bit to only
restrictions within that particular group).

Which leads me to the conclusion that the best place to do this kind of
stuff is deconstruct_jointree(), once we have the "complete" joinlist.
We could walk it and reorder/split some of the joinlists again.

regards

--
Tomas Vondra

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Peter Smith 2025-02-10 01:35:06 Re: Enhance 'pg_createsubscriber' to retrieve databases automatically when no database is provided.
Previous Message Hayato Kuroda (Fujitsu) 2025-02-10 01:28:41 RE: Enhance 'pg_createsubscriber' to retrieve databases automatically when no database is provided.