| From: | Tomas Vondra <tomas(at)vondra(dot)me> | 
|---|---|
| To: | Richard Guo <guofenglinux(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> | 
| Cc: | Jeff Davis <pgsql(at)j-davis(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org> | 
| Subject: | Re: should we have a fast-path planning for OLTP starjoins? | 
| Date: | 2025-02-10 01:32:47 | 
| Message-ID: | efe843dc-70ce-4938-9b47-eb5cc40a2442@vondra.me | 
| Views: | Whole Thread | Raw Message | Download mbox | Resend email | 
| Thread: | |
| Lists: | pgsql-hackers | 
On 2/5/25 09:27, Richard Guo wrote:
> On Wed, Feb 5, 2025 at 5:55 AM Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> Right now, if we have four tables to join, we have a joinlist
>> (A B C D).  (Really they're integer relids, but let's use names here.)
>> If we decide to force C to be joined last, it should be sufficient to
>> convert this to ((A B D) C).  If C and D both look like candidates for
>> this treatment, we can make it be (((A B) C) D) or (((A B) D) C).
>> This is pretty much the same thing that happens if you set
>> join_collapse_limit to 1 and use JOIN syntax to force a join order.
>> In fact, IIRC we start out with nested joinlists and there is some
>> code that normally flattens them until it decides it'd be creating
>> too large a sub-problem.  I'm suggesting selectively reversing the
>> flattening.
> 
> This should not be too difficult to implement.  Outer joins seem to
> add some complexity, though.  We need to ensure that the resulting
> joins in each sub-list are legal given the query's join order
> constraints.  For example, if we make the joinlist be (((A B) C) D),
> we need to ensure that the A/B join and the (A/B)/C join are legal.
> 
I've not done anything like this with joins before, but I AFAICS the
interesting stuff happens in deconstruct_recurse(), especially close to
the end where we check join_collapse_limit and do
joinlist = list_make2(leftpart, rightpart);
So I guess one way to "reverse the flattening" could be to do something
in deconstruct_recourse(). But I don't think that'd work all that well,
because of the recursion. We don't want to add a "pipeline break" into
the join list, we want to move the "dimension" to the end - even if only
within the group defined by join_collapse_limit.
E.g. imagine we have a join of 8 relations, with F (fact), dimensions D1
and D2, and then some artibrary tables T1, T2, T3, T4, T5. And let's say
deconstruct_recurse() sees them in this particular order
[F, T1, T2, D1, D2, T3, T4, T5]
AFAICS doing something in deconstruct_recurse() would likely split the
joinlist into four parts
[F, T1, T2] [D1] [D2] [T3, T4, T5]
which does treat the D1,D2 as if join_collapse_limit=1, but it also
splits the remaining relations into two groups, when we'd probably want
something more like this:
[F, T1, T2, T3, T4, T5] [D1] [D2]
Which should be legal, because a requirement is that D1/D2 don't have
any other join restrictions (I guess this could be relaxed a bit to only
restrictions within that particular group).
Which leads me to the conclusion that the best place to do this kind of
stuff is deconstruct_jointree(), once we have the "complete" joinlist.
We could walk it and reorder/split some of the joinlists again.
regards
-- 
Tomas Vondra
| From | Date | Subject | |
|---|---|---|---|
| Next Message | Peter Smith | 2025-02-10 01:35:06 | Re: Enhance 'pg_createsubscriber' to retrieve databases automatically when no database is provided. | 
| Previous Message | Hayato Kuroda (Fujitsu) | 2025-02-10 01:28:41 | RE: Enhance 'pg_createsubscriber' to retrieve databases automatically when no database is provided. |