From: | Alexander Korotkov <aekorotkov(at)gmail(dot)com> |
---|---|
To: | Alexander Pyhalov <a(dot)pyhalov(at)postgrespro(dot)ru> |
Cc: | Robins Tharakan <tharakan(at)gmail(dot)com>, "Fujii(dot)Yuki(at)df(dot)MitsubishiElectric(dot)co(dot)jp" <Fujii(dot)Yuki(at)df(dot)mitsubishielectric(dot)co(dot)jp>, Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>, Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>, Ian Lawrence Barwick <barwick(at)gmail(dot)com>, tndrwang(at)gmail(dot)com |
Subject: | Re: Add semi-join pushdown to postgres_fdw |
Date: | 2025-03-25 10:55:50 |
Message-ID: | CAPpHfdt3m8DdyMfLK_mMcikno3ZA0DWsca+2JD=kVyF1DBNGWA@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Mon, Mar 24, 2025 at 6:56 PM Alexander Pyhalov
<a(dot)pyhalov(at)postgrespro(dot)ru> wrote:
> Alexander Korotkov писал(а) 2025-03-24 11:49:
> > On Mon, Mar 24, 2025 at 9:07 AM Alexander Pyhalov
> > <a(dot)pyhalov(at)postgrespro(dot)ru> wrote:
> >> Alexander Korotkov писал(а) 2025-03-24 04:21:
> >> > Hi, Alexander!
> >> >
> >> > On Tue, Mar 18, 2025 at 6:04 PM Alexander Pyhalov
> >> > <a(dot)pyhalov(at)postgrespro(dot)ru> wrote:
> >> >> This shouldn't. When semi-join is found below left/right join, it's
> >> >> deparsed as subquery.
> >> >> Interesting enough, this mechanics (deparsing as subquery) is used
> >> >> 1) for semi-joins under left/right join,
> >> >> 2) for full outer joins when inner or outer part has some
> >> >> remote_conds.
> >> >>
> >> >> The issue here is that after subquery is deparsed, we don't consider
> >> >> if
> >> >> its target attributes are available to the upper level
> >> >> join . As for semi-join itself, all conditions are still deparsed on
> >> >> left/right join boundary, they are just not propagated further.
> >> >> This shouldn't be a problem, as they are evaluated in subquery. As for
> >> >> left/right join without semi-join beneath it - its behavior is not
> >> >> affected
> >> >> (as hidden_subquery_rels is empty).
> >> >
> >> > Thank you for the explanation. But I have another question. Aren't
> >> > the checks you've proposed too strict? hidden_subquery_rels are
> >> > propagated all the way to the join tree. So, pulling conditions would
> >> > be disables all the way to the join tree too. Is it enough to just
> >> > disable pulling conditions directly from semi-joins, then their
> >> > further pulls will be disabled automatically? See the attached patch.
> >> > It also contains other (mostly cosmetic improvements).
> >> >
> >> > ------
> >> > Regards,
> >> > Alexander Korotkov
> >> > Supabase
> >>
> >> Hi. No, they are not too strict. Look at the following example
> >>
> >> EXPLAIN (verbose, costs off)
> >> SELECT x1.c1 FROM
> >> (SELECT * FROM ft2 WHERE EXISTS (SELECT 1 FROM ft4
> >> WHERE
> >> ft4.c1 = ft2.c1 AND ft2.c2 < 10)) x1
> >> RIGHT JOIN
> >> (SELECT * FROM ft2 WHERE EXISTS (SELECT 1 FROM ft4
> >> WHERE
> >> ft4.c1 = ft2.c1 AND ft2.c2 < 10)) x2
> >> ON (x1.c1 = x2.c1)
> >> LEFT JOIN
> >> (SELECT * FROM ft2 WHERE c2 < 11) x3
> >> ON (x1.c1 = x3.c1)
> >> ORDER BY x1.c1 LIMIT 10;
> >>
> >> With patch which you suggest, we'll deparse left part of left join as
> >> subquery, but will try to pop c2 < 10 condition from
> >> (8) LEFT JOIN ((6) SEMI JOIN (7)) subquery. When we look at left join
> >> of
> >> this subquery and ft2, we still deparse left part as
> >> subquery, so can't pop up conditions from it.
> >
> > I've checked, this query seems to result in the exactly same remote
> > SQLs with your and mine patches. Could you elaborate more on the
> > difference? Do you think foreign_join_ok() can give different results
> > on this query?
>
> Hi.
> With your patch this example gives the same
> ERROR: unexpected expression in subquery output
>
> This happens, because we don't keep knowledge that we have deparsed all
> semi-joins below this left join. As long as left/right join has
> semi-join in its left or right part, this part will be deparsed as
> subquery (look at the following lines in foreign_join_ok()):
>
> else if (jointype == JOIN_LEFT || jointype == JOIN_RIGHT ||
> jointype == JOIN_FULL)
> {
> /*
> * Conditions, generated from semi-joins, should be
> evaluated before
> * LEFT/RIGHT/FULL join.
> */
> if (!bms_is_empty(fpinfo_o->hidden_subquery_rels))
> {
> fpinfo->make_outerrel_subquery = true;
> fpinfo->lower_subquery_rels =
> bms_add_members(fpinfo->lower_subquery_rels, outerrel->relids);
> }
>
> if (!bms_is_empty(fpinfo_i->hidden_subquery_rels))
> {
> fpinfo->make_innerrel_subquery = true;
> fpinfo->lower_subquery_rels =
> bms_add_members(fpinfo->lower_subquery_rels, innerrel->relids);
> }
> }
>
>
> So, we still can't refer to its remote_conds from upper level queries
> (as not all Vars are available from subquery after subquery is created
> in one part of left or right join). It's not necessary to have semi-join
> for this as immediate left/right join inner or outer for inner/outer to
> be deparsed as subquery. But it shouldn't be an issue - we've already
> used remote_conds when created this subquery.
> What I'm trying to say - logic of 'making subquery' and extracting
> conditions should match (or we need more sophisticated way of forming
> subquery targetlist, so that extracted conditions could be used above
> subqueries).
Thank you for the explanation. Pushed. However, it would be nice in
future to rework this in a way that semi-joins on lower levels of join
tree don't have negative impact on optimization of upper levels of
join tree.
------
Regards,
Alexander Korotkov
Supabase
From | Date | Subject | |
---|---|---|---|
Next Message | Alexander Korotkov | 2025-03-25 10:57:03 | Re: Remove an unnecessary check on semijoin_target_ok() on postgres_fdw.c |
Previous Message | Yura Sokolov | 2025-03-25 10:52:00 | Re: sinvaladt.c: remove msgnumLock, use atomic operations on maxMsgNum |