From: | Alexander Pyhalov <a(dot)pyhalov(at)postgrespro(dot)ru> |
---|---|
To: | "Fujii(dot)Yuki(at)df(dot)MitsubishiElectric(dot)co(dot)jp" <Fujii(dot)Yuki(at)df(dot)mitsubishielectric(dot)co(dot)jp> |
Cc: | Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>, Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>, Ian Lawrence Barwick <barwick(at)gmail(dot)com> |
Subject: | Re: Add semi-join pushdown to postgres_fdw |
Date: | 2022-12-06 09:28:43 |
Message-ID: | 816fa8b1bc2da09a87484d1ef239a332@postgrespro.ru |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
Hi, Yuki.
Thanks for looking at this patch.
Fujii(dot)Yuki(at)df(dot)MitsubishiElectric(dot)co(dot)jp писал 2022-12-03 06:02:
> question1)
> > + if (jointype == JOIN_SEMI && bms_is_member(var->varno,
> innerrel->relids) && !bms_is_member(var->varno, outerrel->relids))
> It takes time for me to find in what case this condition is true.
> There is cases in which this condition is true for semi-join of two
> baserels
> when running query which joins more than two relations such as
> query2 and query3.
> Running queries such as query2, you maybe want to pushdown of only
> semi-join path of
> joinrel(outerrel) defined by (f_t1 a1 join f_t3 a2 on a1.c1 = a2.c1)
> and baserel(innerrel) f_t3
> because of safety deparse. So you add this condition.
> Becouase of this limitation, your patch can't push down subquery
> expression
> "exists (select null from f_t2 where c1 = a1.c1)" in query3.
> I think, it is one of difficulty points for semi-join pushdown.
> This is my understanding of the intent of this condition and the
> restrictions imposed by this condition.
> Is my understanding right?
IIRC, planner can create semi-join, which targetlist references Vars
from inner join relation. However, it's deparsed as exists and so we
can't reference it from SQL. So, there's this check - if Var is
referenced in semi-join target list, it can't be pushed down.
You can see this if comment out this check.
EXPLAIN (verbose, costs off)
SELECT ft2.*, ft4.* FROM ft2 INNER JOIN
(SELECT * FROM ft4 WHERE EXISTS (
SELECT 1 FROM ft2 WHERE ft2.c2=ft4.c2)) ft4
ON ft2.c2 = ft4.c1
INNER JOIN
(SELECT * FROM ft2 WHERE EXISTS (
SELECT 1 FROM ft4 WHERE ft2.c2=ft4.c2)) ft21
ON ft2.c2 = ft21.c2
WHERE ft2.c1 > 900
ORDER BY ft2.c1 LIMIT 10;
will fail with
EXPLAIN SELECT r8.c2, r9.c2 FROM "S 1"."T 1" r8 WHERE (EXISTS (SELECT
NULL FROM "S 1"."T 3" r9 WHERE ((r8.c2 = r9.c2))))
Here you can see that
SELECT * FROM ft2 WHERE EXISTS (
SELECT 1 FROM ft4 WHERE ft2.c2=ft4.c2)
was transformed to
SELECT r8.c2, r9.c2 FROM "S 1"."T 1" r8 WHERE (EXISTS (SELECT NULL FROM
"S 1"."T 3" r9 WHERE ((r8.c2 = r9.c2))))
where our exists subquery is referenced from tlist. It's fine for plan
(relations, participating in semi-join, can be referenced in tlist),
but is not going to work with EXISTS subquery.
BTW, there's a comment in joinrel_target_ok(). It tells exactly that -
5535 if (jointype == JOIN_SEMI && bms_is_member(var->varno,
innerrel->relids) && !bms_is_member(var->varno, outerrel->relids))
5536 {
5537 /* We deparse semi-join as exists() subquery, and
so can't deparse references to inner rel in join target list. */
5538 ok = false;
5539 break;
5540 }
Expanded comment.
> question2) In foreign_join_ok
> > * Constructing queries representing ANTI joins is hard, hence
> Is this true? Is it hard to expand your approach to ANTI join
> pushdown?
I haven't tried, so don't know.
> question3) You use variables whose name is "addl_condXXX" in the
> following code.
> > appendStringInfo(addl_conds, "EXISTS (SELECT NULL FROM %s",
> join_sql_i.data);
> Does this naming mean additional literal?
> Is there more complehensive naming, such as "subquery_exprXXX"?
The naming means additional conditions (for WHERE clause, by analogy
with ignore_conds and remote_conds). Not sure if subquery_expr sounds
better, but if you come with better idea, I'm fine with renaming them.
> question4) Although really detail, there is expression making space
> such as
> "ft4.c2 = ft2.c2" and one making no space such as "c1=ftupper.c1".
> Is there reason for this difference? If not, need we use same policy
> for making space?
>
Fixed.
--
Best regards,
Alexander Pyhalov,
Postgres Professional
Attachment | Content-Type | Size |
---|---|---|
v3-0001-postgres_fdw-add-support-for-deparsing-semi-joins.patch | text/x-diff | 49.3 KB |
From | Date | Subject | |
---|---|---|---|
Next Message | Etsuro Fujita | 2022-12-06 09:48:54 | Re: Allow batched insert during cross-partition updates |
Previous Message | Richard Guo | 2022-12-06 09:00:59 | A problem about ParamPathInfo for an AppendPath |