From: | Robert Haas <robertmhaas(at)gmail(dot)com> |
---|---|
To: | Kouhei Kaigai <kaigai(at)ak(dot)jp(dot)nec(dot)com> |
Cc: | "pgsql-hackers(at)postgreSQL(dot)org" <pgsql-hackers(at)postgresql(dot)org>, Shigeru Hanada <shigeru(dot)hanada(at)gmail(dot)com> |
Subject: | Re: [idea] more aggressive join pushdown on postgres_fdw |
Date: | 2015-06-04 21:43:42 |
Message-ID: | CA+TgmoYQJX1K-e_tObXuhmGS7pn2kWN9zm=W3PKrP7o6khCFFQ@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Sat, May 30, 2015 at 9:03 PM, Kouhei Kaigai <kaigai(at)ak(dot)jp(dot)nec(dot)com> wrote:
> Yesterday, JPUG held an unconference event at Tokyo, and
> Hanada-san had a talk about join-pushdown feature of
> postgres_fdw.
> At this talk, someone proposed an interesting idea to
> make join pushdown more aggressive/effective.
> Let me share it with pgsql-hackers.
>
> He said, we may have a workload to join a large foreign-
> scan and a small local-scan regardless of the plan type.
>
> For example:
> joinrel (expected nrows = 5)
> + outerrel ForeignScan (expected nrows = 1000000)
> + innerrel LocalScan (expected nrows = 5)
>
> In this case, we may be able to run the entire joinrel
> on the remote side then fetch just 5 rows, if fdw-driver
> construct VALUES() clause according to the contents of
> LocalScan then makes an entire join query with another
> one kept in ForeignScan.
>
> If above ForeignScan have the following remote query,
> SELECT a, b, c FROM t0 WHERE d < 1000000
> we may be able to construct the query below to run remote
> join with local (small) relation.
>
> SELECT a, b, c, x, y FROM
> (SELECT a, b, c FROM t0 WHERE d < 1000000) AS ft
> JOIN
> (VALUES (1,'aaa'), (2,'bbb'), (3,'ccc'),
> (4,'ddd'), (5,'eee')) AS lt (x, y)
> ON ft.a = lt.x
>
> The VALUES clauses can be mechanically constructed according
> to the result set of LocalScan, and it is not difficult to
> make such a remote query on top of the existing ForeignScan.
> In the result, it will reduce amount of network traffic and
> CPU cycles to form/deform tuples dramatically.
>
> I don't intend to implement this idea urgently (of course,
> join pushdown for both ForeignScan case has higher priority),
> however, it makes sense to keep the future direction in mind.
>
> Also, as an aside, even though Hanada-san mentioned ForeignScan
> does not need an infrastructure to initialize child path nodes,
> this idea may require ForeignScan to have local child path.
Neat idea. This ties into something I've thought about and mentioned
before: what if the innerrel is local, but there's a replicated copy
on the remote server? Perhaps both cases are worth thinking about at
some point.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
From | Date | Subject | |
---|---|---|---|
Next Message | Robert Haas | 2015-06-04 21:51:09 | Re: Minor improvement to func.sgml |
Previous Message | Robert Haas | 2015-06-04 21:35:00 | Re: 9.4.1 -> 9.4.2 problem: could not access status of transaction 1 |