From: | Kouhei Kaigai <kaigai(at)ak(dot)jp(dot)nec(dot)com> |
---|---|
To: | Robert Haas <robertmhaas(at)gmail(dot)com> |
Cc: | "pgsql-hackers(at)postgreSQL(dot)org" <pgsql-hackers(at)postgresql(dot)org>, "Shigeru Hanada" <shigeru(dot)hanada(at)gmail(dot)com> |
Subject: | Re: [idea] more aggressive join pushdown on postgres_fdw |
Date: | 2015-06-05 01:40:20 |
Message-ID: | 9A28C8860F777E439AA12E8AEA7694F8010F47DA@BPXM15GP.gisp.nec.co.jp |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
> On Sat, May 30, 2015 at 9:03 PM, Kouhei Kaigai <kaigai(at)ak(dot)jp(dot)nec(dot)com> wrote:
> > Yesterday, JPUG held an unconference event at Tokyo, and
> > Hanada-san had a talk about join-pushdown feature of
> > postgres_fdw.
> > At this talk, someone proposed an interesting idea to
> > make join pushdown more aggressive/effective.
> > Let me share it with pgsql-hackers.
> >
> > He said, we may have a workload to join a large foreign-
> > scan and a small local-scan regardless of the plan type.
> >
> > For example:
> > joinrel (expected nrows = 5)
> > + outerrel ForeignScan (expected nrows = 1000000)
> > + innerrel LocalScan (expected nrows = 5)
> >
> > In this case, we may be able to run the entire joinrel
> > on the remote side then fetch just 5 rows, if fdw-driver
> > construct VALUES() clause according to the contents of
> > LocalScan then makes an entire join query with another
> > one kept in ForeignScan.
> >
> > If above ForeignScan have the following remote query,
> > SELECT a, b, c FROM t0 WHERE d < 1000000
> > we may be able to construct the query below to run remote
> > join with local (small) relation.
> >
> > SELECT a, b, c, x, y FROM
> > (SELECT a, b, c FROM t0 WHERE d < 1000000) AS ft
> > JOIN
> > (VALUES (1,'aaa'), (2,'bbb'), (3,'ccc'),
> > (4,'ddd'), (5,'eee')) AS lt (x, y)
> > ON ft.a = lt.x
> >
> > The VALUES clauses can be mechanically constructed according
> > to the result set of LocalScan, and it is not difficult to
> > make such a remote query on top of the existing ForeignScan.
> > In the result, it will reduce amount of network traffic and
> > CPU cycles to form/deform tuples dramatically.
> >
> > I don't intend to implement this idea urgently (of course,
> > join pushdown for both ForeignScan case has higher priority),
> > however, it makes sense to keep the future direction in mind.
> >
> > Also, as an aside, even though Hanada-san mentioned ForeignScan
> > does not need an infrastructure to initialize child path nodes,
> > this idea may require ForeignScan to have local child path.
>
> Neat idea. This ties into something I've thought about and mentioned
> before: what if the innerrel is local, but there's a replicated copy
> on the remote server? Perhaps both cases are worth thinking about at
> some point.
>
I think, here is both merit and de-merit for each. It implies either of
them never always-better-strategy.
* Push out local table as VALUES(...) clause
Good: No restriction to functions/operators in the local scan or
underlying plan node.
Bad: High cost for data format modification (HeapTupleSlot =>
VALUES(...) clause in text), and 2-way data transfer.
* Remote join between foreign table and replicated table
Good: Data already exists on remote side, no need to kick out
contents of local relation (and no need to consume CPU
cycle to make VALUES() clause).
Bad: Functions/operators are restricted as existing postgres_fdw
is doing. Only immutable and built-in ones are available to
run on the remote side.
BTW, do we need either of tables being foreign table, if entire database
is (synchronously) replicated?
Also, loopback server may be a candidate even if not replicated (although
it may be an entrance of deadlock heaven).
Thanks,
--
NEC Business Creation Division / PG-Strom Project
KaiGai Kohei <kaigai(at)ak(dot)jp(dot)nec(dot)com>
From | Date | Subject | |
---|---|---|---|
Next Message | Thomas Munro | 2015-06-05 01:47:12 | Re: 9.4.1 -> 9.4.2 problem: could not access status of transaction 1 |
Previous Message | Amit Kapila | 2015-06-05 01:02:55 | Re: Re: [COMMITTERS] pgsql: Map basebackup tablespaces using a tablespace_map file |