Re: pass-through queries to foreign servers

From: Ashutosh Bapat <ashutosh(dot)bapat(at)enterprisedb(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: David Gudeman <dave(dot)gudeman(at)gmail(dot)com>, David Fetter <david(at)fetter(dot)org>, Postgres <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pass-through queries to foreign servers
Date: 2013-08-12 04:37:00
Message-ID: CAFjFpRei2A+bm0Wr5Kcg0L6EUwNjF+6dZVehNbABiShsOA3vUg@mail.gmail.com
Lists: pgsql-hackers

On Tue, Aug 6, 2013 at 12:51 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:

> David Gudeman <dave(dot)gudeman(at)gmail(dot)com> writes:
> > For those who don't want to go to the link to see what I'm talking
> > about with query rewrites, I thought I'd give a brief description.
> > Foreign data wrappers currently do all of their work in the planning
> > phase but I claim that isn't the right place to optimize foreign
> > queries with aggregates and GROUP BY because optimizing those things
> > would involve collapsing multiple plan nodes back into a single node
> > for a foreign call.
>
> I'm not sure what the best implementation for that is, but what you
> propose here would still involve such collapsing, so this argument
> seems rather empty.
>
> > I propose to do these optimizations as query
> > rewrites instead. So for example suppose t is a foreign table on the
> > foreign server named fs. Then the query
>
> > SELECT count(*) FROM t
>
> > is rewritten to
>
> > SELECT count FROM fs('select count(*) from t') fs(count bigint)
>
> > where fs() is the pass-through query function for the server fs. To
> > implement this optimization as a query rewrite, all of the elements of
> > the result have to be real source-language constructs so the
> > pass-through query has to be available in PostgreSQL SQL.
>
> I don't believe in any part of that design, starting with the "pass
> through query function". For one thing, it seems narrowly targeted to the
> assumption that the FDW is a frontend for a foreign server that speaks
> SQL. If the FDW's infrastructure doesn't include some kind of textual
> query language, this isn't going to be useful for it at all. For another,
> a query rewrite system is unlikely to be able to cost out the alternatives
> and decide whether pushing the aggregation across is actually a win or
> not.
>
> The direction I think we ought to be heading is to generate explicit Paths
> representing the various ways in which aggregation can be implemented.
> The logic in grouping_planner is already overly complex, and hard to
> extend, because it's all hard-wired comparisons of alternatives. We'd be
> better off with something more like the add_path infrastructure. Once
> that's been done, maybe we can allow FDWs to add Paths representing remote
> aggregation.
>
>
Postgres-XC has extended the PostgreSQL planner to find the largest
subset of the join tree that can be evaluated on the server where the
data resides (called the Datanode in XC jargon). If it finds that the whole
join tree can be evaluated on the Datanode(s), it also attempts to evaluate
the grouped aggregates there (sometimes only partially). The same applies to
ORDER BY and LIMIT clauses. An alternative method, called fast-query-shipping,
skips planning and passes the entire query to the Datanode(s) when the query
can be evaluated completely there. These two techniques
eliminate the need for pass-through syntax in XC.

However, the XC planner currently has two limitations: 1. it applies these
extensions without real cost estimation (XC assumes that a query performs
better when evaluated on the Datanodes, which is not true in some cases), and
2. right now it works only for PostgreSQL (though it could easily be extended
to other SQL-based databases).
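
To illustrate why the missing cost estimation matters, here is a minimal sketch of choosing between local and remote aggregation by cost. The cost formulas, parameter names and numbers are assumptions made up for this example; they are not taken from PostgreSQL or XC.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Path:
    name: str
    cost: float


def aggregation_paths(rows: int, groups: int, transfer_per_row: float,
                      local_cpu_per_row: float,
                      remote_cpu_per_row: float) -> List[Path]:
    """Build two candidate paths under a made-up linear cost model."""
    # Local: fetch every row over the network, then aggregate here.
    local = Path("aggregate locally",
                 rows * transfer_per_row + rows * local_cpu_per_row)
    # Remote: aggregate on the remote side, then fetch one row per group.
    remote = Path("aggregate remotely",
                  rows * remote_cpu_per_row + groups * transfer_per_row)
    return [local, remote]


def cheapest(paths: List[Path]) -> Path:
    """Pick the lowest-cost candidate."""
    return min(paths, key=lambda p: p.cost)


# Few groups, cheap remote CPU: pushing the aggregate down wins.
few_groups = cheapest(aggregation_paths(1_000_000, 10, 1.0, 0.01, 0.01))
# One group per row and a slow remote side: pushing down loses.
slow_remote = cheapest(aggregation_paths(1_000, 1_000, 1.0, 0.01, 5.0))
```

Under these assumed numbers the first case picks remote aggregation and the second picks local aggregation, which is the kind of case where always pushing work to the Datanodes is the wrong call.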

It might be worth looking at the XC planner and picking up the pieces of work
that fit in PostgreSQL.

> regards, tom lane
>
>
> --
> Sent via pgsql-hackers mailing list (pgsql-hackers(at)postgresql(dot)org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers
>

--
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company
