Re: Horizontal scalability/sharding

From: Merlin Moncure <mmoncure(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Andres Freund <andres(at)anarazel(dot)de>, Bruce Momjian <bruce(at)momjian(dot)us>, Josh Berkus <josh(at)agliodbs(dot)com>, Mason S <masonlists(at)gmail(dot)com>, Oleg Bartunov <obartunov(at)gmail(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Horizontal scalability/sharding
Date: 2015-09-02 17:59:55
Message-ID: CAHyXU0w+6TaLK9nKE1AmHimfmPwm1FQbJbRjyFLUZYg2bCGfRw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Sep 1, 2015 at 11:18 AM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> It would be a bad idea to cling blindly to the FDW infrastructure if
> it's fundamentally inadequate to do what we want. On the other hand,
> it would also be a bad idea to set about recreating it without a
> really good reason, and - just to take one example - the fact that it
> doesn't currently push down DML operations to the remote side is not a
> really good reason to rewrite the whole thing. On the contrary, it's
> a reason to put some energy into the already-written patch which
> implements that optimization.

The problem with FDW for these purposes as I see it is that too much
intelligence is relegated to the implementer of the API. There needs
to be a mechanism so that the planner can rewrite the remote query and
then do some after the fact processing. This exactly what citus does;
if you send out AVG(foo) it rewrites that to SUM(foo) and COUNT(foo)
so that aggregation can be properly weighted to the result. To do
this (distributed OLAP-type processing) right, the planner needs to
*know* that this table is in fact distributed and also know that it
can make SQL compatible adjustments to the query.

This strikes me as a bit of a conflict of interest with FDW which
seems to want to hide the fact that it's foreign; the FDW
implementation makes it's own optimization decisions which might make
sense for single table queries but breaks down in the face of joins.

merlin

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2015-09-02 18:03:56 Re: Hooking at standard_join_search (Was: Re: Foreign join pushdown vs EvalPlanQual)
Previous Message Josh Berkus 2015-09-02 17:57:30 Re: Horizontal scalability/sharding