From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
---|---|
To: | Stephen Frost <sfrost(at)snowman(dot)net> |
Cc: | shigeo Hirose <shigeo(dot)hirose(at)toshiba(dot)co(dot)jp>, pgsql-general(at)lists(dot)postgresql(dot)org |
Subject: | Re: How can I pushdown of functions used in targetlist with FDW ? |
Date: | 2020-01-09 17:31:43 |
Message-ID: | 3239.1578591103@sss.pgh.pa.us |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-general |
Stephen Frost <sfrost(at)snowman(dot)net> writes:
> * Tom Lane (tgl(at)sss(dot)pgh(dot)pa(dot)us) wrote:
>> There is, AFAIK, no provision for that. There's not a lot of
>> reason to consider adding it either, because there's no reason
>> to suppose that the remote node can run such a function any
>> faster than the local node.
> Also, I don't think I agree with this particular position- it's entirely
> possible that the remote server is much less loaded/busy than the local
> one and therefore it'd be beneficial, overall, to run that function on
> the remote system and not the local one.
Or the reverse. We have no way of estimating such effects.
> The function might even have
> side-effects or use other objects in the system to run.
In such a case, we *can't* be pushing down. The entire foundational
principle of this behavior is that we must be certain that a pushed-down
operation has exactly the same semantics on either node. Otherwise,
the planner's choices are not optimizations but query behavior changes,
and we can't have that. If you've got a function like that to run
on the remote end, you have to embed it in a remote view -- we simply
don't have another option.
> The function
> may also return a much smaller result than pulling back the raw data (of
> course, the opposite could also possibly be true).
Yeah, this is a legitimate point, but again we have no very good way
of estimating which is better.
In general, there's an awful lot of postgres_fdw's behavior that depends
on the assumption that the remote and local servers are pretty
interchangeable, not least that we take the remote's cost numbers at
face value when preparing cost numbers for a foreign scan. The only
consideration that we really can reliably optimize push-down choices with
is trying to reduce the volume of data transmitted, and we do that without
consideration for whether the operations pushed across might take more or
less time when run on the other server. Maybe someday that could be
improved, but it seems like a nontrivial research project involving a
lot more moving parts than just this point.
regards, tom lane
From | Date | Subject | |
---|---|---|---|
Next Message | Peter | 2020-01-09 18:18:22 | 12.1 not useable: clientlib fails after a dozen queries (GSSAPI ?) |
Previous Message | gilberto.castillo | 2020-01-09 17:22:25 | Re: [MASSMAIL]Re: partitioned table |