From: | Albe Laurenz <laurenz(dot)albe(at)wien(dot)gv(dot)at> |
---|---|
To: | "Robert Haas *EXTERN*" <robertmhaas(at)gmail(dot)com> |
Cc: | Ashutosh Bapat *EXTERN* <ashutosh(dot)bapat(at)enterprisedb(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Costing foreign joins in postgres_fdw |
Date: | 2015-12-19 20:55:06 |
Message-ID: | A737B7A37273E048B164557ADEF4A58B53789295@ntex2010i.host.magwien.gv.at |
Lists: | pgsql-hackers |
Robert Haas wrote:
>> Maybe, to come up with something remotely realistic, a formula like
>>
>> sum of locally estimated costs of sequential scan for the base table
>> plus count of estimated result rows (times a factor)
>
> Was this meant to say "the base tables", plural?
Yes.
> I think whatever we do here should try to extend the logic in
> postgres_fdw's estimate_path_cost_size() to foreign tables in some
> reasonably natural way, but I'm not sure exactly what that should look
> like. Maybe do what that function currently does for single-table
> scans, and then add all the values up, or something like that. I'm a
> little worried, though, that the planner might then view a query that
> will be executed remotely as a nested loop with inner index-scan as
> not worth pushing down, because in that case the join actually will
> not touch every row from both tables, as a hash or merge join would.
That's exactly what I meant, minus a contribution for the estimated
result set size.
There are cases where a nested loop is faster than a hash join,
but I think it is rarely faster by orders of magnitude.
So I'd say this is a decent rough estimate, and that's the best we can
hope for here if we cannot ask the remote server.
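To make the proposed formula concrete, here is a minimal sketch, not
postgres_fdw code. The names BaseTable, seq_scan_cost, foreign_join_cost
and per_row_factor are hypothetical, and the sequential scan cost is
simplified to page and tuple costs only (no qual evaluation), using the
default values of seq_page_cost and cpu_tuple_cost.

```python
from dataclasses import dataclass

# PostgreSQL's default planner cost constants.
SEQ_PAGE_COST = 1.0
CPU_TUPLE_COST = 0.01


@dataclass
class BaseTable:
    """Local statistics for one foreign base table (hypothetical type)."""
    pages: int
    rows: float


def seq_scan_cost(table: BaseTable) -> float:
    # Simplified local sequential scan estimate: read every page and
    # process every tuple. Qual evaluation cost is left out on purpose.
    return table.pages * SEQ_PAGE_COST + table.rows * CPU_TUPLE_COST


def foreign_join_cost(tables, result_rows: float,
                      per_row_factor: float = 0.01) -> float:
    # Sum of the locally estimated sequential scan costs of all base
    # tables, plus the estimated result row count times a factor.
    # per_row_factor is an assumed placeholder, not a value from the thread.
    scan_total = sum(seq_scan_cost(t) for t in tables)
    return scan_total + result_rows * per_row_factor


if __name__ == "__main__":
    tables = [BaseTable(pages=100, rows=1000), BaseTable(pages=50, rows=200)]
    print(foreign_join_cost(tables, result_rows=500))
```

The estimate charges for touching every row of every base table, so it
behaves like a hash or merge join on the remote side. A remote nested loop
with an inner index scan would be overestimated, which is the concern
raised above.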
Yours,
Laurenz Albe