Re: Custom/Foreign-Join-APIs (Re: [v9.5] Custom Plan API)

From: Shigeru HANADA <shigeru(dot)hanada(at)gmail(dot)com>
To: Kouhei Kaigai <kaigai(at)ak(dot)jp(dot)nec(dot)com>
Cc: Ashutosh Bapat <ashutosh(dot)bapat(at)enterprisedb(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Thom Brown <thom(at)linux(dot)com>, "pgsql-hackers(at)postgreSQL(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Custom/Foreign-Join-APIs (Re: [v9.5] Custom Plan API)
Date: 2015-04-09 07:50:31
Message-ID: 7415F4B5-4D99-46B2-A926-94019F8CAA45@gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Kaigai-san,

Thanks for your review.

2015/04/09 10:48、Kouhei Kaigai <kaigai(at)ak(dot)jp(dot)nec(dot)com> のメール:
* merge_fpinfo()
>>> It seems to me fpinfo->rows should be joinrel->rows, and
>>> fpinfo->width also should be joinrel->width.
>>> No need to have special intelligence here, isn't it?
>>
>>
>> Oops. They are vestige of my struggle which disabled SELECT clause optimization
>> (omit unused columns). Now width and rows are inherited from joinrel. Besides
>> that, fdw_startup_cost and fdw_tuple_cost seem wrong, so I fixed them to use simple
>> summary, not average.
>>
> Does fpinfo->fdw_startup_cost represent a cost to open connection to remote
> PostgreSQL, doesn't it?
>
> postgres_fdw.c:1757 says as follows:
>
> /*
> * Add some additional cost factors to account for connection overhead
> * (fdw_startup_cost), transferring data across the network
> * (fdw_tuple_cost per retrieved row), and local manipulation of the data
> * (cpu_tuple_cost per retrieved row).
> */
>
> If so, does a ForeignScan that involves 100 underlying relation takes 100
> times heavy network operations on startup? Probably, no.
> I think, average is better than sum, and max of them will reflect the cost
> more correctly.

In my current opinion, no. Though I remember that I've written such comments before :P.

Connection establishment occurs only once for the very first access to the server, so in the use cases with long-lived session (via psql, connection pooling, etc.), taking connection overhead into account *every time* seems too pessimistic.

Instead, for practical cases, fdw_startup_cost should consider overheads of query construction and getting first response of it (hopefully it minus retrieving actual data). These overheads are visible in the order of milliseconds. I’m not sure how much is appropriate for the default, but 100 seems not so bad.

Anyway fdw_startup_cost is per-server setting as same as fdw_tuple_cost, and it should not be modified according to the width of the result, so using fpinfo_o->fdw_startup_cost would be ok.

> Also, fdw_tuple_cost introduce the cost of data transfer over the network.
> I thinks, weighted average is the best strategy, like:
> fpinfo->fdw_tuple_cost =
> (fpinfo_o->width / (fpinfo_o->width + fpinfo_i->width) * fpinfo_o->fdw_tuple_cost +
> (fpinfo_i->width / (fpinfo_o->width + fpinfo_i->width) * fpinfo_i->fdw_tuple_cost;
>
> That's just my suggestion. Please apply the best way you thought.

I can’t agree that strategy, because 1) width 0 causes per-tuple cost 0, and 2) fdw_tuple_cost never vary in a foreign server. Using fpinfo_o->fdw_tuple_cost (it must be identical to fpinfo_i->fdw_tuple_cost) seems reasonable. Thoughts?

>
>>> * explain output
>>>
>>> EXPLAIN output may be a bit insufficient to know what does it
>>> actually try to do.
>>>
>>> postgres=# explain select * from ft1,ft2 where a = b;
>>> QUERY PLAN
>>> --------------------------------------------------------
>>> Foreign Scan (cost=200.00..212.80 rows=1280 width=80)
>>> (1 row)
>>>
>>> Even though it is not an immediate request, it seems to me
>>> better to show users joined relations and remote ON/WHERE
>>> clause here.
>>>
>>
>> Like this?
>>
>> Foreign Scan on ft1 INNER JOIN ft2 ON ft1.a = ft2.b (cost=200.00..212.80
>> rows=1280 width=80)
>> …
>>
> No. This line is produced by ExplainScanTarget(), so it requires the
> backend knowledge to individual FDW.
> Rather than the backend, postgresExplainForeignScan() shall produce
> the output.

Agreed. Additional FDW output such as “Relations”, “Join type”, and “Join conditions” would be possible.

>
>> It might produce a very long line in a case of joining many tables because it
>> contains most of remote query other than SELECT clause, but I prefer detailed.
>> Another idea is to print “Join Cond” and “Remote Filter” as separated EXPLAIN
>> items.
>>
> It is good, if postgres_fdw can generate relations name involved in
> the join for each level, and join cond/remote filter individually.
>
>> Note that v8 patch doesn’t contain this change yet!
>>
> It is a "nice to have" feature. So, I don't think the first commit needs
> to support this. Just a suggestion in the next step.
>

Agreed.

>
> * implementation suggestion
>
> At the deparseJoinSql(),
>
> + /* print SELECT clause of the join scan */
> + initStringInfo(&selbuf);
> + i = 0;
> + foreach(lc, baserel->reltargetlist)
> + {
> + Var *var = (Var *) lfirst(lc);
> + TargetEntry *tle;
> +
> + if (i > 0)
> + appendStringInfoString(&selbuf, ", ");
> + deparseJoinVar(var, &context);
> +
> + tle = makeTargetEntry((Expr *) copyObject(var),
> + i + 1, pstrdup(""), false);
> + if (fdw_ps_tlist)
> + *fdw_ps_tlist = lappend(*fdw_ps_tlist, copyObject(tle));
> +
> + *retrieved_attrs = lappend_int(*retrieved_attrs, i + 1);
> +
> + i++;
> + }
>
> The tle is a copy of the original target-entry, and var-node is also
> copied one. Why is the tle copied on lappend() again?
> Also, NULL as acceptable as 3rd argument of makeTargetEntry.

Good catch. Fixed.

--
Shigeru HANADA
shigeru(dot)hanada(at)gmail(dot)com

Attachment Content-Type Size
foreign_join_v10.patch application/octet-stream 172.3 KB
unknown_filename text/plain 4 bytes

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Michael Paquier 2015-04-09 08:12:20 Re: TABLESAMPLE patch
Previous Message Magnus Hagander 2015-04-09 07:48:28 Re: "rejected" vs "returned with feedback" in new CF app