Re: Postgres_fdw join pushdown - wrong results with whole-row reference

From: Amit Langote <Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp>
To: Ashutosh Bapat <ashutosh(dot)bapat(at)enterprisedb(dot)com>, Etsuro Fujita <fujita(dot)etsuro(at)lab(dot)ntt(dot)co(dot)jp>
Cc: Rushabh Lathia <rushabh(dot)lathia(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Postgres_fdw join pushdown - wrong results with whole-row reference
Date: 2016-06-24 08:29:48
Message-ID: b0149165-5214-2469-1136-f24de6c6322d@lab.ntt.co.jp
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 2016/06/24 15:44, Ashutosh Bapat wrote:
>>
>> I think the proposed idea of applying record::text explicit coercion to a
>> whole-row reference in the IS NOT NULL condition in the CASE WHEN
>> conversion would work as expected as you explained, but I'm concerned that
>> the cost wouldn't be negligible when the foreign table has a lot of columns.
>
> That's right, if the foreign server doesn't optimize the case for IS NOT
> NULL, which it doesn't :)
>
> I am happy to use any cheaper means e.g a function which counts number of
> columns in a record. All we need here is a way to correctly identify when a
> record is null and not null in the way we want (as described upthread). I
> didn't find any quickly. Do you have any suggestions?

I'm now starting to wonder if it would be outright wrong to just use the
alias names of corresponding foreign tables directly for whole-row
references? So, instead of these in target lists of remote queries:

SELECT CASE WHEN (r1.*)::text IS NOT NULL THEN ROW (r1.*) END, ...

Just:

SELECT r1, ...

It seems to produce the correct result. Although, I may be missing
something because CASE WHEN solution seems to me to be deliberately chosen.

In any case, attached patch doing the above did not change the results of
related regression tests (plans obviously did change since they don't
output the CASE WHENs in target lists anymore).

Also see the example below:

create extension postgres_fdw;
create server myserver foreign data wrapper postgres_fdw options (dbname
'postgres', use_remote_estimate 'true');
create user mapping for CURRENT_USER server myserver;

create table t1(a int, b int);
create table t2(a int, b int);

create foreign table ft1(a int, b int) server myserver options (table_name
't1');
create foreign table ft2(a int, b int) server myserver options (table_name
't2');

insert into t1 values (1), (2);
insert into t1 values (null, null);

insert into t2 values (1);
insert into t2 values (1, 2);

explain (costs off, verbose) select t1, t1 is null, t2, t2 is null from
ft1 t1 left join ft2 t2 on (t1.a = t2.a);
QUERY PLAN

---------------------------------------------------------------------------------------------
Foreign Scan
Output: t1.*, (t1.* IS NULL), t2.*, (t2.* IS NULL)
Relations: (public.ft1 t1) LEFT JOIN (public.ft2 t2)
Remote SQL: SELECT r1, r2 FROM (public.t1 r1 LEFT JOIN public.t2 r2 ON
(((r1.a = r2.a))))
(4 rows)

select t1, t1 is null as t1null, t2, t2 is null as t2null from ft1 t1 left
join ft2 t2 on (t1.a = t2.a);
t1 | t1null | t2 | t2null
------+--------+-------+--------
(1,) | f | (1,) | f
(1,) | f | (1,2) | f
(2,) | f | | t
(,) | t | | t
(4 rows))

alter server myserver options (set use_remote_estimate 'false');
analyze;

explain (costs off, verbose) select t1, t1 is null, t2, t2 is null from
ft1 t1 left join ft2 t2 on (t1.a = t2.a);
QUERY PLAN
------------------------------------------------------
Merge Left Join
Output: t1.*, (t1.* IS NULL), t2.*, (t2.* IS NULL)
Merge Cond: (t1.a = t2.a)
-> Sort
Output: t1.*, t1.a
Sort Key: t1.a
-> Foreign Scan on public.ft1 t1
Output: t1.*, t1.a
Remote SQL: SELECT a, b FROM public.t1
-> Sort
Output: t2.*, t2.a
Sort Key: t2.a
-> Foreign Scan on public.ft2 t2
Output: t2.*, t2.a
Remote SQL: SELECT a, b FROM public.t2
(15 rows)

select t1, t1 is null as t1null, t2, t2 is null as t2null from ft1 t1 left
join ft2 t2 on (t1.a = t2.a);
t1 | t1null | t2 | t2null
------+--------+-------+--------
(1,) | f | (1,) | f
(1,) | f | (1,2) | f
(2,) | f | | t
(,) | t | | t
(4 rows)

And produces the correct result for Rushabh's case.

Thoughts?

Thanks,
Amit

Attachment Content-Type Size
whole-row-var-deparse-1.patch text/x-diff 19.2 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Ashutosh Bapat 2016-06-24 08:38:20 Re: Postgres_fdw join pushdown - wrong results with whole-row reference
Previous Message Haroon Muhammad 2016-06-24 07:19:25 Re: initdb issue on 64-bit Windows - (Was: [pgsql-packagers] PG 9.6beta2 tarballs are ready)