Failure to reordering in case of a lateral join in combination with a left join (not inner join) resulting in suboptimal nested loop plan

From: Peter Billen <peter(dot)billen(at)gmail(dot)com>
To: pgsql-performance(at)lists(dot)postgresql(dot)org
Subject: Failure to reordering in case of a lateral join in combination with a left join (not inner join) resulting in suboptimal nested loop plan
Date: 2019-04-30 18:57:07
Message-ID: CAMTXbE8yTu9Yo=KFcYt2VqAQF_VBEZj4EXuM3JVut0rdFq5SxQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Hi,

The queries in what follows can be executed on the following fiddle:
*https://dbfiddle.uk/?rdbms=postgres_10&fiddle=64542f2d987d3ce0d85bbc40ddadf7d6
<https://dbfiddle.uk/?rdbms=postgres_10&fiddle=64542f2d987d3ce0d85bbc40ddadf7d6>*
- Please
note that the queries/functions might look silly/pointless, I extracted the
performance issue I am seeing to a minimal reproducible example.

I have the following schema:

create table parent
(
id int primary key
);
create table child
(
id int primary key,
parent_id int references parent(id)
);
create index on child(parent_id);

Let's start with the following inlinable setof-returning function which
basically returns the child rows for a given parent identifier:

create function child_get_1(int) returns table(a int) as
$$
select child.id
from child
left join parent
on parent.id = child.parent_id
where child.parent_id = $1;
$$
language sql stable;

Note that the left join branch is intentionally not used, and thus could be
eliminated by the planner.

When executing the following query, I get a satisfying hash (left) join
(and the left join to parent is indeed eliminated):

explain analyze
select ch.* from parent p, child_get_1(p.id) ch;

+--------------------------------------------------------------------------------------------------------------------+
| Hash Join (cost=3.25..17163.23 rows=999900 width=4) (actual
time=0.025..194.279 rows=999900 loops=1) |
| Hash Cond: (child.parent_id = p.id)
|
| -> Seq Scan on child (cost=0.00..14424.00 rows=999900 width=8)
(actual time=0.005..47.215 rows=999900 loops=1) |
| -> Hash (cost=2.00..2.00 rows=100 width=4) (actual time=0.016..0.016
rows=100 loops=1) |
| Buckets: 1024 Batches: 1 Memory Usage: 12kB
|
| -> Seq Scan on parent p (cost=0.00..2.00 rows=100 width=4)
(actual time=0.001..0.007 rows=100 loops=1) |
+--------------------------------------------------------------------------------------------------------------------+

Now, I introduce a convenience function, also inlinable, which fetches the
child rows by its parent id:

create function t(int) returns setof child as
$$
select child.* from child where child.parent_id = $1;
$$
language sql stable;

I refactor `child_get_1(int)` from above as following:

create function child_get_2(int) returns table(a int) as
$$
select child.id
from t($1) child
left join parent
on parent.id = child.parent_id;
$$
language sql stable;

explain analyze
select ch.* from parent p, child_get_2(p.id) ch;

Now, I get a nested loop, which as expected performs quite badly:

+--------------------------------------------------------------------------------------------------------------------------------------------+
| Nested Loop (cost=189.92..493990.48 rows=999900 width=4) (actual
time=1.519..713.680 rows=999900 loops=1) |
| -> Seq Scan on parent p (cost=0.00..2.00 rows=100 width=4) (actual
time=0.004..0.081 rows=100 loops=1) |
| -> Bitmap Heap Scan on child (cost=189.92..4739.90 rows=9999 width=4)
(actual time=1.365..6.332 rows=9999 loops=100) |
| Recheck Cond: (parent_id = p.id)
|
| Heap Blocks: exact=442476
|
| -> Bitmap Index Scan on child_parent_id_idx (cost=0.00..187.42
rows=9999 width=0) (actual time=0.838..0.838 rows=9999 loops=100) |
| Index Cond: (parent_id = p.id)
|
+--------------------------------------------------------------------------------------------------------------------------------------------+

For some reason I cannot explain we now end up with a nested loop, instead
an hash join. The fairly trivial introduction of `t(int)` messes up with
reordering, but I fail to see why. I manually upped the from and join
collapse limit to 32 - just to be sure -, but no effect. Also, the left
join branch could not be eliminated. I believe this is related to the usage
of the implicit lateral join to `child_get_2(p.id)` in the main query,
which somehow messes up with the reordering of `from t($1) as child` in
`child_get_2(int)`, though I am not 100% sure.

Also, note that when we apply an inner join instead of a left join, the
problem goes away. The planner now manages to end up with a hash join in
both cases.

I am seeing this on v10 and v11.

Any ideas?

Thank you. Best regards.

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Tom Lane 2019-04-30 20:38:04 Re: Failure to reordering in case of a lateral join in combination with a left join (not inner join) resulting in suboptimal nested loop plan
Previous Message Naik, Sameer 2019-04-30 04:58:50 RE: Re: Generic Plans for Prepared Statement are 158155 times slower than Custom Plans