Inefficient plan on 10.4

From: "Peter J(dot) Holzer" <hjp-pgsql(at)hjp(dot)at>
To: pgsql-general(at)postgresql(dot)org
Subject: Inefficient plan on 10.4
Date: 2018-07-05 13:17:51
Message-ID: 20180705131751.so2x3xxemmtr24nj@hjp.at
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

I have this table:

Table "public.relation"
Column | Type | Collation | Nullable | Default
-------------------+-----------------------------+-----------+----------+--------------------------------------
parent | integer | | not null |
child | integer | | not null |
type | character varying | | |
sortorder | integer | | |
valid_from | timestamp without time zone | | not null | now()
from_job_queue_id | integer | | |
id | integer | | not null | nextval('relation_id_seq'::regclass)
Indexes:
"relation_pkey" PRIMARY KEY, btree (id)
"relation_child_idx" btree (child)
"relation_parent_idx" btree (parent)
Foreign-key constraints:
"relation_child_fkey" FOREIGN KEY (child) REFERENCES concept(id) DEFERRABLE
"relation_parent_fkey" FOREIGN KEY (parent) REFERENCES concept(id) DEFERRABLE

which has about 150 million rows:

wdsah=> select count(*) from relation;
count
-----------
147810590
(1 row)

I'm trying to get to get the siblings of a node (including itself):

select r2.parent, r2.type, r2.child
from relation r1, relation r2
where r1.child=643541 and r2.parent=r1.parent
order by r2.type

This worked fine on 9.5, but on 10.4 it takes several seconds:

QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------------------------
Gather Merge (cost=12032221.68..25195781.12 rows=112822632 width=15) (actual time=4086.255..4086.257 rows=2 loops=1)
Workers Planned: 2
Workers Launched: 2
-> Sort (cost=12031221.66..12172249.95 rows=56411316 width=15) (actual time=4080.862..4080.862 rows=1 loops=3)
Sort Key: r2.type
Sort Method: quicksort Memory: 25kB
-> Nested Loop (cost=0.57..1876228.04 rows=56411316 width=15) (actual time=3036.547..4080.826 rows=1 loops=3)
-> Parallel Seq Scan on relation r1 (cost=0.00..1856722.83 rows=1 width=4) (actual time=3036.525..4080.802 rows=0 loops=3)
Filter: (child = 643541)
Rows Removed by Filter: 49270196
-> Index Scan using relation_parent_idx on relation r2 (cost=0.57..17041.69 rows=246351 width=15) (actual time=0.053..0.055 rows=2 loops=1)
Index Cond: (parent = r1.parent)
Planning time: 0.418 ms
Execution time: 4090.442 ms

If I disable sequential scans, it works fine again:

QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------------------
Sort (cost=25246497.92..25584965.81 rows=135387158 width=15) (actual time=0.119..0.119 rows=2 loops=1)
Sort Key: r2.type
Sort Method: quicksort Memory: 25kB
-> Nested Loop (cost=1.14..19513.79 rows=135387158 width=15) (actual time=0.101..0.104 rows=2 loops=1)
-> Index Scan using relation_child_idx on relation r1 (cost=0.57..8.59 rows=1 width=4) (actual time=0.079..0.080 rows=1 loops=1)
Index Cond: (child = 643541)
-> Index Scan using relation_parent_idx on relation r2 (cost=0.57..17041.69 rows=246351 width=15) (actual time=0.018..0.020 rows=2 loops=1)
Index Cond: (parent = r1.parent)
Planning time: 0.446 ms
Execution time: 0.165 ms

There are several points about these plans that I don't understand:

* Why does the nested loop expect 56E6 or even 135E6 rows? It expects 1
row for the outer table and then 246351 rows in the inner table for
each of them. 1 * 246351 == 246351. So it should expect 246351 rows.
(246351 itself is way too high, but the table has a very skewed
distribution, and I've already set the statistics target to the
maximum of 10000, so there's not much I can do about that)

* Why does the Parallel Seq Scan report actual rows=0? It did return 1
row (or is that the average per worker? That would be < 1, and
probably rounded down to 0)

hp

PS: The obvious workaround is to remove "order by r2.type". I can easily
get the required partial order in the application. But I'd like to
understand what the optimizer is doing here.

--
_ | Peter J. Holzer | we build much bigger, better disasters now
|_|_) | | because we have much more sophisticated
| | | hjp(at)hjp(dot)at | management tools.
__/ | http://www.hjp.at/ | -- Ross Anderson <https://www.edge.org/>

Browse pgsql-general by date

  From Date Subject
Next Message David G. Johnston 2018-07-05 13:18:11 Re: FK v.s unique indexes
Previous Message Tom van Tilburg 2018-07-05 12:44:31 Re: Postgres sometimes stalling on 'percentile_cont'