BUG #14302: SQL with LIMIT degrades performance seriously

From: chenkaijiang(at)gmail(dot)com
To: pgsql-bugs(at)postgresql(dot)org
Subject: BUG #14302: SQL with LIMIT degrades performance seriously
Date: 2016-08-30 06:48:58
Message-ID: 20160830064858.15676.40929@wrigleys.postgresql.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs

The following bug has been logged on the website:

Bug reference: 14302
Logged by: Kaijiang Chen
Email address: chenkaijiang(at)gmail(dot)com
PostgreSQL version: 9.5.3
Operating system: CentOS 7.2
Description:

I have a table, renren.user_relations with 10 partitions and contains about
10M records.

I created 2 indexes on it (including parent table and all partitions):
1) user_relations_pkey -- btree index on user_id column
2) user_relations_0_parent_id_idx -- btree index on parent_id

I ran the SQL "select * from renren.user_relations where parent_id=846346
order by user_id limit 10;" and it took 1.4 sec, which is slow.
But "select * from renren.user_relations where parent_id=846346 order by
user_id" (no LIMIT) and it took 100+ ms.

the explain result:

explain select * from renren.user_relations where parent_id=846346 order by
user_id limit 10;

QUERY PLAN
-------------------------------------------------------------------------------------------------------------------
Limit (cost=4.57..442.35 rows=10 width=102)
-> Merge Append (cost=4.57..496534.92 rows=11342 width=102)
Sort Key: user_relations.user_id
-> Index Scan using user_relations_pkey on user_relations
(cost=0.12..5.24 rows=1 width=27)
Filter: (parent_id = 846346)
-> Index Scan using user_relations_0_pkey on user_relations_0
(cost=0.42..49426.24 rows=862 width=102)
Filter: (parent_id = 846346)
-> Index Scan using user_relations_1_pkey on user_relations_1
(cost=0.42..49437.21 rows=1022 width=102)
Filter: (parent_id = 846346)
-> Index Scan using user_relations_2_pkey on user_relations_2
(cost=0.42..49681.23 rows=973 width=103)
Filter: (parent_id = 846346)
-> Index Scan using user_relations_3_pkey on user_relations_3
(cost=0.42..49427.38 rows=990 width=102)
Filter: (parent_id = 846346)
-> Index Scan using user_relations_4_pkey on user_relations_4
(cost=0.42..49713.95 rows=941 width=102)
Filter: (parent_id = 846346)
-> Index Scan using user_relations_5_pkey on user_relations_5
(cost=0.42..49711.65 rows=1687 width=102)
Filter: (parent_id = 846346)
-> Index Scan using user_relations_6_pkey on user_relations_6
(cost=0.42..49676.60 rows=1104 width=102)
Filter: (parent_id = 846346)
-> Index Scan using user_relations_7_pkey on user_relations_7
(cost=0.42..49715.27 rows=1168 width=102)
Filter: (parent_id = 846346)
-> Index Scan using user_relations_8_pkey on user_relations_8
(cost=0.42..49698.55 rows=1168 width=102)
Filter: (parent_id = 846346)
-> Index Scan using user_relations_9_pkey on user_relations_9
(cost=0.42..49620.69 rows=1426 width=103)
Filter: (parent_id = 846346)
(25 rows)

Time: 1.344 ms

It uses the Index Scan using index on user_id, which is very stupid.

If I explain select * from renren.user_relations where parent_id=846346
order by user_id, then it uses the index on parent_id to get records and
then sort it, which is very wise since the number of qualified records is
1725.

Finally, I got a way to work around:

explain with tmp as (select * from renren.user_relations where
parent_id=846346 order by user_id) select user_id from tmp limit 10;
QUERY PLAN

-----------------------------------------------------------------------------------------------------------------------------------
Limit (cost=23065.24..23065.44 rows=10 width=8)
CTE tmp
-> Sort (cost=23036.89..23065.24 rows=11342 width=102)
Sort Key: user_relations.user_id
-> Append (cost=0.00..22273.04 rows=11342 width=102)
-> Seq Scan on user_relations (cost=0.00..0.00 rows=1
width=27)
Filter: (parent_id = 846346)
-> Index Scan using user_relations_0_parent_id_idx on
user_relations_0 (cost=0.42..1682.49 rows=862 width=102)
Index Cond: (parent_id = 846346)
-> Index Scan using user_relations_1_parent_id_idx on
user_relations_1 (cost=0.42..2156.02 rows=1022 width=102)
Index Cond: (parent_id = 846346)
-> Index Scan using user_relations_2_parent_id_idx on
user_relations_2 (cost=0.42..1887.71 rows=973 width=103)
Index Cond: (parent_id = 846346)
-> Index Scan using user_relations_3_parent_id_idx on
user_relations_3 (cost=0.42..2035.95 rows=990 width=102)
Index Cond: (parent_id = 846346)
-> Index Scan using user_relations_4_parent_id_idx on
user_relations_4 (cost=0.42..1805.29 rows=941 width=102)
Index Cond: (parent_id = 846346)
-> Index Scan using user_relations_5_parent_id_idx on
user_relations_5 (cost=0.42..3240.29 rows=1687 width=102)
Index Cond: (parent_id = 846346)
-> Index Scan using user_relations_6_parent_id_idx on
user_relations_6 (cost=0.42..2149.10 rows=1104 width=102)
Index Cond: (parent_id = 846346)
-> Index Scan using user_relations_7_parent_id_idx on
user_relations_7 (cost=0.42..2256.53 rows=1168 width=102)
Index Cond: (parent_id = 846346)
-> Index Scan using user_relations_8_parent_id_idx on
user_relations_8 (cost=0.42..2304.51 rows=1168 width=102)
Index Cond: (parent_id = 846346)
-> Index Scan using user_relations_9_parent_id_idx on
user_relations_9 (cost=0.42..2755.16 rows=1426 width=103)
Index Cond: (parent_id = 846346)
-> CTE Scan on tmp (cost=0.00..226.84 rows=11342 width=8)
(28 rows)

Time: 1.409 ms

and "with tmp as (select * from renren.user_relations where parent_id=846346
order by user_id) select user_id from tmp limit 10;" took only 100+ ms.

So, I think the optimizer/planner has a performance bug with LIMIT clause.

Responses

Browse pgsql-bugs by date

  From Date Subject
Next Message Devrim Gündüz 2016-08-30 08:24:03 Re: BUG #14299: initdb and man pages are not installed in the alternatives system
Previous Message Rikard Pavelic 2016-08-29 20:53:04 Re: BUG #14301: function in case expression called when it should not be