Re: BUG #14302: SQL with LIMIT degrades performance seriously

From: Kaijiang Chen <chenkaijiang(at)gmail(dot)com>
To: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
Cc: Pg Bugs <pgsql-bugs(at)postgresql(dot)org>
Subject: Re: BUG #14302: SQL with LIMIT degrades performance seriously
Date: 2016-08-31 04:02:20
Message-ID: CAAkGvS8HTQ6mSNiRF4U5w8o_=PvL34tdsBp3kY6GLRrQit0WFQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs

Thank you very much for your quick response!

So I know I have to deal with my own solutions. Fortunately, I got the
solution with the "WITH" clause:
with t as (select * from renren.user_relations where parent_id=846346 order
by user_id)
select * from t LIMIT 10;

which separate the ORDER BY and LIMIT to avoid the classic planning problem.

On Tue, Aug 30, 2016 at 11:45 PM, Jeff Janes <jeff(dot)janes(at)gmail(dot)com> wrote:

> On Mon, Aug 29, 2016 at 11:48 PM, <chenkaijiang(at)gmail(dot)com> wrote:
>
>
>>
>> the explain result:
>>
>> explain select * from renren.user_relations where parent_id=846346 order
>> by
>> user_id limit 10;
>>
>> QUERY PLAN
>> ------------------------------------------------------------
>> -------------------------------------------------------
>> Limit (cost=4.57..442.35 rows=10 width=102)
>> -> Merge Append (cost=4.57..496534.92 rows=11342 width=102)
>> Sort Key: user_relations.user_id
>>
> ...
>
>>
>> It uses the Index Scan using index on user_id, which is very stupid.
>>
>
> This a classic planning problem with ORDER BY...LIMIT. Probably parent_id
> is correlated with user_id, and if you pick a high value of parent_id then
> you are implicitly getting high values of user_id. But PostgreSQL doesn't
> know that, it assumes things with parent_id=846346 are randomly dispersed
> over the user_id values, and so it will gather 10 of them very quickly by
> walking the indexes in order.
>
>
>>
>> If I explain select * from renren.user_relations where parent_id=846346
>> order by user_id, then it uses the index on parent_id to get records and
>> then sort it, which is very wise since the number of qualified records is
>> 1725.
>>
>
> You know it is 1725, but PostgreSQL thinks it is 11342. Is autoanalyze
> analyzing often enough? Is default_statistics_target high enough?
> (Although if I'm right about the correlation between parent_id and
> user_id, then fixing that estimate might still not be enough to fix things).
>
>
>> So, I think the optimizer/planner has a performance bug with LIMIT clause.
>>
>
>
> Well, it has to make decisions with the information available to it. That
> is not really a bug. It is constantly being improved, but will never be
> perfect.
>
> Cheers,
>
> Jeff
>

In response to

Responses

Browse pgsql-bugs by date

  From Date Subject
Next Message Kaijiang Chen 2016-08-31 04:09:50 Re: BUG #14302: SQL with LIMIT degrades performance seriously
Previous Message Bruce Momjian 2016-08-30 15:54:04 Re: BUG #14304: WAL files pg_upgrade