Re: Slow plan for MAX/MIN or LIMIT 1?

From: Merlin Moncure <mmoncure(at)gmail(dot)com>
To: Sam Wong <sam(at)hellosam(dot)net>
Cc: postgres performance list <pgsql-performance(at)postgresql(dot)org>
Subject: Re: Slow plan for MAX/MIN or LIMIT 1?
Date: 2013-09-24 21:49:10
Message-ID: CAHyXU0yt5xEgC-zZ3UmZqXhH2FKH9EJpQ8xC+=6VwLAjy4OSwg@mail.gmail.com
Lists: pgsql-performance

On Tue, Sep 24, 2013 at 4:24 AM, Sam Wong <sam(at)hellosam(dot)net> wrote:
> Hi There,
>
>
>
> I have hit a query plan issue that I believe is a bug or an under-estimation,
> and would like to know if it is known or if there is any workaround…
>
>
>
> This event_log table has 4 million rows.
>
> “log_id” is the primary key (bigint),
>
> there is a composite index “event_data_search” over (event::text,
> insert_time::datetime).
>
>
>
> Query A:
>
> SELECT min(log_id) FROM event_log
>
> WHERE event='S-Create' AND
>
> insert_time>'2013-09-15' and insert_time<'2013-09-16'
>
>
>
> Query B:
>
> SELECT log_id FROM event_log
>
> WHERE event='S-Create' AND
>
> insert_time>'2013-09-15' and insert_time<'2013-09-16'
>
> ORDER BY log_id
>
>
>
> What I want to achieve is Query A – the minimum log_id within a range. But it
> is very slow, taking 10 or 20 seconds.
>
> If I don’t restrict the output to a single row – as in Query B, which returns
> every matching row – then it is sub-second fast.
>
>
>
> Explain of A – takes 10~20 seconds to run
>
> Limit (cost=0.00..132.54 rows=1 width=8)
>
> -> Index Scan using event_log_pkey on event_log (cost=0.00..1503484.33
> rows=11344 width=8)
>
> Filter: ((insert_time > '2013-09-15 00:00:00'::timestamp without
> time zone) AND (insert_time < '2013-09-16 00:00:00'::timestamp without time
> zone) AND (event = 'S-Create'::text))
>
>
>
> Explain of B – takes a few milliseconds to run
>
> Sort (cost=41015.85..41021.52 rows=11344 width=8)
>
> Sort Key: log_id
>
> -> Bitmap Heap Scan on event_log (cost=311.42..40863.05 rows=11344
> width=8)
>
> Recheck Cond: ((event = 'S-Create'::text) AND (insert_time >
> '2013-09-15 00:00:00'::timestamp without time zone) AND (insert_time <
> '2013-09-16 00:00:00'::timestamp without time zone))
>
> -> Bitmap Index Scan on event_data_search (cost=0.00..310.86
> rows=11344 width=0)
>
> Index Cond: ((event = 'S-Create'::text) AND (insert_time >
> '2013-09-15 00:00:00'::timestamp without time zone) AND (insert_time <
> '2013-09-16 00:00:00'::timestamp without time zone))
>
>
>
> Plan A assumes the index scan node will produce its first row right at cost
> 0.00, and hence that the limit node can get the one row it needs within
> 132.54 (132.54 ≈ 1503484.33 / 11344, i.e. the scan’s total cost scaled down
> by the fraction of matching rows needed – which only works because
> event_log_pkey returns rows already sorted by log_id).
>
> I would like to point out that this 0.00 estimate omits the fact that it
> actually takes much longer for the index scan node to reach the first
> matching row, because the 3.99M rows it passes over first won’t meet the
> condition filter at all.

I think you got A and B mixed up there. Can you post explain analyze
(not just 'explain') of the slow plan?
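
The request above can be sketched as follows (a sketch only, reusing the
quoted query; the BUFFERS option is assumed available, as it exists in all
PostgreSQL releases since 9.0):

    EXPLAIN (ANALYZE, BUFFERS)
    SELECT min(log_id) FROM event_log
    WHERE event = 'S-Create'
      AND insert_time > '2013-09-15' AND insert_time < '2013-09-16';

Unlike plain EXPLAIN, this actually runs the query and reports real row
counts and timings per node, which shows where the estimate diverges from
reality.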

merlin
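
A workaround often suggested for this class of problem (not proposed in this
message, so treat it as an unverified sketch against the quoted schema) is to
order by an expression instead of the bare column, so the planner cannot match
the ORDER BY to the primary-key index and falls back to the fast bitmap plan:

    -- "log_id + 0" prevents the planner from satisfying the ORDER BY
    -- via a walk of event_log_pkey; the sort over the ~11k matching
    -- rows is cheap by comparison.
    SELECT log_id FROM event_log
    WHERE event = 'S-Create'
      AND insert_time > '2013-09-15' AND insert_time < '2013-09-16'
    ORDER BY log_id + 0
    LIMIT 1;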
