| From: | bricklen <bricklen(at)gmail(dot)com> | 
|---|---|
| To: | jeff(at)jefftrout(dot)com | 
| Cc: | pgsql-bugs(at)postgresql(dot)org | 
| Subject: | Re: BUG #9898: WindowAgg's causing horrific plans | 
| Date: | 2014-04-08 01:00:05 | 
| Message-ID: | CAGrpgQ8UXzZxxToEbSxk45WTHa6C5GLyXE5rywsGVBzrj-CX9w@mail.gmail.com | 
| Views: | Whole Thread | Raw Message | Download mbox | Resend email | 
| Thread: | |
| Lists: | pgsql-bugs | 
On Mon, Apr 7, 2014 at 11:33 AM, <jeff(at)jefftrout(dot)com> wrote:
> The following bug has been logged on the website:
>
> Bug reference:      9898
> Logged by:          Jeff Trout
> Email address:      jeff(at)jefftrout(dot)com
> PostgreSQL version: 9.3.4
> Operating system:   OSX Mavericks
> Description:
>
> So I've encountered an issue present in PG 9.1, 9.2 and 9.3.
>
> Before I get going, I'll note that in
> http://www.postgresql.org/message-id/32583.1384469029@sss.pgh.pa.us
> Tom saying there is no optimization around windowagg's, which is the heart
> of
> my issue. I have some hopes perhaps there's some low hanging fruit...
>
> A query is involved with a WindowAgg can cause a
> very bad plan to be used instead of an optimial one.
>
> The original table I encountered this behavior on has about 9M rows.
> There are indexes on both datefiled and recieved (timestamps). I tried an
> index
> on datefiled, recieved but it did not do anything.
>
> First query:
> explain analyze
> select xx
> from mytable e
> where
>                  e.datefiled > current_day() - '30 days'::interval
>                 order by received desc
>                 limit 50;
>
> I get a perfectly sensible plan:
>
>  Limit  (cost=0.00..358.46 rows=50 width=12) (actual time=1.461..1.582
> rows=50 loops=1)
>    ->  Index Scan Backward using mytable_received_idx on mytable e
> (cost=0.00..438183.50 rows=61121 width=12) (actual time=1.459..1.572
> rows=50
> loops=1)
>          Filter: (datefiled > '2014-03-08 00:00:00'::timestamp without time
> zone)
>          Rows Removed by Filter: 27
>  Total runtime: 3.818 ms
> (5 rows)
>
> Now, lets throw a window agg in there -
> explain analyze
> select xx, count(*) over ()
> from mytable e
> where
>                  e.datefiled > current_day() - '30 days'::interval
>                 order by received desc
>                 limit 50;
>
> (the count is to return total results, to display paging info)
>
> and we get this plan:
>
>  Limit  (cost=0.00..359.08 rows=50 width=12) (actual
> time=79088.099..79088.131 rows=50 loops=1)
>    ->  WindowAgg  (cost=0.00..438947.51 rows=61121 width=12) (actual
> time=79088.095..79088.116 rows=50 loops=1)
>          ->  Index Scan Backward using mytable_received_idx on mytable e
> (cost=0.00..438183.50 rows=61121 width=12) (actual time=0.022..79060.177
> rows=55949 loops=1)
>                Filter: (datefiled > '2014-03-08 00:00:00'::timestamp
> without
> time zone)
>                Rows Removed by Filter: 9221863
>  Total runtime: 79104.066 ms
>
Does the following query change your plan noticeably?
explain analyze
select xx, count(*) over ()
from (
    select xx
    from mytable e
    where
    e.datefiled > current_day() - '30 days'::interval
    order by received desc
    limit 50) y;
| From | Date | Subject | |
|---|---|---|---|
| Next Message | Michael Paquier | 2014-04-08 01:35:27 | Re: BUG #9894: SQLError() is not returns SQL_NO_DATA_FOUND | 
| Previous Message | Michael Paquier | 2014-04-08 00:22:12 | Re: BUG #9895: Duplicate pkey |