Re: BUG #9898: WindowAgg's causing horrific plans

From: bricklen <bricklen(at)gmail(dot)com>
To: jeff(at)jefftrout(dot)com
Cc: pgsql-bugs(at)postgresql(dot)org
Subject: Re: BUG #9898: WindowAgg's causing horrific plans
Date: 2014-04-08 01:00:05
Message-ID: CAGrpgQ8UXzZxxToEbSxk45WTHa6C5GLyXE5rywsGVBzrj-CX9w@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs

On Mon, Apr 7, 2014 at 11:33 AM, <jeff(at)jefftrout(dot)com> wrote:

> The following bug has been logged on the website:
>
> Bug reference: 9898
> Logged by: Jeff Trout
> Email address: jeff(at)jefftrout(dot)com
> PostgreSQL version: 9.3.4
> Operating system: OSX Mavericks
> Description:
>
> So I've encountered an issue present in PG 9.1, 9.2 and 9.3.
>
> Before I get going, I'll note that in
> http://www.postgresql.org/message-id/32583.1384469029@sss.pgh.pa.us
> Tom saying there is no optimization around windowagg's, which is the heart
> of
> my issue. I have some hopes perhaps there's some low hanging fruit...
>
> A query is involved with a WindowAgg can cause a
> very bad plan to be used instead of an optimial one.
>
> The original table I encountered this behavior on has about 9M rows.
> There are indexes on both datefiled and recieved (timestamps). I tried an
> index
> on datefiled, recieved but it did not do anything.
>
> First query:
> explain analyze
> select xx
> from mytable e
> where
> e.datefiled > current_day() - '30 days'::interval
> order by received desc
> limit 50;
>
> I get a perfectly sensible plan:
>
> Limit (cost=0.00..358.46 rows=50 width=12) (actual time=1.461..1.582
> rows=50 loops=1)
> -> Index Scan Backward using mytable_received_idx on mytable e
> (cost=0.00..438183.50 rows=61121 width=12) (actual time=1.459..1.572
> rows=50
> loops=1)
> Filter: (datefiled > '2014-03-08 00:00:00'::timestamp without time
> zone)
> Rows Removed by Filter: 27
> Total runtime: 3.818 ms
> (5 rows)
>
> Now, lets throw a window agg in there -
> explain analyze
> select xx, count(*) over ()
> from mytable e
> where
> e.datefiled > current_day() - '30 days'::interval
> order by received desc
> limit 50;
>
> (the count is to return total results, to display paging info)
>
> and we get this plan:
>
> Limit (cost=0.00..359.08 rows=50 width=12) (actual
> time=79088.099..79088.131 rows=50 loops=1)
> -> WindowAgg (cost=0.00..438947.51 rows=61121 width=12) (actual
> time=79088.095..79088.116 rows=50 loops=1)
> -> Index Scan Backward using mytable_received_idx on mytable e
> (cost=0.00..438183.50 rows=61121 width=12) (actual time=0.022..79060.177
> rows=55949 loops=1)
> Filter: (datefiled > '2014-03-08 00:00:00'::timestamp
> without
> time zone)
> Rows Removed by Filter: 9221863
> Total runtime: 79104.066 ms
>

Does the following query change your plan noticeably?

explain analyze
select xx, count(*) over ()
from (
select xx
from mytable e
where
e.datefiled > current_day() - '30 days'::interval
order by received desc
limit 50) y;

In response to

Responses

Browse pgsql-bugs by date

  From Date Subject
Next Message Michael Paquier 2014-04-08 01:35:27 Re: BUG #9894: SQLError() is not returns SQL_NO_DATA_FOUND
Previous Message Michael Paquier 2014-04-08 00:22:12 Re: BUG #9895: Duplicate pkey