No optimization with a partition window in a view

From: Pierre <pinaraf(at)pinaraf(dot)info>
To: pgsql-bugs(at)postgresql(dot)org
Subject: No optimization with a partition window in a view
Date: 2014-02-15 08:04:01
Message-ID: 6539554.0HoVISsQxY@peanuts2
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs

Hello

I tried to avoid implementing window functions inside my ORM by using a view,
but it seems the optimizer is missing an obvious optimization and thus doing a
full table scan.

Affected versions : 9.2.4 and 9.3.2 (9.4 not tested yet)

How to reproduce :

=> create table test_history (i serial, piece integer not null, date timestamp
with time zone default now(), location integer not null);

=> create table test_history (i serial, piece integer not null, date timestamp
with time zone default now(), location integer not null);
CREATE TABLE

=> insert into test_history (piece, location) select i, 1 from
generate_series(1, 1000000) i;
INSERT 0 1000000
=> insert into test_history (piece, location) select i, 2 from
generate_series(1, 1000000) i;
INSERT 0 1000000
=> insert into test_history (piece, location) select i, 3 from
generate_series(1, 1000000) i;
^[[AINSERT 0 1000000
=> insert into test_history (piece, location) select i, 4 from
generate_series(1, 1000000, 2) i;
INSERT 0 500000

=> alter table test_history add primary key(i);
ALTER TABLE
=> create index on test_history(piece);
CREATE INDEX

=> select * from test_history where piece = 42;
i | piece | date | location
---------+-------+-------------------------------+----------
42 | 42 | 2014-02-15 08:52:50.946586+01 | 1
1000042 | 42 | 2014-02-15 08:52:56.634685+01 | 2
2000042 | 42 | 2014-02-15 08:53:00.762706+01 | 3
(3 rows)

Time: 0.158 ms

=> select *, lag(location, 1) over w, lead(location, 1) over w from test_history
where piece = 42 window w as (partition by piece order by date);
i | piece | date | location | lag | lead
---------+-------+-------------------------------+----------+-----+------
42 | 42 | 2014-02-15 08:52:50.946586+01 | 1 | | 2
1000042 | 42 | 2014-02-15 08:52:56.634685+01 | 2 | 1 | 3
2000042 | 42 | 2014-02-15 08:53:00.762706+01 | 3 | 2 |
(3 rows)

Time: 0.203 ms

=> create view test_history_lag_lead as select *, lag(location, 1) over w,
lead(location, 1) over w from test_history window w as (partition by piece order
by date);
CREATE VIEW
=> select * from test_history_lag_lead where piece = 42;
i | piece | date | location | lag | lead
---------+-------+-------------------------------+----------+-----+------
42 | 42 | 2014-02-15 08:52:50.946586+01 | 1 | | 2
1000042 | 42 | 2014-02-15 08:52:56.634685+01 | 2 | 1 | 3
2000042 | 42 | 2014-02-15 08:53:00.762706+01 | 3 | 2 |
(3 rows)

Time: 2915.756 ms

=> explain analyze select * from test_history_lag_lead where piece = 42;
QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------
Subquery Scan on test_history_lag_lead (cost=653058.16..775558.16 rows=4
width=28) (actual time=1411.423..3375.794 rows=3 loops=1)
Filter: (test_history_lag_lead.piece = 42)
Rows Removed by Filter: 3499997
-> WindowAgg (cost=653058.16..731808.16 rows=3500000 width=20) (actual
time=1411.343..3206.959 rows=3500000 loops=1)
-> Sort (cost=653058.16..661808.16 rows=3500000 width=20) (actual
time=1411.337..1867.895 rows=3500000 loops=1)
Sort Key: test_history.piece, test_history.date
Sort Method: external merge Disk: 116328kB
-> Seq Scan on test_history (cost=0.00..57293.00 rows=3500000
width=20) (actual time=0.004..310.558 rows=3500000 loops=1)
Total runtime: 3386.455 ms
(9 rows)

=> explain analyze select *, lag(location, 1) over w, lead(location, 1) over w
from test_history where piece = 42 window w as (partition by piece order by
date);
QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------------------
WindowAgg (cost=19.70..19.78 rows=4 width=20) (actual time=0.018..0.019 rows=3
loops=1)
-> Sort (cost=19.70..19.71 rows=4 width=20) (actual time=0.015..0.015
rows=3 loops=1)
Sort Key: date
Sort Method: quicksort Memory: 25kB
-> Index Scan using test_history_piece_idx on test_history
(cost=0.43..19.66 rows=4 width=20) (actual time=0.009..0.010 rows=3 loops=1)
Index Cond: (piece = 42)
Total runtime: 0.037 ms
(7 rows)

As you can see, the optimizer decided to do the WindowAgg first on the 3,5M
lines before filtering, instead of filtering using an Index scan and then doing
the aggregation.
IMHO, that optimization would be a special case possible only with windows on a
partition that is contained in the filtering fields, but I aint no expert here
:)

Thanks

Pierre Ducroquet

Responses

Browse pgsql-bugs by date

  From Date Subject
Next Message Clemens Eisserer 2014-02-15 08:59:25 Re: BUG #9161: wal_writer_delay is limited to 10s
Previous Message Bruce Momjian 2014-02-15 05:07:20 Re: Bad behaviour on some geometric operations on degenerate polygons