Re: limit is sometimes not pushed in view with order

From: Rikard Pavelic <rikard(at)ngs(dot)hr>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: limit is sometimes not pushed in view with order
Date: 2013-04-13 18:08:16
Message-ID: 20130413200816.00007f1b@unknown
Lists: pgsql-performance

On Sat, 13 Apr 2013 11:21:19 -0400
Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:

> The planner realizes that about 1/200th of the table satisfies the
> "ord" condition, so it estimates that the first plan will require
> scanning about 2000 entries in the pkey index to get 10 results. So
> that looks significantly cheaper than the other plan, which would
> require 5000 index fetches, not to mention a sort step.

Is it really realizing that?
Maybe I oversimplified my example.
I was trying to show that the planner does not use the limit
information when it should. For example, if there were joins, it would
perform the joins first, using the filter estimate, and only apply the
limit at the end.
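To illustrate what I mean (a hypothetical sketch; the table and column
names here are invented, not from my earlier example): when the join
cannot drop or duplicate rows, the limit could be applied to the
driving table before the join instead of after it:

```sql
-- as planned today: join everything, then limit
-- select t.id, o.info
-- from t left join other_table o on o.t_id = t.id
-- order by t.ord limit 10;

-- pushed-down equivalent: limit the driving table first, then join;
-- only safe when the join neither drops nor duplicates rows
-- (e.g. a left join to a unique key)
select t.id, o.info
from (
  select id, ord
  from t
  order by ord
  limit 10
) t
left join other_table o on o.t_id = t.id
order by t.ord;
```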

>
> Now, in this artificial test case, the cost estimate is wrong because
> "i" and "ord" are perfectly correlated and all the desired rows are
> quite far down the descending-i index scan; so the chosen plan
> actually has to scan a lot more than 2000 index entries. In a more
> realistic case that plan would probably work noticeably better.
> However, the planner has no statistics that would tell it about the
> degree of order correlation of the two columns, so it's not able to
> find that out.
>

That's actually a pretty realistic scenario (if you change the ord int
to a created timestamp), but yeah, it's probably best to create a
composite index for that scenario.
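For the record, the composite index I have in mind would look like this
(assuming the table from my earlier example was named t, with the
columns i and ord mentioned above):

```sql
-- lets the planner satisfy the ord filter and the order by i desc
-- from one index, without having to guess about column correlation
create index ix_t_ord_i on t (ord, i desc);
```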

But, look at this standard ERP example:

create table main_table (
  id serial primary key,
  data varchar,
  created timestamptz
);
create table detail_table (
  main_id int references main_table,
  index int,
  data varchar,
  primary key (main_id, index)
);

create view main_view as
select m.id, m.data, m.created, d.details
from main_table m
left join
(
select main_id, array_agg(d order by index) details
from detail_table d
group by main_id
) d on m.id = d.main_id
order by m.created desc;

insert into main_table select i, i::text, now() + (i/10 || ' sec')::interval from generate_series(1,100001) i;
insert into detail_table select i/5 + 1, i%5, i::text from generate_series(1,500000) i;

create index ix_created on main_table(created desc);
analyze main_table;
analyze detail_table;

explain analyze select * from main_view m where m.created >= now() + '1 min'::interval and m.created <= now() + '5 min'::interval limit 10;

Limit  (cost=0.01..22913.81 rows=10 width=49) (actual time=35.548..920.034 rows=10 loops=1)
  ->  Nested Loop Left Join  (cost=0.01..5879654.29 rows=2566 width=49) (actual time=35.546..920.028 rows=10 loops=1)
        Join Filter: (m.id = d.main_id)
        Rows Removed by Join Filter: 904978
        ->  Index Scan using ix_created on main_table m  (cost=0.01..98.79 rows=2521 width=17) (actual time=0.059..0.103 rows=10 loops=1)
              Index Cond: ((created >= (now() + '00:01:00'::interval)) AND (created <= (now() + '00:05:00'::interval)))
        ->  Materialize  (cost=0.00..25343.93 rows=101773 width=36) (actual time=0.012..84.037 rows=90499 loops=10)
              ->  Subquery Scan on d  (cost=0.00..24039.07 rows=101773 width=36) (actual time=0.036..630.576 rows=100001 loops=1)
                    ->  GroupAggregate  (cost=0.00..23021.34 rows=101773 width=46) (actual time=0.035..619.240 rows=100001 loops=1)
                          ->  Index Scan using detail_table_pkey on detail_table d  (cost=0.00..19249.17 rows=500000 width=46) (actual time=0.012..154.834 rows=500000 loops=1)
Total runtime: 922.272 ms

While one could argue that the optimizer doesn't know how to optimize
a left join against an aggregate grouped by its primary key, you can
replace that join with other joins (e.g. a left join to another
table's primary key) and the same behavior appears: all tables are
joined first and the limit is applied only at the end.
That's why I asked whether this fence against pushing the limit down
is known behavior.
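As a workaround one can push the limit by hand, moving the filter,
order by and limit into a subquery over main_table before the join (a
sketch of the same query as above):

```sql
-- limit the driving rows first, then attach the aggregated details;
-- safe here because the left join to the grouped primary key can
-- neither drop nor duplicate rows from main_table
select m.id, m.data, m.created, d.details
from (
  select id, data, created
  from main_table
  where created >= now() + '1 min'::interval
    and created <= now() + '5 min'::interval
  order by created desc
  limit 10
) m
left join (
  select main_id, array_agg(d order by index) details
  from detail_table d
  group by main_id
) d on m.id = d.main_id
order by m.created desc;
```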

Since this behavior is really important to me, I will spend a lot of
time looking at Postgres to try and improve this.

Regards,
Rikard
