Re: MIN/MAX optimization for partitioned table

From: Greg Stark <gsstark(at)mit(dot)edu>
To: Alan Li <ali(at)truviso(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: MIN/MAX optimization for partitioned table
Date: 2009-07-20 19:29:09
Message-ID: 407d949e0907201229s22e375edh219cbe9e2ddd0c67@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, Jul 20, 2009 at 7:40 PM, Alan Li<ali(at)truviso(dot)com> wrote:
> Attached is an updated patch that removes the O(n^2) behavior and the
> awkwardness of optimizing the seqscan path as the plan was about to be
> created.  Now, the optimization is considered when appendrel is generating
> the paths.
>
> I also changed the plan as you had suggested.  It now looks like this:

Hm, that's not quite the plan I described either. I had in mind to
mirror the existing min/max optimization which put the whole thing in
an InitPlan and put a Result node in place of the actual plan. Your
original patch did that for each subplan but I thought it would be
better to do it for the whole aggregate.

However the more I think about it the more I don't understand why Tom
arranged to do that for the min/max optimization anyways. For
subqueries where that makes sense that would surely happen anyways
such as in the first example below. And for joins where it's necessary
the planner knows to put a Materialize node which sounds just as good.

Here's what happens with your current patch in the case I was
concerned about -- note that the planner automatically detects the
case and turns the whole subplan into an initplan anyways:

postgres=# explain select * from y where j = (select min(i) from x) ;
QUERY PLAN
----------------------------------------------------------------------------------------------
Seq Scan on y (cost=40.12..80.12 rows=12 width=4)
Filter: (j = $0)
InitPlan 1 (returns $0)
-> Aggregate (cost=40.11..40.12 rows=1 width=4)
-> Append (cost=0.00..34.10 rows=2403 width=4)
-> Limit (cost=0.00..0.03 rows=1 width=4)
-> Index Scan using xi on x (cost=0.00..80.25
rows=2400 width=4)
-> Limit (cost=0.00..0.03 rows=1 width=4)
-> Index Scan using xi2 on x2 x
(cost=0.00..80.25 rows=2400 width=4)
-> Limit (cost=0.00..0.03 rows=1 width=4)
-> Index Scan using xi3 on x3 x
(cost=0.00..80.25 rows=2400 width=4)
-> Seq Scan on x1 x (cost=0.00..34.00 rows=2400 width=4)
(12 rows)

And here's another case where you wouldn't want multiple execution --
but the planner here figures out to materialize the result:

postgres=# explain select * from y left outer join (select min(i) as i
from x) as x on (i=j);
QUERY PLAN
--------------------------------------------------------------------------------------------------
Nested Loop Left Join (cost=40.13..128.13 rows=2400 width=8)
Join Filter: ((min(public.x.i)) = y.j)
-> Seq Scan on y (cost=0.00..34.00 rows=2400 width=4)
-> Materialize (cost=40.13..40.14 rows=1 width=4)
-> Aggregate (cost=40.11..40.12 rows=1 width=4)
-> Append (cost=0.00..34.10 rows=2403 width=4)
-> Limit (cost=0.00..0.03 rows=1 width=4)
-> Index Scan using xi on x
(cost=0.00..80.25 rows=2400 width=4)
-> Limit (cost=0.00..0.03 rows=1 width=4)
-> Index Scan using xi2 on x2 x
(cost=0.00..80.25 rows=2400 width=4)
-> Limit (cost=0.00..0.03 rows=1 width=4)
-> Index Scan using xi3 on x3 x
(cost=0.00..80.25 rows=2400 width=4)
-> Seq Scan on x1 x (cost=0.00..34.00 rows=2400 width=4)
(13 rows)

So I'm a bit puzzled why Tom's min/max optimization bothers with the
whole Initplan/Result business itself anyways.

--
greg
http://mit.edu/~gsstark/resume.pdf

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Ron Mayer 2009-07-20 19:33:38 Re: [PATCH] SE-PgSQL/tiny rev.2193
Previous Message Tom Lane 2009-07-20 19:25:25 Re: [PATCH v4] Avoid manual shift-and-test logic in AllocSetFreeIndex