From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
---|---|
To: | Gregory Stark <stark(at)enterprisedb(dot)com> |
Cc: | "Alvaro Herrera" <alvherre(at)commandprompt(dot)com>, "Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov>, pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: OUTER JOIN performance regression remains in 8.3beta4 |
Date: | 2008-01-09 17:13:40 |
Message-ID: | 7704.1199898820@sss.pgh.pa.us |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers pgsql-patches |
Gregory Stark <stark(at)enterprisedb(dot)com> writes:
> "Alvaro Herrera" <alvherre(at)commandprompt(dot)com> writes:
>> Would it be a good idea to keep removing redundant clauses and rethink
>> the preference for clauseful joins, going forward?
> I don't understand what's going on here. The planner is choosing one join
> order over another because one join has more join clauses than the
> other?
Not more join clauses, but any join clause at all. We will not explore
join paths that don't have any join clause, unless forced to by lack of
any other way to form the result.
> Even
> though some of those joins are entirely redundant and have no selectivity?
You're confusing whether we explore a path (ie, cost it out) with
whether we choose it. It's a necessary precondition, of course,
but we won't pick the path unless it looks cheapest.
Not exploring clauseless join paths is a heuristic that's needed to
avoid exponential growth of the search space in large join problems.
AFAIK every System-R-derived planner has done this.
As an example, consider
t1 join t2 on (...) join t3 on (...) ... join t8 on (...)
and for simplicity suppose that each ON condition relates the new
table to the immediately preceding table, and that we can't derive
any additional join conditions through transitivity. In this situation
there are going to be only seven ways to form a two-base-relation
joinrel, as long as we allow only clauseful joins. But there are
8*7/2 = 28 distinct ways to form a join if we consider all possible
join pairs whether they have a join clause or not. At the
three-base-relation level there will be 12 joinrels if we only consider
clauseful pairs, or 56 if we don't. It gets worse as you go up, and
most if not all of those additional joinrels represent entirely useless
variations on the theme of "let's stupidly compute a cartesian product
and then winnow it sometime later".
This is not to say that there is never a case where an early cartesian
product couldn't be a useful part of a plan, but rejecting them is a
darn good heuristic.
regards, tom lane
From | Date | Subject | |
---|---|---|---|
Next Message | Chris Browne | 2008-01-09 17:20:58 | Re: Named vs Unnamed Partitions |
Previous Message | Simon Riggs | 2008-01-09 17:06:15 | Re: Named vs Unnamed Partitions |
From | Date | Subject | |
---|---|---|---|
Next Message | Alvaro Herrera | 2008-01-09 17:22:32 | Re: BUG #3860: xpath crashes backend when is querying xmlagg result |
Previous Message | Gregory Stark | 2008-01-09 16:47:51 | Re: OUTER JOIN performance regression remains in 8.3beta4 |