From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
---|---|
To: | Amit Kapila <amit(dot)kapila(at)huawei(dot)com> |
Cc: | pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: Improving our clauseless-join heuristics |
Date: | 2012-04-16 20:35:27 |
Message-ID: | 4920.1334608527@sss.pgh.pa.us |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
Amit Kapila <amit(dot)kapila(at)huawei(dot)com> writes:
> For this kind of query, currently (referring 9.0.3 code) also it considers
> join of b,c and b,d.
> As there is no join clause between b,c,d so it will go in path of
> make_rels_by_clauseless_joins() where it considers join of b,c and b,d.
> In this kind of query, if the suggestion by me in below mail is followed,
> then it will consider joining a,b a,c a,d at level-2 in function
> make_rels_by_clause_joins() which it currently doesn't do which may generate
> useless join paths.
> However in real-world scenario's this kind of queries where 2 cols of
> different tables are
> used in one side expression (b.y + c.z) of where clause may be less.
> On the other hand, when we come to consider d, it will have no join
> clauses so we will consider joining it to each other rel in turn.
> When it come to consider d, as at level -2 it only consider later rels. So
> it should not consider joining with each other rel.
I might still be misunderstanding, but I think what you are suggesting
is that in the loop in make_rels_by_clause_joins, if we find that the
old_rel doesn't have a join clause/restriction with the current
other_rel, we check to see whether other_rel has any join clauses at
all, and force the join to occur anyway if it doesn't.
I can't really get excited about doing it that way instead of the
current way. In the first place, it seems to miss the need to
clauseless-join two relations when neither of them have any join
clauses, for instance plain old "SELECT * FROM a, b". So you still need
something like the make_rels_by_clauseless_joins code path, and it's
not entirely clear how to avoid duplicated work there. In the second
place, instead of N tests to see whether a rel lacks any join clauses,
we'd now have something approaching O(N^2) such tests, in the typical
case where most base rels are directly joined to only a few other rels.
So it seems to make things slower for little obvious benefit.
In general, queries with join-clause-less rels are pretty uncommon,
so I don't want to introduce extra work into make_rels_by_clause_joins
to handle the case.
regards, tom lane
From | Date | Subject | |
---|---|---|---|
Next Message | Jeff Davis | 2012-04-16 20:50:36 | Re: 9.3 Pre-proposal: Range Merge Join |
Previous Message | Greg Smith | 2012-04-16 20:23:09 | Re: Last gasp |