Re: omitting redundant join predicate

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Ehab Galal <ehabgalal123(at)hotmail(dot)com>
Cc: "pgsql-sql(at)postgresql(dot)org" <pgsql-sql(at)postgresql(dot)org>
Subject: Re: omitting redundant join predicate
Date: 2007-11-04 16:35:36
Message-ID: 14450.1194194136@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-sql

Ehab Galal <ehabgalal123(at)hotmail(dot)com> writes:
> explain select *
> from t1, t2, t3
> where t1.f <= t2.f
> and t2.f <= t3.f
> and t1.f <= t3.f;

> I was wondering if there is a
> way to omit the redundant join predicate.

You're not being very clear here. Do you mean will you get the same
answer if you omit "t1.f <= t3.f"? Yes, of course (ignoring possibly
different output ordering). Do you mean you think the system should
discard it as redundant? I disagree --- the more join clauses the
better, as a rule. Do you mean that the EXPLAIN output looks like
the same comparison is being applied twice? It isn't --- in a more
modern PG release the output looks like this:

QUERY PLAN
------------------------------------------------------------------
Nested Loop (cost=33.54..81794021.44 rows=362975624 width=12)
Join Filter: ((t1.f <= t2.f) AND (t2.f <= t3.f))
-> Nested Loop (cost=0.00..124472.40 rows=1526533 width=8)
Join Filter: (t1.f <= t3.f)
-> Seq Scan on t1 (cost=0.00..31.40 rows=2140 width=4)
-> Seq Scan on t3 (cost=0.00..31.40 rows=2140 width=4)
-> Materialize (cost=33.54..54.94 rows=2140 width=4)
-> Seq Scan on t2 (cost=0.00..31.40 rows=2140 width=4)
(8 rows)

This is of course the stupidest possible join plan, but it's hard to do
much better --- both hash and merge joins work only on equality
conditions. You can do a bit better with an index on t2.f:

QUERY PLAN
----------------------------------------------------------------------
Nested Loop (cost=0.00..13222230.60 rows=362975624 width=12)
-> Nested Loop (cost=0.00..124472.40 rows=1526533 width=8)
Join Filter: (t1.f <= t3.f)
-> Seq Scan on t1 (cost=0.00..31.40 rows=2140 width=4)
-> Seq Scan on t3 (cost=0.00..31.40 rows=2140 width=4)
-> Index Scan using t2i on t2 (cost=0.00..5.01 rows=238 width=4)
Index Cond: ((t1.f <= t2.f) AND (t2.f <= t3.f))
(7 rows)

regards, tom lane

In response to

Responses

Browse pgsql-sql by date

  From Date Subject
Next Message Aymeric Levaux 2007-11-04 16:49:02 SELECT DISTINCT ... ORDER BY UPPER(column_name) fails
Previous Message Ehab Galal 2007-11-04 13:14:51 omitting redundant join predicate