Re: To use a VIEW or not to use a View.....

From: Stephan Szabo <sszabo(at)megazone23(dot)bigpanda(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Tomasz Myrta <jasiek(at)klaster(dot)net>, <pgsql-sql(at)postgresql(dot)org>
Subject: Re: To use a VIEW or not to use a View.....
Date: 2003-01-23 18:21:06
Message-ID: 20030123095104.I13544-100000@megazone23.bigpanda.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-sql


On Thu, 23 Jan 2003, Tom Lane wrote:

> regression=# explain analyze select * from tenk1 a join tenk1 b using(ten)
> regression-# where ten = 3;
> QUERY PLAN
> ----------------------------------------------------------------------------------------------------------------------
> Merge Join (cost=1055.45..2102.12 rows=83006 width=488) (actual time=582.97..65486.57 rows=1000000 loops=1)
> Merge Cond: ("outer".ten = "inner".ten)
> -> Sort (cost=527.73..530.00 rows=910 width=244) (actual time=373.57..382.48 rows=1000 loops=1)
> Sort Key: a.ten
> -> Seq Scan on tenk1 a (cost=0.00..483.00 rows=910 width=244) (actual time=8.98..330.39 rows=1000 loops=1)
> Filter: (ten = 3)
> -> Sort (cost=527.73..530.00 rows=910 width=244) (actual time=209.19..8057.64 rows=999001 loops=1)
> Sort Key: b.ten
> -> Seq Scan on tenk1 b (cost=0.00..483.00 rows=910 width=244) (actual time=0.40..193.93 rows=1000 loops=1)
> Filter: (3 = ten)
> Total runtime: 73291.01 msec
> (11 rows)

Yeah, I see it once I got the estimated selectivity being smaller in the
joins in my test database as well.

> Not only should we ignore the join condition for selectivity purposes,
> but it's a waste of time for execution as well. We could have
> implemented the above query as a nestloop with no join condition, and
> saved the effort of the sort and merge logic.
>
> What I was thinking was that any time the code sees a "var = const"
> clause as part of a mergejoin equivalence set, we could mark all the
> "var = var" clauses in the same set as no-ops. For example, given
>
> WHERE a.f1 = b.f2 AND b.f2 = c.f3 AND c.f3 = 42
>
> then after we finish deducing a.f1 = 42 and b.f2 = 42, there is no
> longer any value in either of the original clauses a.f1 = b.f2 and
> b.f2 = c.f3, nor in the other deduced clause a.f1 = c.f3. This would
> take a little bit of restructuring of generate_implied_equalities() and
> process_implied_equality(), but it doesn't seem too difficult to do.
>
> Thoughts? Are there any holes in that logic?

The main thing I can think of is being careful when the types are
different (like padding vs no padding in strings). Playing with text and
char() the explain output appears to be resulting in the right thing
for the clauses but I'm not 100% sure.

Given that it only appears to bring across equality conditions and
not the parts of conditions with or, I think you're right in general.

In response to

Responses

Browse pgsql-sql by date

  From Date Subject
Next Message Ben Siders 2003-01-23 18:56:50 SQL to list databases?
Previous Message jasiek 2003-01-23 17:46:01 Re: To use a VIEW or not to use a View.....