Quick Links

Re: full outer performance problem

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	Kim Bisgaard <kib+pg(at)dmi(dot)dk>
Cc:	pgsql-performance(at)postgresql(dot)org
Subject:	Re: full outer performance problem
Date:	2005-06-08 14:03:35
Message-ID:	1275.1118239415@sss.pgh.pa.us
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-performance

Kim Bisgaard <kib+pg(at)dmi(dot)dk> writes:
> SELECT station_id, timeobs,temp_grass, temp_dry_at_2m
> FROM temp_dry_at_2m a
> FULL OUTER JOIN temp_grass b
> USING (station_id, timeobs)
> WHERE station_id = 52981
> AND timeobs = '2004-1-1 0:0:0'

> explain analyse SELECT b.station_id, b.timeobs,temp_grass, temp_dry_at_2m
> FROM temp_dry_at_2m a
> FULL OUTER JOIN temp_grass b
> USING (station_id, timeobs)
> WHERE b.station_id = 52981
> AND b.timeobs = '2004-1-1 0:0:0'

> Why will PostgreSQL not use the same plan for both these queries - they
> are virtually identical??

Because they're semantically completely different. The second query is
effectively a RIGHT JOIN, because join rows in which b is all-null will
be thrown away by the WHERE. The optimizer sees this (note your second
plan doesn't use a Full Join step anywhere) and is able to produce a
much better plan. Full outer join is difficult to optimize, in part
because we have no choice but to use a merge join for it --- the other
join types don't support full join.

regards, tom lane

In response to

full outer performance problem at 2005-06-08 09:37:40 from Kim Bisgaard

Responses

Re: full outer performance problem at 2005-06-08 18:46:35 from Kim Bisgaard

Browse pgsql-performance by date

	From	Date	Subject
Next Message	K C Lau	2005-06-08 14:25:16	Re: SELECT DISTINCT Performance Issue
Previous Message	Grega Bremec	2005-06-08 13:49:16	Re: Filesystem