Re: plan variations: join vs. exists vs. row comparison

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Jon Nelson <jnelson+pgsql(at)jamponi(dot)net>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: plan variations: join vs. exists vs. row comparison
Date: 2011-03-07 20:00:10
Message-ID: 11707.1299528010@sss.pgh.pa.us
Lists: pgsql-performance

Jon Nelson <jnelson+pgsql(at)jamponi(dot)net> writes:
> I was hoping that somebody could help me understand the differences
> between three plans.
> All of the plans are updating a table using a second table, and should
> be logically equivalent.
> Two of the plans use joins, and one uses an exists subquery.
> One of the plans uses row constructors and IS NOT DISTINCT FROM. It is
> this plan which has really awful performance.
> Clearly it is due to the nested loop, but why would the planner choose
> that approach?

IS NOT DISTINCT FROM pretty much disables all optimizations: it can't be
an indexqual, merge join qual, or hash join qual. So it's not
surprising that you get a sucky plan for it. Possibly somebody will
work on improving that someday.

As for your other questions, what PG version are you using? Because I
do get pretty much the same plan (modulo a plain join versus a semijoin)
for the first two queries, when using 9.0 or later. And the results of
ANALYZE are only approximate, so you shouldn't be surprised at all if a
rowcount estimate is off by a couple percent. Most of the time, you
should be happy if it's within a factor of 2 of reality.

regards, tom lane
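
The difference between plain `=` and IS NOT DISTINCT FROM discussed above comes down to NULL handling. The following is a minimal Python sketch of those semantics, not PostgreSQL code: the helper names (`sql_eq`, `is_not_distinct_from`, `join`) and the sample data are made up for illustration, and Python's `None` stands in for SQL NULL.

```python
from typing import Optional


def sql_eq(a, b) -> Optional[bool]:
    """Model SQL `a = b`: the result is NULL (None) if either side is NULL."""
    if a is None or b is None:
        return None
    return a == b


def is_not_distinct_from(a, b) -> bool:
    """Model SQL `a IS NOT DISTINCT FROM b`: never NULL; two NULLs count as equal."""
    if a is None or b is None:
        return a is None and b is None
    return a == b


def join(left, right, predicate):
    """Naive nested-loop join keeping only pairs where the predicate is exactly TRUE."""
    return [(l, r) for l in left for r in right if predicate(l, r) is True]


# Hypothetical sample columns; None plays the role of NULL.
left = [1, 2, None]
right = [2, None, 3]

eq_pairs = join(left, right, sql_eq)
indf_pairs = join(left, right, is_not_distinct_from)

print(eq_pairs)    # [(2, 2)]
print(indf_pairs)  # [(2, 2), (None, None)]
```

The two predicates differ only on NULL inputs. If the compared columns are known to contain no NULLs (an assumption about the data, not something stated in this thread), a plain `=` comparison returns the same rows and is a form the planner can treat as a join qual.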
