Re: plan variations: join vs. exists vs. row comparison

From: Merlin Moncure <mmoncure(at)gmail(dot)com>
To: Jon Nelson <jnelson+pgsql(at)jamponi(dot)net>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: plan variations: join vs. exists vs. row comparison
Date: 2011-03-07 20:05:42
Message-ID: AANLkTikGCt1pkDW60zA8sGWPJoLd3ktCEejHCiqM65xK@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

On Mon, Mar 7, 2011 at 1:07 PM, Jon Nelson <jnelson+pgsql(at)jamponi(dot)net> wrote:
> Originally, I posted to -general but I found some time to write some
> samples, and realized it's probably more of a performance question.
>
> The original post is here:
> http://archives.postgresql.org/pgsql-general/2011-03/msg00198.php
>
> I was hoping that somebody could help me understand the differences
> between three plans.
> All of the plans are updating a table using a second table, and should
> be logically equivalent.
> Two of the plans use joins, and one uses an exists subquery.
> One of the plans uses row constructors and IS NOT DISTINCT FROM. It is
> this plan which has really awful performance.

The problem is really coming from SQL: it requires row wise
comparisons to be of all fields in left to right order and the fact
that you can't match NULL to NULL with =.

If you have a table with a,b,c, (1,1,NULL) is not distinct from (1,2,3) becomes:
Filter: ((NOT (a IS DISTINCT FROM 1)) AND (NOT (b IS DISTINCT FROM 1))
AND (NOT (c IS DISTINCT FROM NULL::integer)))

At present postgresql does not have the facilities to turn that into
an index lookup. SQL doesn't allow the way you'd want to write this
the way you'd really like to:

select * from v where (a,b,c) = (1,1,NULL);

because the comparison can't be applied from a row to another row but
only between the member fields. You can cheat the system, but only if
you reserve a special index for that purpose:

create table v(a int, b int, c int);
create index on v(v);
select * from v where v = (1,1, NULL) will match as 'is not distinct
from' does, using the index. This is because composite type
comparison (as opposed to its fields) follows a different code path.
Confused yet? You can also use the above trick with a type if you are
not comparing all fields of 'v':

create type foo(a int, b int);
create index on v(((a,b)::foo));
select * from v where (a,b)::foo = (1,1);

will get you field subset comparison with index. Note if you do the
above, the index can only match on the entire composite, not
particular fields...

merlin

In response to

Browse pgsql-performance by date

  From Date Subject
Next Message Jon Nelson 2011-03-07 20:06:56 Re: plan variations: join vs. exists vs. row comparison
Previous Message Tom Lane 2011-03-07 20:00:10 Re: plan variations: join vs. exists vs. row comparison