Re: not in(subselect) in 8.4

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Scott Carey <scott(at)richrelevance(dot)com>, Grzegorz Jaśkiewicz <gryzman(at)gmail(dot)com>, "pgsql-performance(at)postgresql(dot)org" <pgsql-performance(at)postgresql(dot)org>
Subject: Re: not in(subselect) in 8.4
Date: 2009-02-22 04:39:49
Message-ID: 603c8f070902212039j346e0e27h96434f0f501e327b@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

On Sat, Feb 21, 2009 at 10:41 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Scott Carey <scott(at)richrelevance(dot)com> writes:
>> Are there any optimizations planned for the case where columns are
>> defined as NOT NULL?
>
> We might get around to recognizing that case as an antijoin sometime.
> It's nontrivial though, because you have to check for an intermediate
> outer join causing the column to be possibly nullable after all.
>
>> It turns out to be a rare use case for someone to write a subselect
>> for a NOT IN or IN clause that will have NULL values.
>
> Judging from the steady flow of "why doesn't my NOT IN query work"
> newbie questions, I don't think it's so rare as all that.

I think it's rare to do it on purpose, precisely because of the weird
semantics we all hate. I have done it by accident, more than once,
and then fixed it by adding WHERE blah IS NOT NULL to the subquery.
So I think Scott is basically right.

> There's surely some population of people who know enough or could be
> trained to be careful about using NOT NULL columns, but they could also
> be trained to use NOT EXISTS, and dodge the whole bullet from the start.

There are far more important reasons to make columns NOT NULL than
avoiding strange results from NOT IN. Personally, I have gotten used
to the fact that the planner sucks at handling NOT IN and so always
write LEFT JOIN ... WHERE pk IS NULL, so it's not important to me that
we fix it. But it's certainly a foot-gun for the inexperienced, as it
is both the most compact and (at least IMO) the most intuitive
formulation of an anti-join.

...Robert

In response to

Browse pgsql-performance by date

  From Date Subject
Next Message Grzegorz Jaśkiewicz 2009-02-22 19:10:16 Re: not in(subselect) in 8.4
Previous Message Tom Lane 2009-02-22 03:41:41 Re: not in(subselect) in 8.4