Re: IN surpasses NOT EXISTS in 7.4RC2 ??

From: Rajesh Kumar Mallah <mallah(at)trade-india(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: IN surpasses NOT EXISTS in 7.4RC2 ??
Date: 2003-11-14 19:27:41
Message-ID: 3FB52CAD.9070404@trade-india.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Tom Lane wrote:

>Rajesh Kumar Mallah <mallah(at)trade-india(dot)com> writes:
>
>
>>NOT EXISTS is taking almost double time than NOT IN .
>>I know IN has been optimised in 7.4 but is anything
>>wrong with the NOT EXISTS?
>>
>>
>
>That's the expected behavior in 7.4. EXISTS in the style you are using
>it effectively forces a nestloop-with-inner-indexscan implementation.
>As of 7.4, IN can do that, but it can do several other things too,
>including the hash-type plan you have here. So assuming that the
>planner chooses the right plan choice (not always a given ;-))
>

>IN should be as fast or faster than EXISTS in *all* *cases.*
>

Not in this case :) , did i miss something silly?

tradein_clients=# explain SELECT count(*) from user_accounts where
email is not null and email not in
(select email from profile_master where email is not null) ;
QUERY PLAN
--------------------------------------------------------------------------------------
Aggregate (cost=*9587726326.93..9587726326.93* rows=1 width=0)
-> Seq Scan on user_accounts (cost=0.00..9587725473.40 rows=341412
width=0)
Filter: ((email IS NOT NULL) AND (NOT (subplan)))
SubPlan
-> Seq Scan on profile_master (cost=0.00..25132.24
rows=674633 width=25)
Filter: (email IS NOT NULL)
(6 rows)

*The query above does not return*

tradein_clients=# explain analyze SELECT count(*) from user_accounts
where email is not null and
not exists (select email from profile_master where
email=user_accounts.email) ;

QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------------------
Aggregate (cost=2850847.55..2850847.55 rows=1 width=0) (actual
time=34075.100..34075.101 rows=1 loops=1)
-> Seq Scan on user_accounts (cost=0.00..2849994.02 rows=341412
width=0) (actual time=8.066..34066.329 rows=3882 loops=1)
Filter: ((email IS NOT NULL) AND (NOT (subplan)))
SubPlan
-> Index Scan using profile_master_email on profile_master
(cost=0.00..35.60 rows=9 width=25) (actual time=0.044..0.044 rows=1
loops=686716)
Index Cond: ((email)::text = ($0)::text)
Total runtime: 34075.213 ms
(7 rows)

tradein_clients=#

>
> regards, tom lane
>
>---------------------------(end of broadcast)---------------------------
>TIP 1: subscribe and unsubscribe commands go to majordomo(at)postgresql(dot)org
>
>

--

Rajesh Kumar Mallah,
Infocom Network Limited, New Delhi
phone: +91(11)6152172 (221) (L) ,9811255597 (M)

Visit http://www.trade-india.com ,
India's Leading B2B eMarketplace.

In response to

Browse pgsql-performance by date

  From Date Subject
Next Message Will LaShell 2003-11-14 19:40:23 Re: Help with count(*)
Previous Message Christopher Browne 2003-11-14 19:16:56 Re: Help with count(*)