Re: [GENERAL] null and =

From: Mike Mascari <mascarm(at)mascari(dot)com>
To: Lincoln Yeoh <lylyeoh(at)mecomb(dot)com>
Cc: pgsql-general(at)postgreSQL(dot)org
Subject: Re: [GENERAL] null and =
Date: 1999-12-06 10:03:33
Message-ID: 384B89F5.A71EDE6@mascari.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

Lincoln Yeoh wrote:

> At 10:39 PM 05-12-1999 -0500, Mike Mascari wrote:
> >I would just do:
> >
> >SELECT users.usr_id, users.usr_login FROM users WHERE NOT EXISTS
> > (SELECT wm_accounts.usr_id FROM wm_accounts WHERE
> > wm_accounts.usr_id = users.usr_id);
>
> OK, my current query is
> select usr_id,usr_login from users where usr_id not in (select userid from
> wm_accounts);
>
> Your query on small test tables (after vacuum analyze):
> Seq Scan on users (cost=1.83 rows=25 width=16)
>
> SubPlan
> -> Seq Scan on wm_accounts (cost=1.33 rows=2 width=4)
>
> My query:
> Seq Scan on users (cost=1.83 rows=25 width=16)
>
> SubPlan
> -> Seq Scan on wm_accounts (cost=1.33 rows=10 width=4)
>
> What does rows mean? But it looks like your query is better :). Don't fully
> understand why tho. Would it work if wm_accounts is empty?

Yes, certainly. If you envision what the backend is doing, for NOT EXISTS, its
something like this:

for each users record
perform an index or sequential lookup in wm_accounts for a matching usr_id:
found: continue
not found: output usr_id
next users record

If, however, you use the NOT IN clause, it looks more like this:

for each users record
for each wm_accounts record
if users.usr_id = wm_accounts.usr_id, continue to next users record
next wm_accounts record
output usr_id
next users record

At least with the EXISTS/NOT EXISTS method, you give the backend the opportunity
to use indexes on the correlated table. Most commercial databases will
instantiate a temporary table when processing IN clauses and will rewrite the
query as an EXISTS (or DISTINCT join/outer join). PostgreSQL doesn't do that at
the moment. What indexes do you have on users and wm_accounts? You should have
one on usr_id of both. It may simply be that the optimizer isn't using indexes
since the number of rows is small. Here is a pseudo-equivalent explain plan from
a production database:

explain select webuser from webusers where not exists (
select permitbuy.webuser from permitbuy where webusers.webuser =
permitbuy.webuser);

NOTICE: QUERY PLAN:

Seq Scan on webusers (cost=7.78 rows=145 width=12)
SubPlan
-> Index Scan using k_permitbuy1 on permitbuy (cost=4.36 rows=48 width=12)

EXPLAIN

vs. using IN:

explain select webuser from webusers where webuser not in (select webuser from
permitbuy);

NOTICE: QUERY PLAN:

Seq Scan on webusers (cost=7.78 rows=145 width=12)
SubPlan
-> Seq Scan on permitbuy (cost=32.05 rows=759 width=12)

EXPLAIN

Hope that helps,

Mike

In response to

Browse pgsql-general by date

  From Date Subject
Next Message Silvio Macedo 1999-12-06 13:15:51 JDBC / JBUILDER
Previous Message Stephan Koepp 1999-12-06 10:01:19 subscribe