Re: IN() Optimization issue in 8.0rc5

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Josh Berkus <josh(at)agliodbs(dot)com>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: IN() Optimization issue in 8.0rc5
Date: 2005-01-15 20:53:37
Message-ID: 11485.1105822417@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Josh Berkus <josh(at)agliodbs(dot)com> writes:
> dm=# explain
> dm-# SELECT personid FROM mr.person_attributes_old
> dm-# WHERE personid NOT IN (SELECT
> personid FROM mr.person_attributes);
> QUERY PLAN
> -----------------------------------------------------------------------------------
> Seq Scan on person_attributes_old (cost=0.00..3226144059.85 rows=235732
> width=4)
> Filter: (NOT (subplan))
> SubPlan
> -> Seq Scan on person_attributes (cost=0.00..12671.07 rows=405807
> width=4)
> (4 rows)

Hmm. What you want for a NOT IN is for it to say
Filter: (NOT (hashed subplan))
which you are not getting. What's the datatypes of the two personid
columns? Is the 400k-row estimate for person_attributes reasonable?
Maybe you need to increase work_mem (nee sort_mem) to allow a
400k-row hash table?

regards, tom lane

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Josh Berkus 2005-01-15 21:32:27 Re: IN() Optimization issue in 8.0rc5
Previous Message Josh Berkus 2005-01-15 20:23:10 IN() Optimization issue in 8.0rc5