From: | Herouth Maoz <herouth(at)oumail(dot)openu(dot)ac(dot)il> |
---|---|
To: | Christian Rudow <Christian(dot)Rudow(at)thinx(dot)ch>, PostgreSQL SQL <pgsql-sql(at)postgreSQL(dot)org> |
Subject: | Re: [SQL] NOT IN clause performing badly |
Date: | 1999-07-26 11:12:02 |
Message-ID: | l03130303b3c1edc28fd2@[147.233.159.109] |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-sql |
At 12:09 +0300 on 26/07/1999, Christian Rudow wrote:
> Questions
> ---------
> Is this a known problem ?
> What is causing the problem ?
> Is it a problem of RDBMS parameters ?
> Is there an alternative to the NOT IN clause ?
NOT IN is an inefficient way of running such a query. I would suggest:
- After creating tempx1, create an index on its id field
- Use the following instead of NOT IN:
select id from mytable
where NOT EXISTS (
select * from tempx1
where tempx1.id = mytable.id
);
What is the difference?
In the NOT IN version, for each of the 10000 ids in mytable, it has to
search through 8000 records to see whether the current mytable.id is there
or not. This would give you 80000000 operations. Especially since there is
no index on tempx1.
The EXISTS query I gave uses the index on tempx1.id for each of the 10000
ids. Thus, you have 10000 * log 8000, more or less.
But you have to count in the penalty of creating the index on tempx1.id in
the first place.
Herouth
--
Herouth Maoz, Internet developer.
Open University of Israel - Telem project
http://telem.openu.ac.il/~herutma
From | Date | Subject | |
---|---|---|---|
Next Message | John Reynolds | 1999-07-26 11:23:39 | Re: [SQL] RE: Tool for generating entity relation diagrams. |
Previous Message | D'Arcy J.M. Cain | 1999-07-26 10:30:57 | Re: [SQL] database design SQL prob. |