Efficiency Issue with "SELECT IN (..)" and "SELECT .. OR .."

From: Mike Winter <mike(dot)winter(at)frontlogic(dot)com>
To: <pgsql-sql(at)postgresql(dot)org>
Subject: Efficiency Issue with "SELECT IN (..)" and "SELECT .. OR .."
Date: 2003-05-15 21:44:29
Message-ID: Pine.LNX.4.33L2.0305151524590.6984-100000@frontlogic.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-sql

Hi, when doing queries of the type:

SELECT id FROM foo WHERE id IN (1, 4, 3, 2, 10, 11, 14) .., I get
terrible performance on tables of any resonable size. I see the
same behaviour when doing queries of the form "SELECT id FROM
foo WHERE id = 5 OR id = 6 OR ..."

When doing an "EXPLAIN" on the query, I get output like the
following:

Index Scan using foo_idx, foo_idx, foo_idx, foo_idx, foo_idx,
foo_idx on foo (cost=0.00..18.16 rows=6 width=4)

If the "IN (1, 2, 3, 6, ..., n)" clause is big enough, the
database will actually throw an error saying "Recursive Depth
Exceeded" or something similar and not complete the query.

It looks to me like the query parser is recursively calling
an index scan for each row in the 'IN' clause rather than just
doing one index scan that it seems it should be.

My question is, does anyone have any alternate ideas for how I
can do a query like this and have it perform well? The tables I
am working with are big enough that a sequential scan is not
helpful. Is this a bug I am encountering or an error in my
query? Is this a known issue?

I have seen this beahaviour on 7.2.1 and 7.3.2 on both Solaris
and Linux platforms.

Thanks for any input.

Responses

Browse pgsql-sql by date

  From Date Subject
Next Message Jeff Eckermann 2003-05-15 22:13:06 Re: select 3 characters
Previous Message Richard Huxton 2003-05-15 21:28:15 Re: installation problem