Re: Avoiding sequential scans with OR join condition

From:	Janning Vygen <vygen(at)gmx(dot)de>
To:	Mike Mascari <mascarm(at)mascari(dot)com>
Cc:	pgsql-general(at)postgresql(dot)org
Subject:	Re: Avoiding sequential scans with OR join condition
Date:	2004-10-16 12:26:53
Message-ID:	200410161426.53591.vygen@gmx.de
Lists:	pgsql-general

On Saturday, 16 October 2004 07:23, Mike Mascari wrote:
> Hello. I have a query like:
>
> SELECT big_table.*
> FROM little_table, big_table
> WHERE little_table.x = 10 AND
> little_table.y IN (big_table.y1, big_table.y2);
>
> I have indexes on both big_table.y1 and big_table.y2 and on
> little_table.x and little_table.y. The result is a sequential scan of
> big_table. In order to prevent this,

Maybe the postgres planner chose a seq scan because it estimates that to be
faster, and often it is right. Did you run VACUUM ANALYZE beforehand?

try:
VACUUM ANALYZE;
SET enable_seqscan TO off;
EXPLAIN ANALYZE <your query>;
SET enable_seqscan TO on;
EXPLAIN ANALYZE <your query>;

You will see why the postgres planner chose a seq scan and whether it was
right to do so. (But never disable seq scans in a production environment, not
even for one query. You do not want that.)

(I hope the syntax is correct; otherwise consult the manual.)
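
If you run the comparison above in a shared session, one way to keep the
setting from leaking is to scope it to a transaction with SET LOCAL. This is
only a sketch for a diagnostic session: the query is the one from the original
post, and the transaction is rolled back so that nothing persists.

```sql
-- Diagnostic sketch only: SET LOCAL lasts until the end of the transaction.
BEGIN;
SET LOCAL enable_seqscan TO off;

-- Plan and timing with sequential scans discouraged.
EXPLAIN ANALYZE
SELECT big_table.*
FROM little_table, big_table
WHERE little_table.x = 10 AND
      little_table.y IN (big_table.y1, big_table.y2);

-- The setting reverts here; run the same EXPLAIN ANALYZE again afterwards
-- to get the planner's default choice for comparison.
ROLLBACK;
```

Compare the "actual time" figures of the two plans, not just the estimated
costs.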

> I've rewritten the query as:
> SELECT big_table.*
> FROM little_table, big_table
> WHERE little_table.x = 10 AND
> little_table.y = big_table.y1
> UNION
> SELECT big_table.*
> FROM little_table, big_table
> WHERE little_table.x = 10 AND
> little_table.y = big_table.y2
>
> which does allow an index scan, but suffers from two separate queries
> along with a unique sort, which, from the data, represents 90% of the
> tuples returned by both queries.

That seems to be the reason why postgres chose a seq scan for the first
query: if it has to read 90% of the data anyway, a seq scan is faster than
doing two index lookups first.

> Is there any way to write the first query such that indexes will be used?

I do not know your db design, but it looks odd to me to have a big_table with
two columns, y1 and y2, that seem to have the same meaning (a value that is
compared to a value in little_table).

Why don't you put just one column "y" in your big_table?
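
A rough sketch of what I mean, with many assumptions: it supposes big_table
has a primary key column "id" and that the y values are integers, neither of
which is stated in your mail. The table big_table_y and its columns are names
made up for illustration.

```sql
-- Hypothetical schema: assumes big_table has a primary key column "id"
-- and that the y values are integers. Neither is stated in the thread.
CREATE TABLE big_table_y (
    big_id integer NOT NULL REFERENCES big_table (id),
    y      integer NOT NULL
);

-- One index on y replaces the separate indexes on y1 and y2.
CREATE INDEX big_table_y_y_idx ON big_table_y (y);

-- The IN (y1, y2) condition becomes a plain equality join.
SELECT b.*
FROM big_table b
WHERE b.id IN (
    SELECT m.big_id
    FROM little_table l
    JOIN big_table_y m ON m.y = l.y
    WHERE l.x = 10
);
```

Each big_table row would get one row in big_table_y per y value. Whether the
planner then uses the index still depends on how many rows match.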

kind regards,
janning

> Mike Mascari

In response to

Avoiding sequential scans with OR join condition at 2004-10-16 05:23:09 from Mike Mascari
