Re: Trying to get postgres to use an index

From: Joel Stevenson <joelstevenson(at)mac(dot)com>
To: pgsql-general(at)postgresql(dot)org
Subject: Re: Trying to get postgres to use an index
Date: 2004-11-06 22:17:19
Message-ID: p06110425bdb2fb527182@[192.168.0.9]
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

At 10:11 PM +0100 11/6/04, Pierre-Frédéric Caillaud wrote:
>>explain select notificationID from NOTIFICATION n, ITEM i where
>>n.itemID = i.itemID;
>> QUERY PLAN
>>
>>------------------------------------------------------------------------
>>------
>> Hash Join (cost=47162.85..76291.32 rows=223672 width=44)
>> Hash Cond: ("outer".itemid = "inner".itemid)
>> -> Seq Scan on notification n (cost=0.00..12023.71 rows=223671
>>width=48)
>> -> Hash (cost=42415.28..42415.28 rows=741028 width=4)
>> -> Seq Scan on item i (cost=0.00..42415.28 rows=741028
>>width=4)
>>
>>This query takes about 20 seconds to run.
>
> Well, you're joining the entire two
>tables, so yes, the seq scan might be faster.
> Try your query with enable_seqscan=0 so
>it'll use an index scan and compare the times.
> You may be surprised to find that the
>planner has indeed made the right choice.
> This query selects 223672 rows, are you surprised it's slow ?

I'm not a SQL guru by any stretch but would a
constrained sub-select be appropriate here?

e.g. a simple test setup where each record in
table test1 has a FK referenced to an entry in
test:

joels=# \d test
Table "public.test"
Column | Type | Modifiers
--------+--------------+-----------
id | integer | not null
foo | character(3) |
Indexes:
"test_pkey" primary key, btree (id)

joels=# \d test1
Table "public.test1"
Column | Type | Modifiers
---------+---------+-----------
id | integer | not null
test_id | integer |
Indexes:
"test1_pkey" primary key, btree (id)
"test1_test_id_idx" btree (test_id)
Foreign-key constraints:
"$1" FOREIGN KEY (test_id) REFERENCES test(id) ON DELETE CASCADE

joels=# select count(*) from test;
count
-------
10001
(1 row)

joels=# select count(*) from test1;
count
-------
10001
(1 row)

joels=# explain select test_id from test1 t1, test t where t1.test_id = t.id;
QUERY PLAN
------------------------------------------------------------------------
Hash Join (cost=170.01..495.05 rows=10002 width=4)
Hash Cond: ("outer".test_id = "inner".id)
-> Seq Scan on test1 t1 (cost=0.00..150.01 rows=10001 width=4)
-> Hash (cost=145.01..145.01 rows=10001 width=4)
-> Seq Scan on test t (cost=0.00..145.01 rows=10001 width=4)
(5 rows)

joels=# explain select test_id from test1 t1
where test_id in (select id from test where id =
t1.test_id);
QUERY PLAN
------------------------------------------------------------------------------
Seq Scan on test1 t1 (cost=0.00..15269.02 rows=5001 width=4)
Filter: (subplan)
SubPlan
-> Index Scan using test_pkey on test (cost=0.00..3.01 rows=2 width=4)
Index Cond: (id = $0)
(5 rows)

So with the subselect the query planner would use
the primary key index on test when finding
referencing records in the test1 table.

Pierre, I seen the advice to use an additional
where condition in certain cases to induce an
index scan; how is this done?

my 1.2 pennies,
-Joel

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message mike.wertheim 2004-11-06 22:50:14 Re: Trying to get postgres to use an index
Previous Message Franco Bruno Borghesi 2004-11-06 22:12:16 Re: Mass Import/Generate PKs