From: | Martin Hampl <Martin(dot)Hampl(at)stud(dot)uni-erlangen(dot)de> |
---|---|
To: | <pgsql-novice(at)postgresql(dot)org> <pgsql-novice(at)postgresql(dot)org> <pgsql-novice(at)postgresql(dot)org> |
Subject: | Does not use index |
Date: | 2004-04-28 18:50:17 |
Message-ID: | E42C0410-9944-11D8-8E8E-000393674318@stud.uni-erlangen.de |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-novice |
Hi,
I'm trying to optimize a query and are stuck and don't understand why
PostgreSQL does not do what I want. It would be appreciated very much
if someone could have a look at it. Thanks in advance.
I am using PostgreSQL 7.4.1
I have the following tables:
Table "public.token"
Column | Type | Modifiers
----------+------------------------+-----------
text_id | integer | not null
position | integer | not null
word | character varying(255) |
Indexes:
"token_pkey" primary key, btree (text_id, "position")
"word_idx" btree (word)
and
Table "public.s"
Column | Type | Modifiers
---------+---------+-----------
text_id | integer | not null
s | integer | not null
start | integer |
stop | integer |
Indexes:
"s_pkey" primary key, btree (text_id, s)
"s_begin_idx" btree (text_id, "start")
"s_end_idx" btree (text_id, stop)
Foreign-key constraints:
"$1" FOREIGN KEY (text_id, "start") REFERENCES token(text_id,
"position")
"$2" FOREIGN KEY (text_id, stop) REFERENCES token(text_id,
"position")
(I hope it's readable)
and the following query:
select * from
token,
s
where
token.word = 'FACTSHEET' and
s.text_id = token.text_id and
s.start = token.position
PostgreSQL generates the following query plan
QUERY PLAN
------------------------------------------------------------------------
--------------------------
Nested Loop (cost=0.00..39120.95 rows=1 width=32)
-> Index Scan using s_pkey on s (cost=0.00..24698.44 rows=3367
width=16)
-> Index Scan using token_pkey on token (cost=0.00..4.27 rows=1
width=16)
Index Cond: (("outer".text_id = token.text_id) AND
("outer"."start" = token."position"))
Filter: ((word)::text = 'FACTSHEET'::text)
and the execution takes ages.
Now, 'FACTSHEET' is *very* seldom (5 of 23 million) and I intended it
to first search for 'FACTSHEET' and then use index s_begin_idx (and the
values of token.text_id and token.position) to derive the corresponding
records in s.
I tried to force PostgreSQL in doing it in this order by trying the
following:
select * from (select text_id, position from token where word =
'FACTSHEET') t left join s on (s.text_id = t.text_id and s.start =
t.position )
QUERY PLAN
------------------------------------------------------------------------
-----------------------
Hash Left Join (cost=24715.28..56630.78 rows=7892 width=24)
Hash Cond: (("outer".text_id = "inner".text_id) AND
("outer"."position" = "inner"."start"))
-> Index Scan using word_idx on token (cost=0.00..31402.51
rows=7892 width=8)
Index Cond: ((word)::text = 'FACTSHEET'::text)
-> Hash (cost=24698.44..24698.44 rows=3367 width=16)
-> Index Scan using s_pkey on s (cost=0.00..24698.44
rows=3367 width=16)
Now it *does* first search for 'FACTSHEET' but it still does not use
s_begin_idx and I have no idea why. Any ideas, what I could do?
Thanks,
Martin.
From | Date | Subject | |
---|---|---|---|
Next Message | Robert Morgan | 2004-04-28 20:12:32 | starting postgres/psql |
Previous Message | joseph speigle | 2004-04-28 14:15:00 | Re: error on starting server |