From: | Oleg Bartunov <oleg(at)sai(dot)msu(dot)su> |
---|---|
To: | pgsql-hackers(at)postgreSQL(dot)org |
Subject: | VERY strange query plan (LONG) |
Date: | 2000-08-09 15:31:16 |
Message-ID: | Pine.GSO.3.96.SK.1000809175039.28016T-100000@ra |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
Hi,
I tried to implement fulltext search using linguistic approach,
for example, using ispell like udmsearch does. We also save position
information of each lexem in document to calculate relevancy
(it's C-function using SPI-interface). We're still testing different
strategies but found several problems with optimizer, just look at plan -
very strange numbers and no indices used) (I did run vacuume analyze)
explain
select
txt.tid
from
txt, txt_lexem1 tl1_0, txt_lexem11 tl11_0
where
tl1_0.lid =17700
OR
tl11_0.lid =172751
;
NOTICE: QUERY PLAN:
Nested Loop (cost=0.00..16275952180.00 rows=512819420786 width=12)
-> Nested Loop (cost=0.00..891369556.42 rows=512819421 width=8)
-> Seq Scan on txt_lexem11 tl11_0 (cost=0.00..2596.92 rows=132292 width=4)
-> Seq Scan on txt_lexem1 tl1_0 (cost=0.00..3815.95 rows=194795 width=4)
-> Seq Scan on txt (cost=0.00..20.00 rows=1000 width=4)
EXPLAIN
fulltext=# \d txt
Table "txt"
Attribute | Type | Modifier
-----------+---------+----------
tid | integer | not null
Index: txt_pkey
tables txt_lexemX look like:
fulltext=# \d txt_lexem1
Table "txt_lexem1"
Attribute | Type | Modifier
-----------+-----------+----------
tid | integer | not null
lid | integer | not null
did | integer | not null
count | integer | not null
pos | integer[] | not null
Index: txt_lexem1_key
We have rewrite using EXISTS and plan looks better !
select
txt.tid
from
txt
where
EXISTS ( select tid from txt_lexem1 tl1_0 where tl1_0.lid=17700 and tl1_0.did=0
and txt.tid=tl1_0.tid )
OR
EXISTS ( select tid from txt_lexem11 tl11_0 where tl11_0.lid=172751 and
tl11_0.did=0 and txt.tid=tl11_0.tid )
;
NOTICE: QUERY PLAN:
Seq Scan on txt (cost=0.00..7416.48 rows=1000 width=4)
SubPlan
-> Index Scan using txt_lexem1_key on txt_lexem1 tl1_0 (cost=0.00..3.95 rows=1 width=4)
-> Index Scan using txt_lexem11_key on txt_lexem11 tl11_0 (cost=0.00..3.45 rows=1 width=4)
EXPLAIN
I've tested on plain 7.0.2 and CVS version.
I remind there was old problem with OR. Does optimizer still has
such problem ?
Regards,
Oleg
_____________________________________________________________
Oleg Bartunov, sci.researcher, hostmaster of AstroNet,
Sternberg Astronomical Institute, Moscow University (Russia)
Internet: oleg(at)sai(dot)msu(dot)su, http://www.sai.msu.su/~megera/
phone: +007(095)939-16-83, +007(095)939-23-83
From | Date | Subject | |
---|---|---|---|
Next Message | Stephan Szabo | 2000-08-09 17:52:17 | Re: Arrays and foreign keys |
Previous Message | Michael Mayo | 2000-08-09 14:46:56 | Re: Activating USE_SYSLOG from srpm? |