VERY strange query plan (LONG)

From: Oleg Bartunov <oleg(at)sai(dot)msu(dot)su>
To: pgsql-hackers(at)postgreSQL(dot)org
Subject: VERY strange query plan (LONG)
Date: 2000-08-09 15:31:16
Message-ID: Pine.GSO.3.96.SK.1000809175039.28016T-100000@ra
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

I tried to implement fulltext search using linguistic approach,
for example, using ispell like udmsearch does. We also save position
information of each lexem in document to calculate relevancy
(it's C-function using SPI-interface). We're still testing different
strategies but found several problems with optimizer, just look at plan -
very strange numbers and no indices used) (I did run vacuume analyze)

explain
select
txt.tid
from
txt, txt_lexem1 tl1_0, txt_lexem11 tl11_0
where
tl1_0.lid =17700
OR
tl11_0.lid =172751
;
NOTICE: QUERY PLAN:

Nested Loop (cost=0.00..16275952180.00 rows=512819420786 width=12)
-> Nested Loop (cost=0.00..891369556.42 rows=512819421 width=8)
-> Seq Scan on txt_lexem11 tl11_0 (cost=0.00..2596.92 rows=132292 width=4)
-> Seq Scan on txt_lexem1 tl1_0 (cost=0.00..3815.95 rows=194795 width=4)
-> Seq Scan on txt (cost=0.00..20.00 rows=1000 width=4)

EXPLAIN

fulltext=# \d txt
Table "txt"
Attribute | Type | Modifier
-----------+---------+----------
tid | integer | not null
Index: txt_pkey

tables txt_lexemX look like:

fulltext=# \d txt_lexem1
Table "txt_lexem1"
Attribute | Type | Modifier
-----------+-----------+----------
tid | integer | not null
lid | integer | not null
did | integer | not null
count | integer | not null
pos | integer[] | not null
Index: txt_lexem1_key

We have rewrite using EXISTS and plan looks better !

select
txt.tid
from
txt
where
EXISTS ( select tid from txt_lexem1 tl1_0 where tl1_0.lid=17700 and tl1_0.did=0
and txt.tid=tl1_0.tid )
OR
EXISTS ( select tid from txt_lexem11 tl11_0 where tl11_0.lid=172751 and
tl11_0.did=0 and txt.tid=tl11_0.tid )
;

NOTICE: QUERY PLAN:

Seq Scan on txt (cost=0.00..7416.48 rows=1000 width=4)
SubPlan
-> Index Scan using txt_lexem1_key on txt_lexem1 tl1_0 (cost=0.00..3.95 rows=1 width=4)
-> Index Scan using txt_lexem11_key on txt_lexem11 tl11_0 (cost=0.00..3.45 rows=1 width=4)

EXPLAIN

I've tested on plain 7.0.2 and CVS version.
I remind there was old problem with OR. Does optimizer still has
such problem ?

Regards,

Oleg

_____________________________________________________________
Oleg Bartunov, sci.researcher, hostmaster of AstroNet,
Sternberg Astronomical Institute, Moscow University (Russia)
Internet: oleg(at)sai(dot)msu(dot)su, http://www.sai.msu.su/~megera/
phone: +007(095)939-16-83, +007(095)939-23-83

Browse pgsql-hackers by date

  From Date Subject
Next Message Stephan Szabo 2000-08-09 17:52:17 Re: Arrays and foreign keys
Previous Message Michael Mayo 2000-08-09 14:46:56 Re: Activating USE_SYSLOG from srpm?