Suggestions on finetuning this search?

From: "Justin Long" <justinlong(at)strategicnetwork(dot)org>
To: <pgsql-sql(at)postgresql(dot)org>
Subject: Suggestions on finetuning this search?
Date: 2001-02-16 16:16:41
Message-ID: PBEMJIMKFNACLPCNPDEBIEODFLAA.justinlong@strategicnetwork.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-sql

I would welcome any suggestions for fine-tuning this search to run faster.

Here is the SQL. Basically what we're allowing people to do is to specify
words to search our article index.

THE TABLES:
knowledge = the knowledge base of articles
kb_categories = the category that each article is assigned to
kbwords = an index of every word in the knowledge base (kbid, wordid)
wordindex = an index of every word in the knowledge base (wordid, word)

AMPLIFICATION:
kbwords = a list of all the words that appear in a specific article (might
return a list of 5 articles where the word 'monk' appears)
wordindex = a unique list of all the words that appear in all the articles
(would only return a single entry for the word 'monk')

THE SQL:
select * from knowledge k, kb_categories c , kbwords w0 , kbwords w1 WHERE
k.catid=c.catid AND ((k.kbid=w0.kbid and w0.wordid=(SELECT wordid from
wordindex where word='BUDDIST')) AND (k.kbid=w1.kbid and w1.wordid=(SELECT
wordid from wordindex where word='MONK'))) ORDER BY k.regionid , k.ctryid ,
k.catid , k.title ;

WHAT "EXPLAIN" RETURNS:
Sort (cost=2796577.40..2796577.40 rows=2878549 width=332)
InitPlan
-> Index Scan using wordindex_word on wordindex (cost=0.00..247.71
rows=376 width=4)
-> Index Scan using wordindex_word on wordindex (cost=0.00..247.71
rows=376 width=4)
-> Merge Join (cost=21187.45..21993.59 rows=2878549 width=332)
-> Merge Join (cost=11060.50..11140.94 rows=52582 width=324)
-> Sort (cost=933.56..933.56 rows=960 width=316)
-> Hash Join (cost=1.43..885.97 rows=960 width=316)
-> Seq Scan on knowledge k (cost=0.00..559.25
rows=2825 width=284)
-> Hash (cost=1.34..1.34 rows=34 width=32)
-> Seq Scan on kb_categories c
(cost=0.00..1.34 rows=34 width=32)
-> Sort (cost=10126.95..10126.95 rows=5474 width=8)
-> Seq Scan on kbwords w0 (cost=0.00..9787.02
rows=5474 width=8)
-> Sort (cost=10126.95..10126.95 rows=5474 width=8)
-> Seq Scan on kbwords w1 (cost=0.00..9787.02 rows=5474
width=8)

Please e-mail suggestions to justinlong(at)strategicnetwork(dot)org(dot) Thanks!

To see the code in action, visit
http://www.strategicnetwork.org/index.asp?loc=kb

Browse pgsql-sql by date

  From Date Subject
Next Message Roberto Mello 2001-02-16 18:31:25 PL/PgSQL FOR syntax
Previous Message Steve Meynell 2001-02-16 14:41:09 Help Retrieving Latest Record