From: | "Derek Barrett" <derekbarrett(at)graffiti(dot)net> |
---|---|
To: | <andrew(at)catalyst(dot)net(dot)nz> |
Cc: | pgsql-novice(at)postgresql(dot)org |
Subject: | Re: Question regarding contrib/fulltextindexing |
Date: | 2002-07-14 22:41:02 |
Message-ID: | 20020714224102.22447.qmail@graffiti.net |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-novice |
Thanks Andrew, that solution seems to be easier in my mind.
Okay, so I create a lookup table for the description field.
CREATE TABLE lookup_description (
pk integer
id integer
search_word varchar(50)
)
(pk is the primary key, id being a foreign key)
(By the way, any recommendation in sizing the search_word field? Are there strings that are large enough that aren't worth indexing?)
My user INSERTS the following string:
"The quick brown fox jumped over the moon and another fox."
In my code, I use a for loop, and take the text string and put it into an array, and fill up the lookup_description table. Of course, I will create a noisewords filter to remove words like the, a, an from this list. What about duplicate words? Should those be filtered out as well? In this example, fox is duplicated. I would assume that leaving in the duplicates, might be useful later if I decide to implement a relevency-type of search engine. (Rank the results based on how many times, "fox" is found).
TABLE lookup_description
pk id search_word
-- -- -----------
1 1 quick
2 1 brown
3 1 fox
4 1 jumped
5 1 over
6 1 moon
7 1 another
8 1 fox
Then I can create an index on the search_word column. Later when I do my SELECT query, I will join this lookup query to the main query.
Is that the idea?
Derek
----- Original Message -----
From: Andrew McMillan <andrew(at)catalyst(dot)net(dot)nz>
Date: 14 Jul 2002 12:16:59 +1200
To: Derek Barrett <derekbarrett(at)graffiti(dot)net>
Subject: Re: [NOVICE] Question regarding contrib/fulltextindexing
> On Sun, 2002-07-14 at 08:53, Derek Barrett wrote:
> >
> > In my situation, I need to match exact words, so I've used regular expressions to search on a varchar(10000) field:
> >
> > SELECT *
> > FROM table
> > WHERE description ~* ('[^a-zA-Z0-9]($keyword[$x])[^a-zA-Z0-9]');
> >
> > Would this module still be useful in my situation?
>
> I was doing word search with a modified version of the fulltextindex
> code, but in the end I found it easier to write a perl program to do
> build the index table - the trigger approach was more work to manage
> than it seemed it should be.
>
> Regards,
> Andrew.
> --
> --------------------------------------------------------------------
> Andrew @ Catalyst .Net.NZ Ltd, PO Box 11-053, Manners St, Wellington
> WEB: http://catalyst.net.nz/ PHYS: Level 2, 150-154 Willis St
> DDI: +64(4)916-7201 MOB: +64(21)635-694 OFFICE: +64(4)499-2267
> Are you enrolled at http://schoolreunions.co.nz/ yet?
>
>
--
_______________________________________________
Get your free email from http://www.graffiti.net
Powered by Outblaze
From | Date | Subject | |
---|---|---|---|
Next Message | Andrew McMillan | 2002-07-14 23:39:12 | Re: Question regarding contrib/fulltextindexing |
Previous Message | Norman Khine | 2002-07-14 21:41:00 | Re: Multiple table insert using a CSV list as the datasource |