From: | "Poul L(dot) Christiansen" <poulc(at)cs(dot)auc(dot)dk> |
---|---|
To: | Mitch Vincent <mitch(at)venux(dot)net> |
Cc: | pgsql-general(at)postgresql(dot)org |
Subject: | Re: full text indexing |
Date: | 2000-10-03 13:24:38 |
Message-ID: | 39D9DE16.C9290A2D@cs.auc.dk |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-general |
Mitch Vincent wrote:
>
> > Hi
> >
> > In my PostgreSQL database I have a lot of newspaper articles (size:
> > 100mb now, growing beyond 1gb within few months).
> > I wan't to use full text indexing so that users can search the articles
> > with a keyword and have the results in less than one second.
> >
> > How do I accomplish that?
>
> It's complicated :-)
>
> > Does PostgreSQL have this feature?
>
> Nope.
>
> > Which 3rd party indexing tools are available that easily interoperate
> > with PostgreSQL?
>
> There is some code in contrib called FTI (Full Text Index - no doubt).. I
> re-wrote it for my uses but ended up not using because I was doing so many
> sorts and joins, it made a scan (even an index scan) of a table with 3
> million rows in it very slow. However id you were just searching the keyword
> table itself it was very, very fast.
>
> The FTI trigger in the contrib breaks the words down to 2 letter bits (for
> substring searching) -- mine doesn't, it only indexes whole words without
> duplicates and looks at a list of words not to index (words like a, an, the,
> anything else you want -- I think it has the 300 most used English words in
> there already)
>
> It's drawback is speed, it does take a few seconds to index on INSERT (and
> UPDATE) -- that's if your text fields are 30ish k (very close to the max PG
> can store)..
>
> I would say you're going to run into the 32k limit pretty quick with
> newspaper articles -- I index resumes and I've run into it many times (32k
> of text really isn't all that much)..
I split the articles up. When pg 7.1 is out, I will merge them together
again :)
>
> Anyway, I'll try and get that trigger together that I did and send it to the
> PG guys to see if it's worthy of being added to contrib -- I'll send you a
> copy in private if you'd like. Note: I think there have been more people to
> re-write that trigger, I haven't seen anything else though..
>
Where can i find the FTI code?
I looked in ftp://ftp.postgresql.org/pub/contrib/ , but no luck :(
Thanks for the reply,
Poul L. Christiansen
From | Date | Subject | |
---|---|---|---|
Next Message | Peter Mount | 2000-10-03 13:30:24 | Re: Re: JDBC Performance |
Previous Message | chris markiewicz | 2000-10-03 13:07:11 | RE: trouble with trigger/function??? |