From: | Ben <bench(at)silentmedia(dot)com> |
---|---|
To: | Oleg Bartunov <oleg(at)sai(dot)msu(dot)su> |
Cc: | Teodor Sigaev <teodor(at)sigaev(dot)ru>, pgsql-general(at)postgresql(dot)org |
Subject: | Re: making tsearch2 dictionaries |
Date: | 2004-02-17 16:24:25 |
Message-ID: | 1077035064.22989.16.camel@purple |
Lists: | pgsql-general |
On Tue, 2004-02-17 at 03:15, Oleg Bartunov wrote:
> Do you want '100' or 'hundred' will be fully equivalent ? So,
> if you search '100' you will find document with 'hundred'. Interesting,
> that you will find '123', because '123' will be 'one hundred twenty three'.
Yeah, for a general case of documents I'm not sure how accurate it would
make things, but I'm trying to index music artist names and song titles,
where I'd get things like "3 Dog Night".... or is that "Three Dog
Night"? :)
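To make the "3 Dog Night" versus "Three Dog Night" case concrete, here is a minimal sketch in Python of the kind of normalization being discussed: digits are spelled out so both spellings produce the same tokens. This is not tsearch2 code; `number_to_words` and `normalize` are hypothetical helper names, and the sketch only handles 0 through 999.

```python
import re
from typing import List

ONES = ["zero", "one", "two", "three", "four", "five", "six", "seven",
        "eight", "nine", "ten", "eleven", "twelve", "thirteen", "fourteen",
        "fifteen", "sixteen", "seventeen", "eighteen", "nineteen"]
TENS = ["", "", "twenty", "thirty", "forty", "fifty",
        "sixty", "seventy", "eighty", "ninety"]


def number_to_words(n: int) -> List[str]:
    """Spell out an integer in the range 0..999 as a list of English words."""
    if not 0 <= n <= 999:
        raise ValueError("this sketch only covers 0..999")
    if n < 20:
        return [ONES[n]]
    if n < 100:
        tens, ones = divmod(n, 10)
        return [TENS[tens]] + ([ONES[ones]] if ones else [])
    hundreds, rest = divmod(n, 100)
    words = [ONES[hundreds], "hundred"]
    if rest:
        words += number_to_words(rest)
    return words


def normalize(title: str) -> List[str]:
    """Lowercase a title, split it into tokens, and spell out small numbers."""
    tokens = re.findall(r"[a-z]+|[0-9]+", title.lower())
    out: List[str] = []
    for tok in tokens:
        if tok.isdigit() and int(tok) <= 999:
            out.extend(number_to_words(int(tok)))
        else:
            out.append(tok)  # words (and out-of-range numbers) pass through
    return out


print(normalize("3 Dog Night"))      # ['three', 'dog', 'night']
print(normalize("Three Dog Night"))  # ['three', 'dog', 'night']
print(normalize("123"))              # ['one', 'hundred', 'twenty', 'three']
```

The last line shows the side effect Oleg points out: once '123' is expanded, a search for 'hundred' matches it too.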
> What's the problem ? You may configure which dictionaries and in what order
> should be used for given type of token (pg_ts_cfgmap table).
> Aha, I got your problem:
> Once word is recognized by synonym dictionary it will not pass to
> next dictionary ! This is how tsearch2 is working with any dictionary.
Yep, that's my problem. :) And it seems that if I could pass the normal
words into an ispell dictionary before passing them on to the en_stem
dictionary, I'd get spell checking for free. Unless there's a better way
to give "did you mean: <your search spelled correctly>?" results....?
I know doing this would increase the size of the generated tsvector,
but for my case, where what I'm indexing is generally only a few words
anyway, that's not an issue. As it is, I'm already going to get rid of
the stop words file, so that I can actually find things like "The Who."
How hard do you think it would be to change up the behavior to make this
happen?
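The difference between the two behaviors can be sketched with a small Python simulation. It is only a model of the idea, not tsearch2's implementation: the toy dictionaries (`synonym_dict`, `stem_dict`) and the two chain functions are hypothetical names, and each dictionary returns `None` to mean "token not recognized".

```python
from typing import Callable, List, Optional

Dictionary = Callable[[str], Optional[List[str]]]


def synonym_dict(token: str) -> Optional[List[str]]:
    """Toy synonym dictionary: recognizes only the tokens in its table."""
    table = {"3": ["three"]}
    return table.get(token)


def stem_dict(token: str) -> Optional[List[str]]:
    """Crude stand-in for a stemmer: strips a trailing 's', accepts everything."""
    if len(token) > 3 and token.endswith("s"):
        return [token[:-1]]
    return [token]


def first_match(token: str, chain: List[Dictionary]) -> List[str]:
    """Behavior described in the thread: stop at the first dictionary that recognizes the token."""
    for dictionary in chain:
        lexemes = dictionary(token)
        if lexemes is not None:
            return lexemes
    return []


def pass_through(token: str, chain: List[Dictionary]) -> List[str]:
    """Requested behavior: ask every dictionary and keep all distinct lexemes."""
    collected: List[str] = []
    for dictionary in chain:
        for lexeme in dictionary(token) or []:
            if lexeme not in collected:
                collected.append(lexeme)
    return collected


chain = [synonym_dict, stem_dict]
print(first_match("3", chain))      # ['three']
print(pass_through("3", chain))     # ['three', '3']
print(pass_through("dogs", chain))  # ['dog']
```

The second result has two lexemes where the first has one, which is the size increase mentioned above; for titles of only a few words that cost stays small.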
> What do you want from parser ?
I want to be able to recognize symbols, such as the degree (°) and
vulgar half (½) symbols.
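As a rough illustration of what recognizing those symbols could look like, here is a Python sketch that uses Unicode general categories to split text into word and symbol tokens. It does not reflect how the tsearch2 parser works; `tokenize` is a hypothetical helper, and treating categories `S*` and `No` as "symbol" is an assumption made for this example.

```python
import unicodedata
from typing import List, Tuple

DEGREE = "\u00b0"  # DEGREE SIGN
HALF = "\u00bd"    # VULGAR FRACTION ONE HALF


def tokenize(text: str) -> List[Tuple[str, str]]:
    """Split text into ('word', ...) and ('symbol', ...) tokens.

    Letters (L*) and decimal digits (Nd) accumulate into words.
    Symbols (S*) and other numerics (No) become one-character symbol tokens.
    Everything else (spaces, punctuation) acts as a separator.
    """
    tokens: List[Tuple[str, str]] = []
    current = ""
    for ch in text:
        cat = unicodedata.category(ch)
        if cat.startswith("L") or cat == "Nd":
            current += ch
            continue
        if current:
            tokens.append(("word", current))
            current = ""
        if cat == "No" or cat.startswith("S"):
            tokens.append(("symbol", ch))
    if current:
        tokens.append(("word", current))
    return tokens


print(unicodedata.name(DEGREE), unicodedata.category(DEGREE))
print(unicodedata.name(HALF), unicodedata.numeric(HALF))
print(tokenize("451" + DEGREE + " and 2" + HALF + " Men"))
```

Once a symbol arrives as its own token type, a dictionary can decide what to do with it, for example mapping the half symbol to the word 'half'.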