From: | "Dmitry Koterov" <dmitry(at)koterov(dot)ru> |
---|---|
To: | "Oleg Bartunov" <oleg(at)sai(dot)msu(dot)su> |
Cc: | "Postgres General" <pgsql-general(at)postgresql(dot)org> |
Subject: | Re: How to switch off Snowball stemmer for tsearch2? |
Date: | 2007-08-23 13:10:01 |
Message-ID: | d7df81620708230610p5d20d009md959e6885c5c0aa3@mail.gmail.com |
Lists: | pgsql-general |
>
> write your own dictionary, which implements any logic you need. In your
> case it's just a wrapper around ispell that returns the original string
> rather than the stem. See example
> http://www.sai.msu.su/~megera/postgres/fts/doc/fts-intdict-xmp.html
> and russian article
> http://www.sai.msu.su/~megera/postgres/talks/fts_pgsql_intro.html#ftsdict
Ah, now I understand!
You are suggesting that I write a small Postgres contrib module (a new
dictionary) in C and implement all the logic in it.
That seems like a rather complex solution for such a simple task (excluding
surnames from stemming), but it could be implemented, of course.
> > Of course I can create all word forms of all Russian names using ispell
> > and then subtract this full list from the Ispell dictionary (so I will
> > remove "Ivan", "Ivanami" etc. from it). But possibly tsearch2 already has
> > this subtraction algorithm.
> >
> >
>
> Don't do that! Just go the plain way.
>
Another method is to generate a single synonym dictionary from all word forms
of Russian names produced by ispell (this set will contain all the suspicious
surnames) and place it before ispell in the dictionary chain. This solution
does not require writing anything in C.
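
To make the synonym-dictionary idea concrete, here is a minimal sketch of how such a file could be generated. It assumes the word forms have already been produced by ispell and are available as a plain list, and that the synonym dictionary reads one whitespace-separated "word replacement" pair per line. Mapping each form to itself, so that the surname comes out unchanged, is my reading of the approach; the function name and the sample data are hypothetical.

```python
def build_synonym_lines(word_forms):
    """Return synonym-file lines that map each word form to itself.

    Assumption: the synonym dictionary expects "word replacement" pairs,
    one per line, separated by whitespace.
    """
    seen = set()
    lines = []
    for form in word_forms:
        token = form.strip().lower()
        # Skip empty entries, duplicates, and anything containing whitespace,
        # since whitespace would break the two-column file format.
        if not token or token in seen or len(token.split()) != 1:
            continue
        seen.add(token)
        lines.append(f"{token} {token}")
    return lines


if __name__ == "__main__":
    # Hypothetical sample; a real list would come from ispell expansion.
    sample_forms = ["Ivan", "Ivanami", "ivan", "  "]
    print("\n".join(build_synonym_lines(sample_forms)))
```

The resulting lines would be saved as the synonym dictionary's file, and that dictionary listed ahead of ispell so that a recognized surname is returned as-is before ispell gets a chance to normalize it.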