Re: How to switch off Snowball stemmer for tsearch2?

From: "Dmitry Koterov" <dmitry(at)koterov(dot)ru>
To: "Oleg Bartunov" <oleg(at)sai(dot)msu(dot)su>
Cc: "Postgres General" <pgsql-general(at)postgresql(dot)org>
Subject: Re: How to switch off Snowball stemmer for tsearch2?
Date: 2007-08-23 13:10:01
Message-ID: d7df81620708230610p5d20d009md959e6885c5c0aa3@mail.gmail.com
Lists: pgsql-general

>
> write your own dictionary, which implements any logic you need. In your
> case it's just a wrapper around ispell, which will return the original string,
> not the stem. See the example
> http://www.sai.msu.su/~megera/postgres/fts/doc/fts-intdict-xmp.html
> and the Russian article
> http://www.sai.msu.su/~megera/postgres/talks/fts_pgsql_intro.html#ftsdict

Ah, now I understand you!
You are suggesting writing a small Postgres contrib module (a new dictionary)
in C and implementing all the logic in it.
That seems a rather complex solution for such a simple task (excluding surnames
from stemming), but it could be implemented, of course.

> > Of course I can create all word forms of all Russian names using ispell and
> > then subtract this full list from the ispell dictionary (so I will remove
> > "Ivan", "Ivanami" etc. from it). But possibly tsearch2 already has this
> > subtraction algorithm.
> >
>
> Don't do that! Just go the plain way.
>

Another method is to generate a singular synonym dictionary from all the
word forms of Russian names produced by ispell (this set will contain all the
suspicious surnames) and to add it before ispell. This solution does not
require writing anything in C.
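
A rough sketch of how such a synonym file could be generated, written in
Python purely for illustration. It assumes the synonym dictionary reads one
"word synonym" pair per line, separated by whitespace, and that the word forms
of each name have already been expanded with ispell in a separate step. The
function name, the sample data, and the choice to map every form to a single
base form are hypothetical, not something tsearch2 prescribes.

```python
def build_synonym_lines(forms_by_name):
    """Return synonym-file lines that map each word form to its base name.

    forms_by_name: dict of base name -> iterable of word forms
    (assumed to come from a separate ispell expansion step).
    """
    lines = []
    seen = set()
    for base, forms in sorted(forms_by_name.items()):
        base_lc = base.lower()
        # Include the base form itself so it also bypasses the stemmer.
        for form in sorted({f.lower() for f in forms} | {base_lc}):
            if form in seen:
                # Keep the first mapping if two names share a word form.
                continue
            seen.add(form)
            lines.append(f"{form} {base_lc}")
    return lines


if __name__ == "__main__":
    # Hypothetical sample data; real input would be the ispell expansion.
    sample = {"ivanov": ["ivanova", "ivanovu", "ivanovym"]}
    print("\n".join(build_synonym_lines(sample)))
```

The resulting lines would be saved as the synonym dictionary's file, and that
dictionary placed before ispell so that listed surnames are matched first and
never reach the stemmer.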
