Re: tsearch2: stop words and stemming separate?

From: Oleg Bartunov <oleg(at)sai(dot)msu(dot)su>
To: Sushant Sinha <sushant354(at)gmail(dot)com>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: tsearch2: stop words and stemming separate?
Date: 2008-01-27 05:38:53
Message-ID: Pine.LNX.4.64.0801270824470.26876@sn.sai.msu.ru
Lists: pgsql-general

On Sat, 26 Jan 2008, Sushant Sinha wrote:

> I want to remove stop words but do not want to stem the words. Is there
> an interface in tsearch2 that allows me to do this?
>
> Basically I am trying to implement spelling corrections and do not want
> to correct stop words.

Create a custom dictionary based on simple (or just add stop words to simple)
and use it before an English stemmer configured with NO stop words:

-- Copy of 'simple' that drops the stop words listed in contrib/english.stop
INSERT INTO pg_ts_dict
    (SELECT 'remove_stopwords', dict_init,
            'contrib/english.stop',
            dict_lexize,
            'simple dictionary with stop words'
     FROM pg_ts_dict
     WHERE dict_name = 'simple');

-- Copy of 'en_stem' with an empty init option, i.e. no stop-word file
INSERT INTO pg_ts_dict
    (SELECT 'en_stem_no_stopwords', dict_init,
            '',
            dict_lexize,
            'english stemmer without stop words'
     FROM pg_ts_dict
     WHERE dict_name = 'en_stem');
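
The two inserts above only register the dictionaries; they still have to be
attached to a configuration before to_tsvector will use them. A minimal
sketch of that step follows. It assumes the contrib tsearch2 schema
(pg_ts_cfgmap with columns ts_name, tok_alias, dict_name), a configuration
named 'default', and the Latin-word token aliases of the default parser;
adjust all three to your installation. Because the goal is stop-word removal
without stemming, only remove_stopwords is mapped here. As far as I know,
simple accepts every token that is not a stop word, so a dictionary listed
after it in the same mapping would not be consulted.

```sql
-- Assumption: the configuration is called 'default'; substitute your own.
-- Route Latin-word tokens through the stop-word-only dictionary.
UPDATE pg_ts_cfgmap
   SET dict_name = '{remove_stopwords}'
 WHERE ts_name = 'default'
   AND tok_alias IN ('lword', 'lhword', 'lpart_hword');

-- Check the dictionary on its own: a stop word should come back as an
-- empty array, any other word lowercased but unstemmed.
SELECT lexize('remove_stopwords', 'the');
SELECT lexize('remove_stopwords', 'Running');

-- Check the whole configuration on a sample sentence.
SELECT to_tsvector('default', 'the quick foxes are running');
```

If stemming is wanted elsewhere, en_stem_no_stopwords can be mapped in a
separate configuration instead of being chained after remove_stopwords.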

>
> Thanks,
> -Sushant.

Regards,
Oleg
_____________________________________________________________
Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru)
Sternberg Astronomical Institute, Moscow University, Russia
Internet: oleg(at)sai(dot)msu(dot)su, http://www.sai.msu.su/~megera/
phone: +007(495)939-16-83, +007(495)939-23-83
