Re: Tsearch2 Dutch snowball stemmer in PG8.1

From: Oleg Bartunov <oleg(at)sai(dot)msu(dot)su>
To: Alban Hertroys <a(dot)hertroys(at)magproductions(dot)nl>
Cc: Postgres General <pgsql-general(at)postgresql(dot)org>
Subject: Re: Tsearch2 Dutch snowball stemmer in PG8.1
Date: 2007-10-03 15:12:27
Message-ID: Pine.LNX.4.64.0710031911200.3304@sn.sai.msu.ru
Lists: pgsql-general

On Wed, 3 Oct 2007, Alban Hertroys wrote:

> Alban Hertroys wrote:
>> The only odd thing is that to_tsvector('dutch', 'some dutch text') now
>> returns '|' for stop words...
>>
>> For example:
>> select to_tsvector('nederlands', 'De beste stuurlui staan aan wal');
>> to_tsvector
>> ------------------------------------------------
>> '|':1,5 'bes':2 'wal':6 'staan':4 'stuurlui':3
>
> I found the cause. The stop words list I found contained comments
> prefixed by '|' signs. Removing the contents and recreating the database
> solved the problem. Just updating the reference didn't seem to help...

You need to recreate the tsvector field and its index after changing any dictionaries.
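
As for the stop word list quoted above, the '|' comments can be stripped before the file is installed. A minimal sketch, assuming the downloaded list uses '|' to start a comment that runs to the end of the line and that the dictionary wants one bare word per line; the function name and the sample lines are hypothetical, not part of tsearch2:

```python
def strip_stopword_comments(lines, comment_char="|"):
    """Return the bare stop words from a list with '|'-prefixed comments."""
    words = []
    for line in lines:
        # Drop everything from the comment marker to the end of the line.
        text = line.split(comment_char, 1)[0]
        # A line may hold several words (or none); collect them one by one.
        words.extend(text.split())
    return words


if __name__ == "__main__":
    # Hypothetical sample in the commented style described above.
    sample = [
        " | Dutch stop word list (sample)",
        "de             |  the",
        "aan            |  on, at",
        "",
    ]
    # Print one cleaned word per line, ready to be saved as the stop file.
    print("\n".join(strip_stopword_comments(sample)))
```

Text that was already indexed still carries the '|' lexeme, so the tsvector field and its index have to be rebuilt after swapping in the cleaned file.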

>
> There's undoubtedly some cleaner way to replace the stop words list, but
> at the current stage of our project this was the simplest to achieve.
>
>

Regards,
Oleg
_____________________________________________________________
Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru)
Sternberg Astronomical Institute, Moscow University, Russia
Internet: oleg(at)sai(dot)msu(dot)su, http://www.sai.msu.su/~megera/
phone: +007(495)939-16-83, +007(495)939-23-83
