Re: %tsearch gendict snowball spanish

From: Oleg Bartunov <oleg(at)sai(dot)msu(dot)su>
To: David Gama Rodriguez <david(dot)gama(at)inegi(dot)gob(dot)mx>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: %tsearch gendict snowball spanish
Date: 2007-03-12 09:33:47
Message-ID: Pine.LNX.4.64.0703121231270.400@sn.sai.msu.ru
Lists: pgsql-general

David,

you need http://www.sai.msu.su/~megera/postgres/gist/tsearch/V2/tsearch_snowball_82.gz,
the patch for the 8.2 release, which updates the snowball API.
You need it only for the new stemmers from the snowball site!

I'm not sure whether it will apply to 8.1.8.

Oleg
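
Whether a patch applies to a given source tree can be checked without
modifying any files. The following is a minimal sketch, not part of the
original reply. It assumes GNU patch (for --dry-run), assumes the -p0 strip
level used elsewhere in this thread is also right for this patch, and uses a
hypothetical helper name, patch_applies.

```shell
#!/bin/sh
# patch_applies is a hypothetical helper: it succeeds only if the patch
# would apply cleanly to the given source tree. Nothing is written.
patch_applies() {
    src_dir=$1
    patch_file=$2
    # --dry-run tests every hunk but leaves the tree untouched.
    # -p0 is an assumption taken from the patch command quoted in this thread.
    patch --dry-run -p0 -d "$src_dir" < "$patch_file" > /dev/null 2>&1
}

# Example (assumes the patch was unpacked first with: gunzip tsearch_snowball_82.gz)
# if patch_applies PG_SRC tsearch_snowball_82; then
#     echo "patch applies cleanly"
# else
#     echo "patch does not apply to this tree"
# fi
```

If the check fails, the tree is left exactly as it was, so it is safe to
try against 8.1.8 sources before deciding how to proceed.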

On Sun, 11 Mar 2007, David Gama Rodriguez wrote:

> Hi everyone!!
>
> I have an implementation of tsearch2 with Spanish stemmers. I updated
> postgres to version 8.1.8 and was going to reinstall the tsearch2
> contrib. Everything was fine until I tried to compile the Spanish
> stemmers with gendict:
>
> $ ./config.sh -n sp -s -p spanish_ISO_8859_1 -v -C'Snowball stemmer for Spanish'
>
> Dictname: 'sp'
> Snowball stemmer: yes
> Has init method: yes
> Function prefix: spanish_ISO_8859_1
> Source files: stem.c
> Header files: stem.h
> Object files: stem.o dict_snowball.o
> Comment: 'Snowball stemmer for Spanish'
> Directory: ../../dict_sp
> Build directory... ok
> Build Makefile... ok
> Build dict_sp.sql.in... ok
> Copy source and header files... ok
> Build sub-include header... ok
> Build Snowball stemmer... ok
> Build README.sp... ok
> All is done
>
> After that I get this error:
>
> stem.c: En la función 'spanish_ISO_8859_1_close_env':
> stem.c:1092: error: demasiados argumentos para la función 'SN_close_env'
> make: *** [stem.o] Error 1
>
> In English: error: too many arguments to function 'SN_close_env'
>
> So I searched the list for this error and found some related posts.
> One of these posts says that I have to apply a patch to get an updated
> snowball API. I downloaded the patch from:
> http://www.sai.msu.su/~megera/postgres/gist/tsearch/V2/gin_tsearch2_81.gz
>
> and applied it this way:
>
> $ cd PG_SRC/
> $ patch -b -p0 < gin_tsearch2_81
>
> Everything went fine and I recompiled tsearch2.
>
> So I tried to compile the stemmer again:
> $ ./config.sh -n sp -s -p spanish_ISO_8859_1 -v -C'Snowball stemmer for Spanish'
> $ cd ../../dict_sp
> $ make
>
> stem.c: En la función 'spanish_ISO_8859_1_close_env':
> stem.c:1092: error: demasiados argumentos para la función 'SN_close_env'
> make: *** [stem.o] Error 1
>
> And again I get the same error: too many arguments.....
>
> So my question is: why do I get the same error after applying the
> patch? What did I do wrong?
>
> I tried several approaches to solve this issue. I post here the one
> that finally worked for me:
>
> 1.- Download Postgresql-8.1.8 sources
> 2.- Download the C implementation of snowball
> http://snowball.tartarus.org/dist/libstemmer_c.tgz
> 3.- Download the patch to update Snowball API
> http://www.sai.msu.su/~megera/postgres/gist/tsearch/V2/gin_tsearch2_81.gz
>
> 4.- Unpack postgresql sources
> 5.- Unpack Snowball C
> 6.- Unpack the patch
> 7.- Apply patch with:
> $ cp gin_tsearch2_81 PG_SRC/
> $ cd PG_SRC
> $ patch -b -p0 < gin_tsearch2_81
> 8.- Do configure
> $ ./configure
> 9.- Copy the Snowball API to Tsearch2 dir
> $ cp libstemmer_c/runtime/* PG_SRC/contrib/tsearch2/snowball/
> 10.- Copy english and russian stemmers
> $ cp stem_ISO_8859_1_english.c PG_SRC/contrib/tsearch2/snowball/english_stem.c
> $ cp stem_ISO_8859_1_english.h PG_SRC/contrib/tsearch2/snowball/english_stem.h
> $ cp stem_KOI8_R_russian.c PG_SRC/contrib/tsearch2/snowball/russian_stem.c
> $ cp stem_KOI8_R_russian.h PG_SRC/contrib/tsearch2/snowball/russian_stem.h
> $ cp stem_UTF_8_russian.h PG_SRC/contrib/tsearch2/snowball/russian_stem_UTF8.h
> $ cp stem_UTF_8_russian.c PG_SRC/contrib/tsearch2/snowball/russian_stem_UTF8.c
>
> 11.- In english_stem.c, russian_stem.c and russian_stem_UTF8.c, change
> the line:
> #include "../runtime/header.h"
> to:
> #include "header.h"
>
> 12.- Compile tsearch2
> $ make
> $ make install
>
> 13.- Copy spanish stemmer
> $ cp libstemmer_c/src_c/stem_ISO_8859_1_spanish.c PG_SRC/contrib/tsearch2/gendict/stem.c
> $ cp libstemmer_c/src_c/stem_ISO_8859_1_spanish.h PG_SRC/contrib/tsearch2/gendict/stem.h
>
> 14.- Go to the gendict directory and make the same substitution as in
> step 11 in the stem.c file
> 15.- Do:
> $ ./config.sh -n sp -s -p spanish_ISO_8859_1 -v -C'Snowball stemmer for Spanish'
>
> 16.- Go to ../../dict_sp and compile
> $ make
>
> Now there are no errors and it finally works. I still have several
> doubts about this way of adding tsearch2 and the Spanish snowball
> stemmer:
>
> Is it safe to add this to a production DB?
> Is this build "fine", I mean, is it correct?
> Will I run into issues when I put this to work?
>
> I know this is difficult to say, but I ask because you have more
> experience with this.
>
> Cheers mates!!
>
>
>
> BTW I hope this mini HOW_TO helps others
>
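
Steps 11 and 14 of the quoted how-to edit the #include line by hand in each
stemmer source. The same substitution can be scripted; the following is a
minimal sketch rather than anything shipped with tsearch2. The helper name
fix_header_include is hypothetical, and it assumes the line to replace reads
exactly #include "../runtime/header.h".

```shell
#!/bin/sh
# fix_header_include is a hypothetical helper: for each file given, it
# rewrites the Snowball runtime include so that header.h is looked up in
# the same directory as the stemmer source.
fix_header_include() {
    for f in "$@"; do
        # Keep a .orig backup, then rewrite only the matching #include line.
        cp "$f" "$f.orig" &&
        sed 's|^#include "\.\./runtime/header\.h"|#include "header.h"|' \
            "$f.orig" > "$f" || return 1
    done
}

# Example, using the paths from the steps quoted above:
# fix_header_include \
#     PG_SRC/contrib/tsearch2/snowball/english_stem.c \
#     PG_SRC/contrib/tsearch2/snowball/russian_stem.c \
#     PG_SRC/contrib/tsearch2/snowball/russian_stem_UTF8.c \
#     PG_SRC/contrib/tsearch2/gendict/stem.c
```

The backup copies make it easy to diff each file afterwards and confirm
that only the include line changed.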

Regards,
Oleg
_____________________________________________________________
Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru)
Sternberg Astronomical Institute, Moscow University, Russia
Internet: oleg(at)sai(dot)msu(dot)su, http://www.sai.msu.su/~megera/
phone: +007(495)939-16-83, +007(495)939-23-83
