%tsearch gendict snowball spanish

From: David Gama Rodriguez <david(dot)gama(at)inegi(dot)gob(dot)mx>
To: pgsql-general(at)postgresql(dot)org
Subject: %tsearch gendict snowball spanish
Date: 2007-03-12 04:52:11
Message-ID: 1173675131.19035.0.camel@localhost.localdomain
Lists: pgsql-general

Hi everyone!!

I have an implementation of tsearch2 with Spanish stemmers. I updated
postgres to version 8.1.8 and was going to reinstall the tsearch2
contrib. Everything was fine until I tried to compile the Spanish
stemmers with gendict:

$ ./config.sh -n sp -s -p spanish_ISO_8859_1 -v -C'Snowball stemmer for Spanish'

Dictname: 'sp'
Snowball stemmer: yes
Has init method: yes
Function prefix: spanish_ISO_8859_1
Source files: stem.c
Header files: stem.h
Object files: stem.o dict_snowball.o
Comment: 'Snowball stemmer for Spanish'
Directory: ../../dict_sp
Build directory... ok
Build Makefile... ok
Build dict_sp.sql.in... ok
Copy source and header files... ok
Build sub-include header... ok
Build Snowball stemmer... ok
Build README.sp... ok
All is done

After that, I get this error:

stem.c: En la función 'spanish_ISO_8859_1_close_env':
stem.c:1092: error: demasiados argumentos para la función 'SN_close_env'
make: *** [stem.o] Error 1

In English: error: too many arguments to function 'SN_close_env'

So I searched the list for this error and found some related posts.
One of them says that I have to apply a patch to get an updated
Snowball API. I downloaded the patch from:
http://www.sai.msu.su/~megera/postgres/gist/tsearch/V2/gin_tsearch2_81.gz

and applied the patch this way:

$ cd PG_SRC/
$ patch -b -p0 < gin_tsearch2_81

Everything went fine and I recompiled tsearch2.

So I tried to compile the stemmer again:
$ ./config.sh -n sp -s -p spanish_ISO_8859_1 -v -C'Snowball stemmer for Spanish'
$ cd ../../dict_sp
$ make

stem.c: En la función 'spanish_ISO_8859_1_close_env':
stem.c:1092: error: demasiados argumentos para la función 'SN_close_env'
make: *** [stem.o] Error 1

And again I get the same error: too many arguments...

So my question is: why do I still get the same error after applying the
patch? What did I do wrong?
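
The compiler message itself hints at where to look: stem.c calls
SN_close_env with more arguments than the api.h used by the build
declares. A minimal sketch for comparing the two sides follows. The
helper name count_close_env_args is hypothetical, and it assumes the
declaration or call sits on a single line; adjust it to your tree.

```shell
#!/bin/sh
# count_close_env_args FILE
# Hypothetical helper: prints how many comma-separated arguments appear in
# the first single-line SN_close_env(...) occurrence found in FILE.
count_close_env_args() {
    # Keep only the text between the parentheses of SN_close_env(...),
    # take the first match, then count the comma-separated fields.
    sed -n 's/.*SN_close_env[[:space:]]*(\([^)]*\)).*/\1/p' "$1" \
        | head -n 1 \
        | awk -F',' '{ print NF }'
}
```

Running it once on the Snowball api.h that the build picks up (for
example the copy under PG_SRC/contrib/tsearch2/snowball/, which is an
assumption about the layout) and once on the generated stem.c shows
whether the header is older than the stemmer source: a smaller number
for the header than for stem.c matches the "too many arguments" error.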

I tried several approaches to solve this issue. I am posting here the
one that finally worked for me:

1.- Download the PostgreSQL 8.1.8 sources
2.- Download the C implementation of Snowball
http://snowball.tartarus.org/dist/libstemmer_c.tgz
3.- Download the patch to update Snowball API
http://www.sai.msu.su/~megera/postgres/gist/tsearch/V2/gin_tsearch2_81.gz

4.- Unpack postgresql sources
5.- Unpack Snowball C
6.- Unpack the patch
7.- Apply patch with:
$ cp gin_tsearch2_81 PG_SRC/
$ cd PG_SRC
$ patch -b -p0 < gin_tsearch2_81
8.- Do configure
$ ./configure
9.- Copy the Snowball API to Tsearch2 dir
$ cp libstemmer_c/runtime/* PG_SRC/contrib/tsearch2/snowball/
10.- Copy english and russian stemmers
$ cp stem_ISO_8859_1_english.c PG_SRC/contrib/tsearch2/snowball/english_stem.c
$ cp stem_ISO_8859_1_english.h PG_SRC/contrib/tsearch2/snowball/english_stem.h
$ cp stem_KOI8_R_russian.c PG_SRC/contrib/tsearch2/snowball/russian_stem.c
$ cp stem_KOI8_R_russian.h PG_SRC/contrib/tsearch2/snowball/russian_stem.h
$ cp stem_UTF_8_russian.h PG_SRC/contrib/tsearch2/snowball/russian_stem_UTF8.h
$ cp stem_UTF_8_russian.c PG_SRC/contrib/tsearch2/snowball/russian_stem_UTF8.c

11.- In english_stem.c, russian_stem.c and russian_stem_UTF8.c, change
the line:
#include "../runtime/header.h"
to:
#include "header.h"

12.- Compile tsearch2
$ make
$ make install

13.- Copy spanish stemmer
$ cp libstemmer_c/src_c/stem_ISO_8859_1_spanish.c PG_SRC/contrib/tsearch2/gendict/stem.c
$ cp libstemmer_c/src_c/stem_ISO_8859_1_spanish.h PG_SRC/contrib/tsearch2/gendict/stem.h

14.- Go to the gendict directory and make the same substitution as in
step 11 in the stem.c file
15.- Do:
$ ./config.sh -n sp -s -p spanish_ISO_8859_1 -v -C'Snowball stemmer for Spanish'

16.- Go to ../../dict_sp and compile
$ make
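
The include substitution from steps 11 and 14 can also be scripted. A
minimal sketch, assuming the copied stemmer sources contain the line
#include "../runtime/header.h" as shipped in libstemmer_c. The function
name fix_header_include is hypothetical, and sed -i.bak leaves a .bak
backup next to each edited file.

```shell
#!/bin/sh
# fix_header_include FILE...
# Hypothetical helper: rewrites  #include "../runtime/header.h"
# to  #include "header.h"  in each FILE, keeping FILE.bak as a backup.
fix_header_include() {
    for f in "$@"; do
        # The dots are escaped so they match literally, not any character.
        sed -i.bak 's|#include "\.\./runtime/header\.h"|#include "header.h"|' "$f"
    done
}
```

From PG_SRC/contrib/tsearch2/snowball this would be called as
fix_header_include english_stem.c russian_stem.c russian_stem_UTF8.c,
and from the gendict directory as fix_header_include stem.c.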

With this there were no errors and it finally works. I still have many
doubts about this way of adding tsearch2 and the Spanish Snowball
stemmer, such as:

Is it safe to add this to a production DB?
Is this compilation "fine", I mean, is it correct?
Will I run into issues when I put this to work?

I know this is difficult to say, but I am asking because you have more
experience with this.

Cheers mates!!



BTW I hope this mini HOW_TO helps others
