Re: [tsvector] to_tsvector called multiple times

From: Oleg Bartunov <obartunov(at)gmail(dot)com>
To: "Sven R(dot) Kunze" <srkunze(at)tbz-pariv(dot)de>
Cc: Postgres General <pgsql-general(at)postgresql(dot)org>
Subject: Re: [tsvector] to_tsvector called multiple times
Date: 2015-05-26 09:05:50
Message-ID: CAF4Au4zaXvdLPaqOtoawB=bnGPO=3jdZO5EDLm4jAc8cCFvJtQ@mail.gmail.com
Lists: pgsql-general

You can raise the stemmer issue with the Snowball project
(http://snowball.tartarus.org/). Meanwhile, you can put a small personal
dictionary holding such exceptions in front of the stemmer, for example
one based on the synonym template, containing the line

system system
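
A minimal sketch of how such an exception dictionary could be wired in.
The names german_exceptions and german_fixed are hypothetical, and it
assumes the line above has been saved as german_exceptions.syn in
$SHAREDIR/tsearch_data/:

```sql
-- Hypothetical synonym dictionary; reads
-- $SHAREDIR/tsearch_data/german_exceptions.syn
CREATE TEXT SEARCH DICTIONARY german_exceptions (
    TEMPLATE = synonym,
    SYNONYMS = german_exceptions
);

-- Work on a copy so the built-in 'german' configuration stays untouched.
CREATE TEXT SEARCH CONFIGURATION german_fixed (COPY = german);

-- Consult the exception dictionary first; tokens it does not recognize
-- fall through to the Snowball stemmer.
ALTER TEXT SEARCH CONFIGURATION german_fixed
    ALTER MAPPING FOR asciiword, asciihword, hword_asciipart,
                      word, hword, hword_part
    WITH german_exceptions, german_stem;

-- 'system' should now be kept as is instead of being stemmed to 'syst'.
SELECT to_tsvector('german_fixed', 'system');
```

Because the synonym dictionary comes first in the mapping, only the
words listed in the file bypass the stemmer; everything else is stemmed
as before.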

Oleg

On Tue, May 26, 2015 at 11:18 AM, Sven R. Kunze <srkunze(at)tbz-pariv(dot)de>
wrote:

> Hi everybody,
>
> the following stemming results made me curious:
>
> select to_tsvector('german', 'systeme'); > 'system':1
> select to_tsvector('german', 'systemes'); > 'system':1
> select to_tsvector('german', 'systems'); > 'system':1
> select to_tsvector('german', 'systemen'); > 'system':1
> select to_tsvector('german', 'system'); > 'syst':1
>
>
> First of all, this seems to be a bug in the German stemmer. Where can I
> fix it?
>
> Second, and more importantly, as I understand it, the stemmed version of a
> word should be considered normalized. That is, all other versions of that
> stem should be mapped to it as well. The interesting problem here is that
> PostgreSQL maps the stem itself ('system') to a completely different stem
> ('syst').
>
> Should a stem not remain stable even when to_tsvector is called on it
> multiple times?
>
> --
> Sven R. Kunze
> TBZ-PARIV GmbH, Bernsdorfer Str. 210-212, 09126 Chemnitz
> Tel: +49 (0)371 33714721, Fax: +49 (0)371 5347920
> e-mail: srkunze(at)tbz-pariv(dot)de
> web: www.tbz-pariv.de
>
> Geschäftsführer: Dr. Reiner Wohlgemuth
> Sitz der Gesellschaft: Chemnitz
> Registergericht: Chemnitz HRB 8543
