From: | Oleg Bartunov <obartunov(at)gmail(dot)com> |
---|---|
To: | "Sven R(dot) Kunze" <srkunze(at)tbz-pariv(dot)de> |
Cc: | Postgres General <pgsql-general(at)postgresql(dot)org> |
Subject: | Re: [to_tsvector] German Compound Words |
Date: | 2015-05-28 15:24:29 |
Message-ID: | CAF4Au4zS=usY=_azoYPeuiYsRbNVhATRQZCYjFWgF8W7cUrrvw@mail.gmail.com |
Lists: | pgsql-general |
Have you checked the output of ts_debug()?
=# select * from ts_debug('english', 'messages');
   alias   |   description   |  token   |  dictionaries  |  dictionary  | lexemes
-----------+-----------------+----------+----------------+--------------+----------
 asciiword | Word, all ASCII | messages | {english_stem} | english_stem | {messag}
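
The same check can be pointed at the configuration described in the quoted message. What follows is only a sketch under assumptions: the configuration name german_compound is made up for illustration, the mapping is reconstructed from the description ("german_hunspell first and then german_stem"), and the german_hunspell dictionary from the quoted CREATE TEXT SEARCH DICTIONARY statement is assumed to exist already.

```sql
-- Hypothetical configuration name; german_hunspell is the dictionary
-- defined in the quoted message and is assumed to exist already.
CREATE TEXT SEARCH CONFIGURATION german_compound (COPY = german);

-- Consult german_hunspell first, then fall back to german_stem
-- for the word-like token types that the german configuration stems.
ALTER TEXT SEARCH CONFIGURATION german_compound
    ALTER MAPPING FOR asciiword, asciihword, hword_asciipart,
                      word, hword, hword_part
    WITH german_hunspell, german_stem;

-- Show which dictionary recognized the token and which lexemes it produced.
SELECT token, dictionaries, dictionary, lexemes
FROM ts_debug('german_compound', 'wasserkraft');
```

If the dictionary column reports german_stem rather than german_hunspell, the Hunspell dictionary did not recognize the word and the token fell through to the stemmer.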
On Thu, May 28, 2015 at 2:05 PM, Sven R. Kunze <srkunze(at)tbz-pariv(dot)de> wrote:
> Hi everybody,
>
> what do I need to do in order to enable compound word handling in
> PostgreSQL's tsvector implementation?
>
> I run an Ubuntu 14.04 machine with PostgreSQL 9.3, have installed the
> package hunspell-de-de, and have already created a new dictionary as
> described here:
> http://www.postgresql.org/docs/9.3/static/textsearch-dictionaries.html#TEXTSEARCH-ISPELL-DICTIONARY
>
> CREATE TEXT SEARCH DICTIONARY german_hunspell (
> TEMPLATE = ispell,
> DictFile = de_de,
> AffFile = de_de,
> StopWords = german
> );
>
> Furthermore, I created a new test text search configuration (copied from
> german) and updated every token-type mapping that used the german_stem
> dictionary so that it uses german_hunspell first and then german_stem.
>
> However, to_tsvector still does not work for compound words such as:
>
> wasserkraft -> wasserkraft, kraft
> schifffahrt -> schifffahrt, fahrt
> blindflansch -> blindflansch, flansch
>
> etc.
>
>
> What have I done wrong here?
>
> --
> Sven R. Kunze
> TBZ-PARIV GmbH, Bernsdorfer Str. 210-212, 09126 Chemnitz
> Tel: +49 (0)371 33714721, Fax: +49 (0)371 5347920
> e-mail: srkunze(at)tbz-pariv(dot)de
> web: www.tbz-pariv.de
>
> Geschäftsführer: Dr. Reiner Wohlgemuth
> Sitz der Gesellschaft: Chemnitz
> Registergericht: Chemnitz HRB 8543
>