From: | Emre Hasegeli <emre(at)hasegeli(dot)com> |
---|---|
To: | Artur Zakirov <a(dot)zakirov(at)postgrespro(dot)ru> |
Cc: | pgsql-hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: FTS Configuration option |
Date: | 2016-10-12 12:08:25 |
Message-ID: | CAE2gYzzdqjeCpPk-BU1AWkFsn1yZocBWpAYd1WLA-EFy13Ozgg@mail.gmail.com |
Lists: | pgsql-hackers |
> => ALTER TEXT SEARCH CONFIGURATION multi_conf
> ALTER MAPPING FOR asciiword, asciihword, hword_asciipart,
> word, hword, hword_part
> WITH german_ispell (JOIN), english_ispell, simple;
I have had something like this in mind since I dealt with FTS for a
Turkish real estate listing application. Being able to pipe the output of
some dictionaries is a nice feature we have had since 9.0, but it is not
always sufficient. I think it is wrong to decide this on a per-dictionary
basis. Something slightly more complicated, which connects dictionaries
to each other in parallel or in series, might be more useful.
My problem was related to the special characters in Turkish (ç, ğ, ı,
ö, ü). It is very common to just type the similar-looking 7-bit characters
(c, g, i, o, u) instead. The unaccent extension changes them as
desired and passes the altered words to the subsequent dictionary
when the configuration is changed like this:
> ALTER TEXT SEARCH CONFIGURATION turkish
> ALTER MAPPING FOR word, hword, hword_part
> WITH unaccent, turkish_stem;
However, the stemmer then doesn't do a good job on those words, because
the changed characters are important for the language. What I really
needed was something like this:
> ALTER TEXT SEARCH CONFIGURATION turkish
> ALTER MAPPING FOR asciiword, asciihword, hword_asciipart, word, hword, hword_part
> WITH (fix_mistyped_characters AND (turkish_hunspell OR turkish_stem) AND unaccent);
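
For anyone who wants to reproduce the unaccent-then-stem behaviour described above, here is a minimal sketch using only existing syntax. It assumes the unaccent contrib module is available and the database encoding is UTF-8. The configuration name turkish_unaccent and the sample word are arbitrary choices for illustration, and the sketch works on a copy so the built-in turkish configuration stays untouched.

```sql
-- Assumption: the unaccent contrib module is installed on the server.
CREATE EXTENSION IF NOT EXISTS unaccent;

-- Hypothetical scratch configuration, copied from the built-in one.
CREATE TEXT SEARCH CONFIGURATION turkish_unaccent (COPY = turkish);

-- unaccent is a filtering dictionary, so its output is passed on
-- to turkish_stem instead of ending the dictionary chain.
ALTER TEXT SEARCH CONFIGURATION turkish_unaccent
    ALTER MAPPING FOR word, hword, hword_part
    WITH unaccent, turkish_stem;

-- Inspect what each dictionary does with the word on its own.
SELECT ts_lexize('unaccent', 'çiçekler');
SELECT ts_lexize('turkish_stem', 'çiçekler');

-- Compare stemming the original word with stemming the unaccented one.
SELECT to_tsvector('turkish', 'çiçekler')          AS stem_only,
       to_tsvector('turkish_unaccent', 'çiçekler') AS unaccent_then_stem;
```

Comparing the two columns of the last query on real listing text should show where the stemmer is being fed words whose significant characters have already been replaced.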