From: | Emre Hasegeli <emre(at)hasegeli(dot)com> |
---|---|
To: | Artur Zakirov <a(dot)zakirov(at)postgrespro(dot)ru> |
Cc: | PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: [PROPOSAL] Improvements of Hunspell dictionaries support |
Date: | 2015-11-07 14:20:28 |
Message-ID: | CAE2gYzwom3=11U9G8ZxMT5PLkZrwb12BWzxh4dB3HUd89FOSrg@mail.gmail.com |
Lists: | pgsql-hackers |
Thank you for working on this.
I tried the patch with a Turkish dictionary [1] that I found on the
Internet. It worked for some words, but not for others:
> hasegeli=# create text search dictionary hunspell_tr (template = ispell, dictfile = tr, afffile = tr);
> CREATE TEXT SEARCH DICTIONARY
>
> hasegeli=# select ts_lexize('hunspell_tr', 'tilki'); -- The root "fox"
> ts_lexize
> -----------
> {tilki}
> (1 row)
>
> hasegeli=# select ts_lexize('hunspell_tr', 'tilkinin'); -- Genitive form, affix 3290
> ts_lexize
> -----------
> {tilki}
> (1 row)
>
> hasegeli=# select ts_lexize('hunspell_tr', 'tilkiler'); -- Plural form, affix 4371
> ts_lexize
> -----------
> {tilki}
> (1 row)
>
> hasegeli=# select ts_lexize('hunspell_tr', 'tilkiyi'); -- Accusative form, affix 2646
> ts_lexize
> -----------
>
> (1 row)
The failure seems to be related to the order of the affixes: the
accusative form is recognized if I move affix 2646 to the beginning of the list.
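The four lookups above can also be run as a single statement, which makes it easier to re-check the result after reordering the affix file. This is only a sketch: it assumes the hunspell_tr dictionary created above is still in place, and the column alias "lexemes" is just an illustrative name.

```sql
-- Run every test form of "tilki" through the dictionary in one query.
-- Assumes the hunspell_tr dictionary defined earlier in this message.
SELECT word,
       ts_lexize('hunspell_tr', word) AS lexemes
FROM unnest(ARRAY['tilki', 'tilkinin', 'tilkiler', 'tilkiyi']) AS word;
```

A form that the dictionary does not recognize should show up with an empty lexemes column, as 'tilkiyi' does in the session above.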
[1] https://tr-spell.googlecode.com/files/dict_aff_5000_suffix_1130000_words.zip