Re: [PROPOSAL] Improvements of Hunspell dictionaries support

From:	Emre Hasegeli <emre(at)hasegeli(dot)com>
To:	Artur Zakirov <a(dot)zakirov(at)postgrespro(dot)ru>
Cc:	PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: [PROPOSAL] Improvements of Hunspell dictionaries support
Date:	2015-11-07 14:20:28
Message-ID:	CAE2gYzwom3=11U9G8ZxMT5PLkZrwb12BWzxh4dB3HUd89FOSrg@mail.gmail.com
Lists:	pgsql-hackers

Thank you for working on this.

I tried the patch with a Turkish dictionary [1] that I found on the
Internet. It worked for some words but not for others:

> hasegeli=# create text search dictionary hunspell_tr (template = ispell, dictfile = tr, afffile = tr);
> CREATE TEXT SEARCH DICTIONARY
>
> hasegeli=# select ts_lexize('hunspell_tr', 'tilki'); -- The root "fox"
> ts_lexize
> -----------
> {tilki}
> (1 row)
>
> hasegeli=# select ts_lexize('hunspell_tr', 'tilkinin'); -- Genitive form, affix 3290
> ts_lexize
> -----------
> {tilki}
> (1 row)
>
> hasegeli=# select ts_lexize('hunspell_tr', 'tilkiler'); -- Plural form, affix 4371
> ts_lexize
> -----------
> {tilki}
> (1 row)
>
> hasegeli=# select ts_lexize('hunspell_tr', 'tilkiyi'); -- Accusative form, affix 2646
> ts_lexize
> -----------
>
> (1 row)

The failure seems to be related to the order of the affixes: the
accusative form works if I move affix 2646 to the beginning of the list.
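
To re-check all four forms in one step after reordering the affix file, a
query along these lines may be convenient. It is only a sketch and assumes
the hunspell_tr dictionary created above; the column aliases are arbitrary.

```sql
-- Assumes the hunspell_tr dictionary from the session above already exists.
-- ts_lexize returns NULL when the dictionary does not recognize the word.
SELECT word,
       ts_lexize('hunspell_tr', word) AS lexemes
FROM unnest(ARRAY['tilki', 'tilkinin', 'tilkiler', 'tilkiyi']) AS word;
```

A NULL in the lexemes column marks a form that was not resolved to its root.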

[1] https://tr-spell.googlecode.com/files/dict_aff_5000_suffix_1130000_words.zip
