Re: Text search dictionary vs. the C locale

From: Gmail <robjsargent(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: twoflower <standa(dot)kurik(at)gmail(dot)com>, pgsql-general(at)postgresql(dot)org, Oleg Bartunov <obartunov(at)gmail(dot)com>, Teodor Sigaev <teodor(at)sigaev(dot)ru>
Subject: Re: Text search dictionary vs. the C locale
Date: 2017-07-02 17:11:12
Message-ID: AC6859E0-28C0-410B-A823-A318B57C4A47@gmail.com
Lists: pgsql-general

> On Jul 2, 2017, at 10:06 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>
> twoflower <standa(dot)kurik(at)gmail(dot)com> writes:
>> I am having problems creating an Ispell-based text search dictionary for
>> Czech language.
>
>> Issuing the following command:
>
>> create text search dictionary czech_ispell (
>> template = ispell,
>> dictfile = czech_ispell,
>> affFile = czech_ispell
>> );
>
>> ends with
>
>> ERROR: syntax error
>> CONTEXT: line 252 of configuration file
>> "/usr/share/postgresql/9.6/tsearch_data/czech_ispell.affix": " . > TŘIA
>
>> The dictionary files are in UTF-8. The database cluster was initialized with
>> initdb --locale=C --encoding=UTF8
>
> Presumably the problem is that the dictionary file parsing functions
> reject anything that doesn't satisfy t_isalpha() (unless it matches
> t_isspace()) and in C locale that's not going to accept very much.
>
> I wonder why we're doing it like that. It seems like it'd often be
> useful to load dictionary files that don't match the database's
> prevailing locale. Do we really need the t_isalpha tests, or would
> it be good enough to assume that anything that isn't t_isspace is
> part of a word?
>
> regards, tom lane
>
What about punctuation? Would it then count as part of a word as well?
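
To make the question concrete, here is a rough Python sketch (not PostgreSQL code) comparing the two rules discussed above. It assumes that Python's ASCII-only bytes.isalpha() and bytes.isspace() are a fair stand-in for isalpha()/isspace() in the C locale. The names strict_ok and relaxed_ok are hypothetical and are not the real t_isalpha()/t_isspace().

```python
# Illustrative only: bytes.isalpha() and bytes.isspace() are ASCII-only,
# used here as an assumed stand-in for isalpha()/isspace() in the C locale.

def strict_ok(token: bytes) -> bool:
    """Strict rule: every byte of the token must be an ASCII letter."""
    return token.isalpha()


def relaxed_ok(token: bytes) -> bool:
    """Relaxed rule: the token is non-empty and contains no whitespace byte."""
    return len(token) > 0 and not any(bytes([b]).isspace() for b in token)


# "TŘIA" comes from the error message; the others are for comparison.
samples = ["TRIA", "TŘIA", ".", ">"]

# Map each sample to (accepted by strict rule, accepted by relaxed rule).
results = {
    s: (strict_ok(s.encode("utf-8")), relaxed_ok(s.encode("utf-8")))
    for s in samples
}

for s, (strict, relaxed) in results.items():
    print(f"{s!r}: strict={strict} relaxed={relaxed}")
```

Under the relaxed rule "." and ">" pass along with "TŘIA", so the parser would presumably need some other way to tell affix-file syntax from word characters.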
