From: | Wolfgang Winkler <wolfgang(dot)winkler(at)digital-concepts(dot)com> |
---|---|
To: | Artur Zakirov <a(dot)zakirov(at)postgrespro(dot)ru>, obartunov(at)gmail(dot)com |
Cc: | Postgres General <pgsql-general(at)postgresql(dot)org> |
Subject: | Re: Using a german affix file for compound words |
Date: | 2016-01-28 17:36:15 |
Message-ID: | 56AA518F.6070904@digital-concepts.com |
Lists: | pgsql-general |
I'm using 9.4.5 as well, and I used exactly the same iconv commands that
you posted below.
Are there any encoding options that have to be set right? The database
encoding is set to UTF8.
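One way to rule out the file itself would be to check that the installed affix file really is valid UTF-8. The following is only a sketch: `is_utf8` is a hypothetical helper name, and the path is the one from the error message quoted below. It relies on iconv exiting with a non-zero status when the input contains bytes that are invalid in the source encoding.

```shell
# Hypothetical helper: succeeds only if the given file is valid UTF-8.
# iconv exits non-zero when it hits a byte sequence that is invalid
# in the declared source encoding (here UTF-8).
is_utf8() {
    iconv -f UTF-8 -t UTF-8 "$1" > /dev/null 2>&1
}

# Path taken from the error message in this thread; adjust as needed.
affix=/usr/local/pgsql/share/tsearch_data/german.affix

if [ -f "$affix" ]; then
    if is_utf8 "$affix"; then
        echo "$affix: valid UTF-8"
    else
        echo "$affix: not valid UTF-8 (possibly still ISO-8859-1)"
    fi
fi
```

This only detects a file that was never converted. A file that was converted twice is still valid UTF-8, so a passing check does not prove that the umlauts are correct.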
ww
On 2016-01-28 at 17:34, Artur Zakirov wrote:
> On 28.01.2016 18:57, Oleg Bartunov wrote:
>>
>>
>> On Thu, Jan 28, 2016 at 6:04 PM, Wolfgang Winkler
>> <wolfgang(dot)winkler(at)digital-concepts(dot)com
>> <mailto:wolfgang(dot)winkler(at)digital-concepts(dot)com>> wrote:
>>
>> Hi!
>>
>> We have a problem importing a compound dictionary file for
>> German.
>>
>> I downloaded the files here:
>>
>> http://www.sai.msu.su/~megera/postgres/gist/tsearch/V2/dicts/ispell/ispell-german-compound.tar.gz
>>
>> and converted them to utf-8 with iconv. The affix file seems ok when
>> opened with an editor.
>>
>> When I try to create or alter a dictionary to use this affix file, I
>> get the following error:
>>
>> alter TEXT SEARCH DICTIONARY german_ispell (
>> DictFile = german,
>> AffFile = german,
>> StopWords = german
>> );
>> ERROR: syntax error
>> CONTEXT: line 224 of configuration file
>> "/usr/local/pgsql/share/tsearch_data/german.affix": " ABE >
>> -ABE,äBIN
>> "
>>
>> This is the first occurrence of an umlaut character in the file.
>> I've found a few postings where the same file is used, e.g.:
>>
>> http://www.postgresql.org/message-id/flat/556C1411(dot)4010608(at)tbz-pariv(dot)de#556C1411(dot)4010608@tbz-pariv.de
>>
>> This user was able to import the file. Am I missing something
>> obvious?
>>
>
> What version of PostgreSQL do you use?
>
> I tested this dictionary on PostgreSQL 9.4.5. I downloaded the files
> from the link and executed these commands:
>
> iconv -f ISO-8859-1 -t UTF-8 german.aff -o german2.affix
> iconv -f ISO-8859-1 -t UTF-8 german.dict -o german2.dict
>
> I renamed them to german.affix and german.dict and moved them to the
> tsearch_data directory. The following commands ran without errors:
>
> -> create text search dictionary german_ispell (
> Template = ispell,
> DictFile = german,
> AffFile = german,
> Stopwords = german
> );
> DROP TEXT SEARCH DICTIONARY
>
> -> select ts_lexize('german_ispell', 'test');
> ts_lexize
> -----------
> {test}
> (1 row)
>
--
*Wolfgang Winkler*
Geschäftsführung
wolfgang(dot)winkler(at)digital-concepts(dot)com
mobil +43.699.19971172
dc:*büro*
digital concepts Novak Winkler OG
Software & Design
Landstraße 68, 5. Stock, 4020 Linz
www.digital-concepts.com <http://www.digital-concepts.com>
tel +43.732.997117.72
tel +43.699.1997117.2
Firmenbuchnummer: 192003h
Firmenbuchgericht: Landesgericht Linz