From: | Penty Wenngren <penty(dot)wenngren(at)dgc(dot)se> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | pgsql-bugs(at)postgresql(dot)org |
Subject: | Re: BUG #3730: Creating a swedish dictionary fails |
Date: | 2007-11-09 00:44:49 |
Message-ID: | 20071109004449.GA65896@picard.dgc.se |
Lists: | pgsql-bugs |
On Thu, Nov 08, 2007 at 05:21:17PM -0500, Tom Lane wrote:
> Penty Wenngren <penty(dot)wenngren(at)dgc(dot)se> writes:
> > I used iconv to convert svenska.aff and svenska.datalist (from
> > iswedish-1.2.1) to UTF-8. The converted files can be found at:
> > http://www.lederhosen.org/swedish.affix
> > http://www.lederhosen.org/swedish.dict
>
> I think the reason it's failing right there is that that line is the
> first affix rule containing a non-ASCII letter, and the rules are
> supposed to only contain letters and certain specific punctuation.
> I suspect you are working in a locale that doesn't think à is a
> letter --- check lc_ctype.
>
It doesn't seem to make any difference. The first try was done from a
terminal that didn't care much for UTF-8, but that is fixed now and I
still get the same result. Could it be that iconv's conversion is
broken then, or that I did something terribly wrong in the conversion
process (iconv -f ISO-8859-1 -t UTF-8 svenska.aff > swedish.affix)?
$ echo $LANG
sv_SE.UTF-8
$ echo $LC_CTYPE
sv_SE.UTF-8
$ psql test
Välkommen till psql 8.3beta2, den interaktiva PostgreSQL-terminalen.
Skriv: \copyright för upphovsrättsinformation
\h för hjälp om SQL-kommandon
\? för hjälp om psql-kommandon
\g eller avsluta med semikolon för att köra en fråga
\q för att avsluta
test=# CREATE TEXT SEARCH DICTIONARY swedish_ispell (
    TEMPLATE = ispell,
    DictFile = swedish,
    AffFile = swedish,
    StopWords = swedish);
FEL: syntax error at line 219 of affix file "/usr/local/share/postgresql/tsearch_data/swedish.affix"
I also tried converting the file again, this time from a terminal that
handles UTF-8, in case that had an effect, but the resulting affix file
looks the same.
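One way to rule out a broken conversion is to check that the converted
file is valid UTF-8 and to look at the line named in the error message.
This is only a rough sketch: check_affix is a hypothetical helper name,
and it assumes an iconv that exits non-zero on invalid input.

```shell
# Hypothetical helper: report whether a file is valid UTF-8, then print one line.
check_affix() {
    file=$1
    line=$2
    # Converting UTF-8 to UTF-8 fails if the input contains invalid sequences.
    if iconv -f UTF-8 -t UTF-8 "$file" > /dev/null 2>&1; then
        echo "valid UTF-8"
    else
        echo "invalid UTF-8"
    fi
    # Print the line the error message points at.
    sed -n "${line}p" "$file"
}

# Usage, with the path and line number taken from the error message:
# check_affix /usr/local/share/postgresql/tsearch_data/swedish.affix 219
```

If the suggestion to check lc_ctype refers to the server-side setting
rather than the client environment (an assumption on my part), running
SHOW lc_ctype; inside psql would display that value.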
I found a post in the archives regarding a similar problem:
http://archives.postgresql.org/pgsql-hackers/2007-08/msg00825.php
It seems editing the affix file and manually removing some lines at
least partially solved the problem in that case.
// Penty
--
Penty Wenngren
DGC Solutions AB