From: | Alvaro Herrera <alvherre(at)commandprompt(dot)com> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | penty(dot)wenngren(at)dgc(dot)se, pgsql-bugs(at)postgresql(dot)org, Oleg Bartunov <oleg(at)sai(dot)msu(dot)su>, Teodor Sigaev <teodor(at)sigaev(dot)ru> |
Subject: | Re: BUG #3730: Creating a swedish dictionary fails |
Date: | 2007-11-09 19:10:15 |
Message-ID: | 20071109191015.GC7161@alvh.no-ip.org |
Lists: | pgsql-bugs |
Tom Lane wrote:
> Alvaro Herrera <alvherre(at)commandprompt(dot)com> writes:
> > I am wondering if the newline being included in the token could be
> > causing a problem.
>
> Nope. I traced through it and the problem is that char2wchar() is
> completely brain-dead: at some places it thinks that "len" is the
> length of the output wchar array, and at others it thinks that "len"
> is the number of bytes in the input. In particular, _t_isalpha()
> fails completely for any multibyte character, because the pnstrdup
> call truncates the character to 1 byte.

Ah, that explains it. I was reading that code too and did not
understand what was going on.

> After looking at the callers I'm inclined to think that the only
> safe way to implement this routine is to change its API to provide
> both counts. Comments?

+1

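To make the "both counts" idea concrete, here is a minimal sketch of what such a signature could look like. It is not the actual char2wchar() code or its eventual fix: the function name utf8_to_wchar, the parameter names, and the hand-rolled UTF-8 decoding are all assumptions for illustration, and it assumes wchar_t is wide enough to hold the decoded code point. The point is only that the output capacity (tolen, in wide characters) and the input length (fromlen, in bytes) are passed separately, so neither can be mistaken for the other.

```c
#include <assert.h>
#include <stddef.h>
#include <wchar.h>

/*
 * Hypothetical sketch, not PostgreSQL's char2wchar().
 *
 * Decodes at most fromlen BYTES of UTF-8 from "from" into at most
 * tolen - 1 WIDE CHARACTERS in "to", and always NUL-terminates "to".
 * Returns the number of wide characters written (terminator excluded),
 * or (size_t) -1 if the input is malformed or ends mid-character.
 */
size_t
utf8_to_wchar(wchar_t *to, size_t tolen, const char *from, size_t fromlen)
{
    const unsigned char *src = (const unsigned char *) from;
    size_t      in = 0;         /* bytes consumed from the input */
    size_t      out = 0;        /* wide characters produced */

    assert(to != NULL && from != NULL && tolen > 0);

    while (in < fromlen && out + 1 < tolen)
    {
        unsigned char c = src[in];
        unsigned long cp;       /* code point being assembled */
        size_t      need;       /* total bytes in this sequence */
        size_t      i;

        /* The lead byte determines the sequence length. */
        if (c < 0x80)
        {
            need = 1;
            cp = c;
        }
        else if ((c & 0xE0) == 0xC0)
        {
            need = 2;
            cp = c & 0x1F;
        }
        else if ((c & 0xF0) == 0xE0)
        {
            need = 3;
            cp = c & 0x0F;
        }
        else if ((c & 0xF8) == 0xF0)
        {
            need = 4;
            cp = c & 0x07;
        }
        else
        {
            to[out] = L'\0';
            return (size_t) -1; /* invalid lead byte */
        }

        /* A character cut short by the byte count is an error. */
        if (need > fromlen - in)
        {
            to[out] = L'\0';
            return (size_t) -1;
        }

        /* Fold in the continuation bytes (10xxxxxx). */
        for (i = 1; i < need; i++)
        {
            if ((src[in + i] & 0xC0) != 0x80)
            {
                to[out] = L'\0';
                return (size_t) -1;
            }
            cp = (cp << 6) | (src[in + i] & 0x3F);
        }

        to[out++] = (wchar_t) cp;
        in += need;
    }

    to[out] = L'\0';
    return out;
}
```

With this shape, passing the two-byte UTF-8 encoding of "å" with fromlen = 2 yields one wide character, while passing the same bytes with fromlen = 1 (the truncation described above) is reported as an error instead of being silently misclassified.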
--
Alvaro Herrera http://www.flickr.com/photos/alvherre/
Licensee shall have no right to use the Licensed Software
for productive or commercial use. (Licencia de StarOffice 6.0 beta)