From: | Bruce Momjian <bruce(at)momjian(dot)us> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | J Smith <dark(dot)panda+lists(at)gmail(dot)com>, Florian Pflug <fgp(at)phlo(dot)org>, pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: unaccent extension missing some accents |
Date: | 2011-11-10 21:15:34 |
Message-ID: | 201111102115.pAALFYx26956@momjian.us |
Lists: | pgsql-hackers |

Tom Lane wrote:
> J Smith <dark(dot)panda+lists(at)gmail(dot)com> writes:
> > I've attached a patch against master for unaccent.c that uses swscanf
> > along with char2wchar and wchar2char instead of sscanf directly to
> > initialize the unaccent extension and it appears to fix the problem in
> > both the master and 9.1 branches.
>
> swscanf doesn't seem like an acceptable approach: it's a function that
> is relied on nowhere else in PG, so it adds new portability risks of its
> own. It doesn't exist on some platforms that we support (like the one
> I'm typing this message on) and there's no real good reason to assume
> that it's not broken in its own ways on others.
>
> If you really want to pursue this, I'd suggest parsing the line
> manually, perhaps via strchr searches for \t and \n. It likely wouldn't
> be very many more lines than what you've got here.
>
> However, the bigger picture is that OS X's UTF8 locales are broken
> through-and-through, and most of their other problems are not feasible
> to work around. So basically you can't use them for anything
> interesting, and it's not clear that it's worth putting any time into
> solving individual problems. In the particular case here, the issue
> presumably is that sscanf is relying on isspace() ... but we rely on
> isspace() directly, in quite a lot of places, so how much is it going
> to fix to dodge it right here?
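
For reference, the manual parsing Tom suggests (strchr searches for \t and \n) might look roughly like the sketch below. It assumes each rules line has the form "source<TAB>target<NEWLINE>"; the function name split_rule_line is hypothetical and does not come from the patch or from unaccent.c. Because strchr compares raw bytes, and no byte of a multibyte UTF-8 sequence equals tab or newline, the split does not depend on the locale's isspace() behavior.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper, not taken from unaccent.c: split one rules line of
 * the assumed form "source<TAB>target<NEWLINE>" in place.
 *
 * On success, returns 1 and points *src and *trg at NUL-terminated fields
 * inside the caller's buffer. Returns 0 if there is no tab or if either
 * field is empty.
 */
int
split_rule_line(char *line, char **src, char **trg)
{
    char *tab = strchr(line, '\t');
    char *end;

    /* Reject lines with no tab or with an empty source field. */
    if (tab == NULL || tab == line)
        return 0;

    /* Terminate the source field where the tab was. */
    *tab = '\0';

    /* Chop the trailing newline, if present, to terminate the target. */
    end = strchr(tab + 1, '\n');
    if (end != NULL)
        *end = '\0';

    /* Reject lines with an empty target field. */
    if (tab[1] == '\0')
        return 0;

    *src = line;
    *trg = tab + 1;
    return 1;
}
```

This only handles a single tab separator; a real rules-file reader would likely also need to tolerate runs of spaces or tabs and a trailing carriage return.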

If Apple's low-level code came from FreeBSD and NetBSD, how did they get
so broken?

--
Bruce Momjian <bruce(at)momjian(dot)us> http://momjian.us
EnterpriseDB http://enterprisedb.com
+ It's impossible for everything to be true. +