pgsql: Fix some wide-character bugs in the text-search parser.

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: pgsql-committers(at)postgresql(dot)org
Subject: pgsql: Fix some wide-character bugs in the text-search parser.
Date: 2014-02-01 23:28:10
Message-ID: E1W9jys-0004z2-0o@gemulon.postgresql.org
Lists: pgsql-committers

Fix some wide-character bugs in the text-search parser.

In p_isdigit and other character class test functions generated by the
p_iswhat macro, the code path for non-C locales with multibyte encodings
contained a bogus pointer cast that would accidentally fail to malfunction
if types wchar_t and wint_t have the same width. Apparently that is true
on most platforms, but not on recent Cygwin releases. Remove the cast,
as it seems completely unnecessary (I think it arose from a false analogy
to the need to cast to unsigned char when dealing with the <ctype.h>
functions). Per bug #8970 from Marco Atzeri.
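The width dependence can be illustrated with a small sketch. This is not the wparser_def.c code: the typedefs narrow_wchar and wide_wint and both helper names are hypothetical stand-ins that fix wchar_t at 16 bits and wint_t at 32 bits, so the effect is reproducible on any platform. memcpy is used to imitate what a load through the cast pointer would fetch, without relying on undefined behavior.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-ins: a 16-bit "wchar_t" and a 32-bit "wint_t". */
typedef uint16_t narrow_wchar;
typedef uint32_t wide_wint;

/*
 * Imitates *(wint_t *) (wstr + pos): fetches sizeof(wide_wint) bytes
 * starting at the character, so the neighboring character leaks into
 * the result.  The caller must ensure wstr[pos + 1] exists.
 */
static wide_wint
read_through_wider_pointer(const narrow_wchar *wstr, size_t pos)
{
    wide_wint v;

    memcpy(&v, wstr + pos, sizeof v);
    return v;
}

/*
 * Imitates the cast-free form: load exactly one character and let the
 * ordinary integer conversion widen the value.
 */
static wide_wint
read_by_value(const narrow_wchar *wstr, size_t pos)
{
    return wstr[pos];
}
```

With the buffer { 0x0031, 0x0041, 0 }, read_by_value(buf, 0) yields 0x31, while read_through_wider_pointer(buf, 0) yields a value that mixes in 0x0041 (which half it lands in depends on byte order). When the two types have the same width, both helpers would return the same value, which is why the cast went unnoticed on most platforms.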

In the same functions, the code path for C locale with a multibyte encoding
simply ANDed each wide character with 0xFF before passing it to the
corresponding <ctype.h> function. This could result in false positive
answers for some non-ASCII characters, so use a range test instead.
Noted by me while investigating Marco's complaint.
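A minimal sketch of the false positive, again not the actual patch. The function names are hypothetical, the wide character is assumed to hold a Unicode code point, and the 0x7F cutoff is an assumption about what the range test looks like; the message above only says that a range test replaced the mask.

```c
#include <ctype.h>

/* Old approach: keep only the low byte, then ask <ctype.h>. */
static int
is_digit_masked(unsigned int wc)
{
    return isdigit((int) (wc & 0xFF)) != 0;
}

/*
 * Range-test approach (assumed cutoff): anything outside 7-bit ASCII
 * is never in the class when the locale is C.
 */
static int
is_digit_ranged(unsigned int wc)
{
    if (wc > 0x7F)
        return 0;
    return isdigit((int) wc) != 0;
}
```

For example, U+0131 (LATIN SMALL LETTER DOTLESS I) has low byte 0x31, the ASCII code for '1', so is_digit_masked(0x0131) reports a digit while is_digit_ranged(0x0131) does not. Plain ASCII input such as '7' is classified the same way by both.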

Also, remove some useless though not actually buggy maskings and casts
in the hand-coded p_isalnum and p_isalpha functions, which evidently
got tested a bit more carefully than the macro-generated functions.

Branch
------
REL8_4_STABLE

Details
-------
http://git.postgresql.org/pg/commitdiff/56f5d3424ed8e8c32a239ec8b7fee13beed6a24b

Modified Files
--------------
src/backend/tsearch/wparser_def.c | 20 ++++++++++++--------
1 file changed, 12 insertions(+), 8 deletions(-)
