From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Jeevan Chalke <jeevan(dot)chalke(at)enterprisedb(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Invalid byte sequence for encoding "UTF8", caused due to non wide-char-aware downcase_truncate_identifier() function on WINDOWS
Date: 2011-06-09 17:22:42
Message-ID: 11025.1307640162@sss.pgh.pa.us
Lists: pgsql-hackers
Robert Haas <robertmhaas(at)gmail(dot)com> writes:
> On Thu, Jun 9, 2011 at 11:17 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> Hmm ... while the above is easy enough to do in the backend, where we
>> can look at pg_database_encoding_max_length, we have also got instances
>> of this coding pattern in src/port/pgstrcasecmp.c. It's a lot less
>> obvious how to make the test in frontend environments. Thoughts anyone?
> I'm not sure if this helps at all, but an awful lot of those tests are
> against hard-coded strings that are known to contain only ASCII
> characters. Is there some way we can optimize this for that case?
For the places where we're just looking for a match to a fixed all-ASCII
string, an ASCII-only downcasing would be sufficient, and would
eliminate the whole problem. But I doubt all the callers fall into that
class.
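
For concreteness, a minimal sketch of what such an ASCII-only comparison could look like (this is an illustration of the idea, not the code in src/port/pgstrcasecmp.c): only the bytes 'A'..'Z' are folded, so the bytes of a multibyte character never pass through a locale-dependent tolower() and cannot be mangled regardless of the client's encoding or locale.

```c
#include <stddef.h>

/*
 * Fold only plain ASCII uppercase letters; leave every other byte,
 * including high-bit bytes belonging to multibyte characters, untouched.
 */
static unsigned char
ascii_tolower(unsigned char ch)
{
    if (ch >= 'A' && ch <= 'Z')
        return ch - 'A' + 'a';
    return ch;
}

/*
 * Case-insensitive comparison that is independent of locale and encoding,
 * suitable only when at least one side is known to be all-ASCII.
 */
int
ascii_strcasecmp(const char *s1, const char *s2)
{
    for (;;)
    {
        unsigned char ch1 = ascii_tolower((unsigned char) *s1++);
        unsigned char ch2 = ascii_tolower((unsigned char) *s2++);

        if (ch1 != ch2)
            return (int) ch1 - (int) ch2;
        if (ch1 == 0)
            return 0;           /* reached the terminator on both sides */
    }
}
```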
What I'm particularly worried about at the moment is whether we are
assuming anywhere that the frontend side can duplicate the backend's
identifier downcasing behavior. That seems like a complete morass,
because (1) they might not have the same locale, (2) they might not
have the same encoding, (3) even if they do, the "same" locale is known
to behave differently on different platforms.
regards, tom lane