From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Jeevan Chalke <jeevan(dot)chalke(at)enterprisedb(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Invalid byte sequence for encoding "UTF8", caused due to non wide-char-aware downcase_truncate_identifier() function on WINDOWS
Date: 2011-06-09 17:22:42
Message-ID: 11025.1307640162@sss.pgh.pa.us
Lists: pgsql-hackers
Robert Haas <robertmhaas(at)gmail(dot)com> writes:
> On Thu, Jun 9, 2011 at 11:17 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> Hmm ... while the above is easy enough to do in the backend, where we
>> can look at pg_database_encoding_max_length, we have also got instances
>> of this coding pattern in src/port/pgstrcasecmp.c. It's a lot less
>> obvious how to make the test in frontend environments. Thoughts anyone?
> I'm not sure if this helps at all, but an awful lot of those tests are
> against hard-coded strings that are known to contain only ASCII
> characters. Is there some way we can optimize this for that case?
For the places where we're just looking for a match to a fixed all-ASCII
string, an ASCII-only downcasing would be sufficient, and would
eliminate the whole problem. But I doubt all the callers fall into that
class.
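
For concreteness, a minimal sketch of what such an ASCII-only comparison could look like (this is an illustration of the idea, not the code in src/port/pgstrcasecmp.c): only the bytes 'A'..'Z' are folded, so the bytes of a multibyte character never pass through a locale-dependent tolower() and cannot be mangled regardless of the client's encoding or locale.

```c
#include <stddef.h>

/*
 * Fold only plain ASCII uppercase letters; leave every other byte,
 * including high-bit bytes belonging to multibyte characters, untouched.
 */
static unsigned char
ascii_tolower(unsigned char ch)
{
    if (ch >= 'A' && ch <= 'Z')
        return ch - 'A' + 'a';
    return ch;
}

/*
 * Case-insensitive comparison that is independent of locale and encoding,
 * suitable only when at least one side is known to be all-ASCII.
 */
int
ascii_strcasecmp(const char *s1, const char *s2)
{
    for (;;)
    {
        unsigned char ch1 = ascii_tolower((unsigned char) *s1++);
        unsigned char ch2 = ascii_tolower((unsigned char) *s2++);

        if (ch1 != ch2)
            return (int) ch1 - (int) ch2;
        if (ch1 == 0)
            return 0;           /* reached the terminator on both sides */
    }
}
```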
What I'm particularly worried about at the moment is whether we are
assuming anywhere that the frontend side can duplicate the backend's
identifier downcasing behavior. That seems like a complete morass,
because (1) they might not have the same locale, (2) they might not
have the same encoding, (3) even if they do, the "same" locale is known
to behave differently on different platforms.
regards, tom lane