Re: Escaped backslash in SQL constant

From: "CN" <cnliou9(at)fastmail(dot)fm>
To: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: Escaped backslash in SQL constant
Date: 2005-12-24 04:14:19
Message-ID: 1135397659.19476.250481019@webmail.messagingengine.com
Lists: pgsql-general

> No, I'm suggesting that it shouldn't be let loose on Big5 data when it
> evidently hasn't a clue about that encoding. The byte in question
> *is not* a backslash, it's not even an independent character; and so
> changing it on the assumption that it is logically a backslash simply
> breaks the data.

Would you please enlighten me about the behavior of the backend? Why is it that

SET CLIENT_ENCODING TO Big5;
INSERT INTO y VALUES ('A\134B');

stores

A\B

while

INSERT INTO y VALUES ('y\134na');
--"y\" and "na" are two Big5 characters.

stores

y\134na

instead of

y\na
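
To make the byte-level issue in the quoted paragraph concrete, here is a minimal Python sketch. It assumes Python's built-in "big5" codec and uses the character 許 (Big5 bytes 0xB3 0x5C) as an illustrative example that is not taken from the data above; both function names are hypothetical. It only shows how escaping raw bytes differs from escaping characters, not what the backend or any client library actually does.

```python
def escape_bytes_naively(raw: bytes) -> bytes:
    """Double every 0x5C byte without knowing the encoding (hypothetical helper)."""
    return raw.replace(b"\\", b"\\\\")


def escape_text(text: str, encoding: str) -> bytes:
    """Double only real backslash characters, then encode (hypothetical helper)."""
    return text.replace("\\", "\\\\").encode(encoding)


if __name__ == "__main__":
    raw = "許".encode("big5")  # two bytes; the second one is 0x5C
    print(raw.hex())                               # bytes of the original character
    print(escape_bytes_naively(raw).hex())         # trail byte mistaken for a backslash
    print(escape_text("許", "big5").hex())          # character left untouched
    print(escape_text("A\\B", "big5"))             # a genuine backslash is doubled
```

The byte-oriented version appends an extra 0x5C after the two-byte character, which is the kind of change that breaks the data; the character-oriented version leaves it alone.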

> Your quickest route to a solution may be to avoid Big5 in favor of
> an encoding that is ASCII-safe, such as UTF8. You can feed that through
> code that only understands ASCII with much less risk than an encoding
> where second and later bytes might look like ASCII.

Are you suggesting that I implement middleware that translates
Big5 input to UTF8 and then escapes the result before sending it to
PostgreSQL?

SET CLIENT_ENCODING TO UTF8;
[BIG5 string from user] --> [middleware] --> [UTF8] --> [escaped UTF8]
--> PostgreSQL (initdb with -E UNICODE)
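
A rough sketch of what such a middleware step might look like in Python, assuming the input really is valid Big5 and that backslash doubling plus quote doubling is the escaping wanted. The function name big5_to_escaped_utf8 is hypothetical, and this is only an illustration of the pipeline above, not a vetted escaping routine; a driver's parameter binding, where available, would normally be preferable to manual escaping.

```python
def big5_to_escaped_utf8(raw_big5: bytes) -> bytes:
    """Decode Big5 input, escape at the character level, re-encode as UTF-8.

    Hypothetical helper: the escaping rules here are an assumption.
    """
    text = raw_big5.decode("big5")          # [BIG5 string] --> text
    escaped = (
        text.replace("\\", "\\\\")          # double genuine backslashes only
            .replace("'", "''")             # double single quotes
    )
    return escaped.encode("utf-8")          # --> [escaped UTF8]


if __name__ == "__main__":
    # 許 is an illustrative character whose Big5 trail byte is 0x5C.
    print(big5_to_escaped_utf8("許".encode("big5")).hex())
    print(big5_to_escaped_utf8(b"A\\B"))
```

Because every byte of a multibyte UTF-8 sequence is 0x80 or above, the encoded output contains a 0x5C byte only where the text has a real backslash.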

Best regards,

CN

