From: | "Richard So" <richso(at)i-cable(dot)com> |
---|---|
To: | <pgsql-bugs(at)postgresql(dot)org> |
Subject: | Re: Multi-byte character bug (resend for clarify) |
Date: | 2002-07-30 18:35:05 |
Message-ID: | 000e01c237f7$d34799d0$0a00a8c0@netrogen.local |
Lists: | pgsql-bugs |
>> Two bugs have been found in the SQL parser and multibyte character support:
>>
>What is the encoding for "chinese char"? You need to give us more info.
By "Chinese" here I mean characters in the BIG5 encoding, which is widely
used in Hong Kong and Taiwan. My setup:
Db encoding: EUC_TW
Client (JDBC / ODBC) encoding: BIG5
JDBC: I supplied the parameter 'charSet=Big5' in the connection string
ODBC: my locale (Chinese Win2000 machine) is Chinese (Taiwan)
Client application: Tomcat4 JSP page (see the attached)
App / Db server: Redhat 7.3 Linux + postgresql (set) 7.2.1-2PGDG (downloaded binary rpm) + Tomcat4
App / Db server locale: zh_TW.Big5
JDBC driver: pgjdbc2.jar
Client machine: Win2000 Chinese (Taiwan) version with SP2 + I.E. (JSP) + Delphi SQL Explorer (ODBC)
Client machine locale: Chinese (Taiwan)
>> 1. 'Problem connecting to database: java.sql.SQLException: ERROR:
>> Invalid EUC_TW character sequence found (0xb27a)' was reported when
>> using the JDBC driver to insert a record; a similar error was reported
>> when using the ODBC driver and psql. Since auto-conversion from client
>> to server should convert the character to a valid EUC_TW character,
>> this is a bug.
>How did you set the auto-conversion settings for psql? I suspect you
>did something wrong with it.
I have checked again: the JDBC and ODBC drivers still report the error
message, but psql does not (maybe, as you said, I followed a wrong
procedure before). However, the problem is still there: why do JDBC and
ODBC still report the error? I only tried a few Chinese words, so other
characters may cause the same problem.
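To illustrate why the server rejects 0xb27a, here is a rough byte-range check. It assumes the usual two-byte ranges (Big5: lead 0x81-0xFE including vendor extensions, trail 0x40-0x7E or 0xA1-0xFE; EUC_TW plane 1: both bytes 0xA1-0xFE). The class and method names are made up for this sketch and are not part of PostgreSQL or the drivers. If the ranges are right, the pair is a plausible Big5 character but cannot be a two-byte EUC_TW character, which would be consistent with the Big5 bytes reaching the server unconverted.

```java
class EncodingRangeCheck {
    // Big5 two-byte form (assumed ranges): lead 0x81-0xFE,
    // trail 0x40-0x7E or 0xA1-0xFE.
    static boolean isBig5Pair(int lead, int trail) {
        boolean leadOk = lead >= 0x81 && lead <= 0xFE;
        boolean trailOk = (trail >= 0x40 && trail <= 0x7E)
                || (trail >= 0xA1 && trail <= 0xFE);
        return leadOk && trailOk;
    }

    // EUC_TW two-byte form (assumed ranges, CNS 11643 plane 1):
    // both bytes must be in 0xA1-0xFE.
    static boolean isEucTwPair(int lead, int trail) {
        return lead >= 0xA1 && lead <= 0xFE
                && trail >= 0xA1 && trail <= 0xFE;
    }

    public static void main(String[] args) {
        int lead = 0xB2, trail = 0x7A; // the bytes from the error message
        System.out.println("Big5 pair:   " + isBig5Pair(lead, trail));
        System.out.println("EUC_TW pair: " + isEucTwPair(lead, trail));
    }
}
```

This only checks byte ranges; it does not show where in the JDBC or ODBC path the conversion is skipped.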
I know that Tomcat4 by default decodes request parameters as ISO-8859-1,
so I added the code
<%@ page contentType="text/html; charset=Big5"%>
<%
request.setCharacterEncoding("BIG5");
%>
to the JSP page, and dumped the actual SQL sent to the postgresql server
to make sure the SQL is correct. The dump is attached (please see the
attached file: offence1.zip).
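As a side note on the ISO-8859-1 default: that decoding maps each byte to one character, so a two-byte Big5 character turns into two unrelated Latin-1 characters unless the request encoding is set first. The sketch below shows this with the bytes 0xB2 0x7A and also shows that the raw bytes can be recovered by re-encoding as ISO-8859-1. The class and method names are made up for illustration.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class Latin1RoundTrip {
    // Re-encode a string that was wrongly decoded as ISO-8859-1
    // to get back the bytes the browser actually sent.
    static byte[] recoverRawBytes(String misdecoded) {
        return misdecoded.getBytes(StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        byte[] sent = { (byte) 0xB2, 0x7A }; // one Big5 character
        // What a container using the ISO-8859-1 default would produce.
        String misdecoded = new String(sent, StandardCharsets.ISO_8859_1);
        System.out.println("chars after wrong decode: " + misdecoded.length());
        System.out.println("bytes recovered intact:   "
                + Arrays.equals(sent, recoverRawBytes(misdecoded)));
    }
}
```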
>> 2. When inserting a record with xx chinese char, the SQL parser
>> reports something like 'Problem connecting to database:
>> java.sql.SQLException: ERROR: parser: parse error at or near
>> "4567891"' (similar in JDBC and ODBC), and the error 'unterminated
>> string' is reported when using psql.
>>
The character code is 0xc05c; its second byte, 0x5c, is the ASCII
backslash "\" (please see the attached file: offence2.zip).
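To make the second problem concrete, here is a minimal sketch of a byte-wise scanner that treats every 0x5c as an escape character, whether or not it is the trail byte of a multibyte character. It is not PostgreSQL's actual lexer, and the class and method names are invented for illustration. It only shows how such a scanner would swallow the closing quote after 0xc0 0x5c and report the string as unterminated.

```java
class Big5BackslashDemo {
    // Scan the body of a single-quoted literal byte by byte, starting just
    // after the opening quote. Returns the index of the closing quote, or
    // -1 if none is found. Doubled quotes ('') are not handled here.
    static int findClosingQuote(byte[] sql, int start) {
        for (int i = start; i < sql.length; i++) {
            if (sql[i] == '\\') {
                i++; // backslash seen as an escape: skip the next byte
            } else if (sql[i] == '\'') {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        // A quoted literal holding only the Big5 character 0xC0 0x5C.
        byte[] literal = { '\'', (byte) 0xC0, 0x5C, '\'' };
        System.out.println("second byte is backslash: " + (literal[2] == '\\'));
        System.out.println("closing quote index: " + findClosingQuote(literal, 1));
    }
}
```

A scanner that steps over whole multibyte characters before looking for "\" would not misread the trail byte this way.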
>> I've found that the problem exists from 7.1.x through 7.2.*.
Attachment | Content-Type | Size |
---|---|---|
offence1.zip | application/x-zip-compressed | 284 bytes |
offence2.zip | application/x-zip-compressed | 263 bytes |