Re: Invalid EUC_TW character sequence found

From: Gene Leung <gene(at)regaltronic(dot)com>
To: Tatsuo Ishii <t-ishii(at)sra(dot)co(dot)jp>
Cc: pgsql-bugs(at)postgresql(dot)org, gordon(at)gforce(dot)ods(dot)org
Subject: Re: Invalid EUC_TW character sequence found
Date: 2002-06-26 02:06:38
Message-ID: 3D1921AE.E61FFB71@regaltronic.com
Lists: pgsql-bugs

Hi Tatsuo,

Thanks for your quick response. Actually, I tried two ways (1. dump and
restore, 2. creating a new database in version 7.2.1), but both failed.

The first way was to dump a 7.0.2 database containing EUC_TW data:

        List of databases
   Database    |  Owner   | Encoding
---------------+----------+----------
 AccessControl | postgres | EUC_TW

The old database was created with the EUC_TW encoding. It works fine with
Chinese characters in version 7.0.2. However, when I followed the upgrade
instructions and restored the dump on my Red Hat 6.1 machine, it gave errors
such as "Invalid EUC_TW character sequence found".

Then I searched the newsgroup for "Invalid EUC_TW character sequence found"
and saw that Gordon Luk has the same problem as me. He is actually a friend
of mine. Originally I thought it might be a problem with the PostgreSQL that
comes pre-installed on Red Hat 7.3, so I decided to try the tar file and
installed it on Red Hat 6.1.

The second way, to confirm that version 7.2.1 cannot accept Chinese input,
was to create a new database with the following command:

CREATE DATABASE "test" WITH ENCODING = 'EUC_TW';

Then I ran create table site (name varchar(50)); and inserted data directly
with pgAdmin II, which gave the following error:

2002-06-26 09:22:28 - SQL (test): INSERT INTO "site" ("name") VALUES
('­»´ä¦r')

2002-06-26 09:22:28 -
*******************************************************************
2002-06-26 09:22:28 - Error
2002-06-26 09:22:28 -
*******************************************************************
2002-06-26 09:22:28 - Error in pgAdmin II:frmSQLOutput.cmdSave_Click:
-2147467259 - ERROR: Invalid EUC_TW character sequence found (0xa672)
2002-06-26 09:22:28 - Windows Version: Windows 2000 v5.0 build 2195 Service
Pack 2
2002-06-26 09:22:28 - pgSchema Version: 1.2.0
2002-06-26 09:22:28 - MDAC Version: 2.5
2002-06-26 09:22:28 - DBMS Version: 07.02.0001 PostgreSQL 7.2.1 on
i686-pc-linux-gnu, compiled by GCC egcs-2.91.66
2002-06-26 09:22:28 - Connection String (Master Connection):
Provider=MSDASQL.1;Extended
Properties="DRIVER={PostgreSQL};DATABASE=template1;SERVER=sql;PORT=5432;UID=harry;PWD=********;ReadOnly=0;Protocol=6.4;FakeOidIndex=0;ShowOidColumn=0;RowVersioning=0;ShowSystemTables=0;ConnSettings=;Fetch=100;Socket=4096;UnknownSizes=0;MaxVarcharSize=254;MaxLongVarcharSize=65536;Debug=0;CommLog=0;Optimizer=1;Ksqo=1;UseDeclareFetch=0;TextAsLongVarchar=1;UnknownsAsLongVarchar=1;BoolsAsChar=1;Parse=0;CancelAsFreeStmt=0;ExtraSysTablePrefixes=dd_;LFConversion=1;UpdatableCursors=1;DisallowPremature=0;TrueIsMinus1=0"
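
The byte pair reported in the error can be checked by hand. The following is only a rough sketch of a structural EUC_TW check, not the server's actual validation code. It assumes the usual EUC_TW layout (ASCII bytes, two-byte sequences with both bytes in 0xA1-0xFE, and four-byte sequences starting with 0x8E followed by a plane byte in 0xA1-0xB0), and it assumes that the six bytes below are what the client sent for the INSERT above. The function name is made up for illustration.

```python
def first_invalid_euc_tw(data: bytes):
    """Return (offset, offending_bytes) for the first sequence that does not
    fit the assumed EUC_TW layout, or None if every sequence fits."""
    i = 0
    while i < len(data):
        lead = data[i]
        if lead < 0x80:
            # Plain ASCII byte.
            i += 1
        elif lead == 0x8E:
            # Four-byte form: 0x8E, plane byte, then two bytes in 0xA1-0xFE.
            seq = data[i:i + 4]
            if (len(seq) == 4
                    and 0xA1 <= seq[1] <= 0xB0
                    and 0xA1 <= seq[2] <= 0xFE
                    and 0xA1 <= seq[3] <= 0xFE):
                i += 4
            else:
                return i, seq
        elif 0xA1 <= lead <= 0xFE:
            # Two-byte form: both bytes must be in 0xA1-0xFE.
            seq = data[i:i + 2]
            if len(seq) == 2 and 0xA1 <= seq[1] <= 0xFE:
                i += 2
            else:
                return i, seq
        else:
            # 0x80-0x8D and 0x8F-0xA0 never start a sequence in this layout.
            return i, data[i:i + 1]
    return None


# Assumed raw bytes of the inserted value, as displayed in the log above.
sample = bytes([0xAD, 0xBB, 0xB4, 0xE4, 0xA6, 0x72])

offset, seq = first_invalid_euc_tw(sample)
print(offset, seq.hex())  # 4 a672

# The same bytes are accepted by Python's Big5 codec.
text = sample.decode("big5")
```

Under these assumptions the first sequence that fails is 0xa672, the same value the server reports: the second byte 0x72 is below 0xA1, which is allowed as a trailing byte in Big5 but not in EUC_TW.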

If the coming version cannot support Chinese, it may be a big problem for a
lot of people. As database users, we do not have much knowledge about
encodings, and we have to rely on you guys. You have already done a lot of
good things for open source. Keep striving for the best.

Thanks!

Best Regards
Gene Leung

Tatsuo Ishii wrote:

> > -2147467259 - ERROR: Invalid EUC_TW character sequence found (0xa672)
>
> The error message says all. You had invalid data (maybe raw Big5
> data?) in your database.
>
> (1) If you are sure you have raw Big5 data in the old database,
> convert them to EUC_TW then load them.
>
> (2) If you have EUC_TW and Big5 mixed data, then you have a serious
> problem. You probably have to fix the dump data by hand.
> --
> Tatsuo Ishii
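
For case (2) in the quoted reply, a first step might be to find which lines of the dump are affected. The sketch below flags lines that do not fit the usual EUC_TW byte layout, using a bytes regular expression. The layout is an assumption (two-byte sequences in 0xA1-0xFE, four-byte sequences starting with 0x8E and a plane byte in 0xA1-0xB0), and the function name and sample lines are invented for illustration. A line that passes is not proven to be EUC_TW, because Big5 text whose trailing bytes all fall in 0xA1-0xFE has the same shape.

```python
import re

# One or more units, each being ASCII, a two-byte sequence, or a four-byte
# sequence in the assumed EUC_TW layout.
_EUC_TW_LINE = re.compile(
    rb"(?:[\x00-\x7f]"
    rb"|[\xa1-\xfe][\xa1-\xfe]"
    rb"|\x8e[\xa1-\xb0][\xa1-\xfe][\xa1-\xfe])*"
)


def suspect_lines(lines):
    """Yield (line_number, raw_line) for every line of bytes that does not
    fit the assumed EUC_TW layout from start to end."""
    for number, line in enumerate(lines, start=1):
        if _EUC_TW_LINE.fullmatch(line) is None:
            yield number, line


# Invented sample lines standing in for a dump file.
dump = [
    b"INSERT INTO site (name) VALUES ('abc');\n",
    b"\xa4\xa4\xa4\xe5\n",   # fits the two-byte layout
    b"\xa6\x72\n",           # second byte 0x72 is out of range
]

flagged = list(suspect_lines(dump))
print(flagged)
```

To scan a real dump, the same function can be given a file opened in binary mode, since iterating over such a file yields one bytes object per line.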
