Re: Multibyte still broken

From: Michael Robinson <robinson(at)netrinsics(dot)com>
To: robinson(at)netrinsics(dot)com, t-ishii(at)sra(dot)co(dot)jp
Cc: pgsql-hackers(at)hub(dot)org, tgl(at)sss(dot)pgh(dot)pa(dot)us
Subject: Re: Multibyte still broken
Date: 2000-05-11 17:56:15
Message-ID: 200005111756.BAA10220@netrinsics.com
Lists: pgsql-hackers

Tatsuo Ishii <t-ishii(at)sra(dot)co(dot)jp> writes:
>I am surprised to hear that you have such poor quality tools that
>produce illegal code sequences of Simplified Chinese. In Japan, as far
>as I know, we never have such low quality tools which generate
>illegal Japanese characters just because they are not accepted in the
>market, even in the case of email attachments, or cut-and-paste or
>whatever.

The problem is not that the tools produce "illegal characters". The problem
is that, as an EUC code, GB permits the coexistence of standard ASCII
characters with double-byte hanzi characters. Furthermore, most Chinese
software is an operating-system "hack" on top of English-language software
based on a Latin-1 character set (the Chinese software market is underserved
compared to Japan, so we have to cope as best we can).

The result is that it is possible to, for example, insert a carriage return
or ASCII comma into the middle of a hanzi, which breaks the alignment for all
the hanzi on the rest of the line. It's also possible, in non-native Chinese
applications, to select one byte of a hanzi character in a cut or copy
operation.

So the problem is that the tools do not uniformly respect the integrity of
a double-byte hanzi character, but rather treat it as two individual Latin-1
characters.
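
To make the misalignment concrete, here is a rough Python sketch. It is
only an illustration: the helper name split_euc_cn is made up for this
example, and it assumes the usual EUC layout for GB2312 (ASCII bytes below
0x80, each hanzi as two bytes in the range 0xA1-0xFE). It is not how any
particular tool or the backend scans text.

```python
def split_euc_cn(data: bytes):
    """Split bytes into (kind, chunk) tokens: 'ascii', 'hanzi', or 'broken'.

    Hypothetical helper; assumes GB2312 in EUC form, where a hanzi is two
    consecutive bytes that are both in 0xA1-0xFE.
    """
    tokens = []
    i = 0
    while i < len(data):
        b = data[i]
        if b < 0x80:
            # Plain single-byte ASCII.
            tokens.append(("ascii", data[i:i + 1]))
            i += 1
        elif (0xA1 <= b <= 0xFE and i + 1 < len(data)
              and 0xA1 <= data[i + 1] <= 0xFE):
            # Two high bytes in a row are read as one hanzi.
            tokens.append(("hanzi", data[i:i + 2]))
            i += 2
        else:
            # A high byte with no valid partner: half of a hanzi.
            tokens.append(("broken", data[i:i + 1]))
            i += 1
    return tokens


text = "中文".encode("gb2312")          # two hanzi, four bytes

# An ASCII comma typed into the middle of the first hanzi.
damaged = text[:1] + b"," + text[1:]

# A copy operation that selected only one byte of the second hanzi.
half_cut = text[:3]

for label, sample in (("intact", text), ("damaged", damaged),
                      ("half_cut", half_cut)):
    print(label, [kind for kind, _ in split_euc_cn(sample)])
```

In the damaged case, the second byte of the first hanzi pairs up with the
first byte of the second one, so the scanner reports a "hanzi" that was
never in the original text. That is the alignment shift for the rest of
the line.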

The important point, though, is that all tools, whether native Chinese or
"hacked" English, accept the resulting invalid code sequences consistently,
robustly, and without complaint.

-Michael
