Re: client libpq multibyte support

From:	Tatsuo Ishii <t-ishii(at)sra(dot)co(dot)jp>
To:	peter_e(at)gmx(dot)net, e99re41(at)DoCS(dot)UU(dot)SE
Cc:	tgl(at)sss(dot)pgh(dot)pa(dot)us, sakaida(at)psn(dot)co(dot)jp, pgsql-hackers(at)postgresql(dot)org
Subject:	Re: client libpq multibyte support
Date:	2000-05-05 10:36:18
Message-ID:	20000505193618U.t-ishii@sra.co.jp
Lists:	pgsql-hackers

> > That's because a non-MB client does not understand that "Shift JIS
> > kanji" consists of characters with different byte widths. A similar
> > problem would also happen with the Big5 character set (traditional
> > Chinese). Unlike other character sets, these must be treated
> > carefully, since they include the same bit patterns as ASCII, and that
> > confuses non-MB clients.
>
> I'm confused though, this would mean that somewhere in the string
> `SJIS_KANJI' a backslash was found. But that's all ASCII characters.
> Aren't the characters 0-127 always identical in any character set?

Not always. Shift JIS and Big5 use bytes in the 0-127 range inside
multibyte characters. So "how to distinguish them from ASCII?", you
might ask. Here are the rules for Shift JIS:

1. parse from the beginning byte of the string in question. If it is
0-127 then it's ASCII (a single byte letter).

2. if it's between 0xa1 and 0xdf, it's a "1 byte kana" (single byte
letter).

3. otherwise it's a "kanji" (double byte letter). In this case the
second byte might be in the range 0-127 (this is the source of the
problem).
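
The three rules above can be sketched roughly as follows. This is only
an illustration, not code from libpq or psql: the function names
iter_sjis_chars and real_backslashes are made up for this example, and
rule 3 is applied exactly as stated (any other lead byte is assumed to
start a two byte character, with no validation).

```python
def iter_sjis_chars(data: bytes):
    """Yield (offset, char_bytes) pairs, walking from the first byte."""
    i = 0
    while i < len(data):
        b = data[i]
        if b <= 0x7F:
            width = 1          # rule 1: ASCII, single byte
        elif 0xA1 <= b <= 0xDF:
            width = 1          # rule 2: 1 byte kana, single byte
        else:
            width = 2          # rule 3: kanji, double byte
        # A truncated trailing kanji yields only the bytes that exist.
        yield i, data[i:i + width]
        i += width


def real_backslashes(data: bytes):
    """Offsets of 0x5C bytes that are characters of their own."""
    return [off for off, ch in iter_sjis_chars(data) if ch == b"\\"]


# 0x95 0x5C is a single Shift JIS kanji whose second byte equals the
# ASCII backslash; it is followed here by "a" and a real backslash.
sample = b"\x95\x5c" + b"a" + b"\\"
naive_hits = sample.count(b"\\")       # byte-wise scan, as a non-MB client does
aware_hits = real_backslashes(sample)  # scan that follows the three rules
```

A byte-wise scan sees two backslashes in the sample, while the scan
that follows the rules sees only the one at the end. That difference
is what confuses a non-MB client.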

I think Big5 has a similar but slightly different rule (I don't
remember it precisely now).

Other encodings that contain bytes in the 0-127 range which are not
ASCII characters include:

o UCS-2 and UCS-4 (Unicode)

o any 7-bit encoded ISO 2022 based charset, for example ISO-2022-JP.
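
To make the two cases above concrete, here is a small sketch using
Python's standard codecs. It assumes that "utf-16-be" can stand in for
UCS-2 (the two agree for characters in the basic plane) and that
"iso2022_jp" matches the ISO-2022-JP mentioned above. The helper name
ascii_range_bytes is invented for this example.

```python
def ascii_range_bytes(encoded: bytes):
    """Return the bytes of an encoded string that fall in 0-127."""
    return [b for b in encoded if b <= 0x7F]


# U+5C71 is a kanji; as big-endian UCS-2 its first byte is 0x5C,
# the same value as the ASCII backslash.
ucs2 = "\u5c71".encode("utf-16-be")

# U+672C is a kanji; ISO-2022-JP wraps it in escape sequences and
# encodes it with 7-bit bytes only.
iso2022 = "\u672c".encode("iso2022_jp")

ucs2_low = ascii_range_bytes(ucs2)
iso2022_low = ascii_range_bytes(iso2022)
```

In both cases the low bytes belong to a non-ASCII character, so a
client that scans byte by byte for a backslash or a quote can match in
the middle of a character.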
--
Tatsuo Ishii
