| From: | "PostgreSQL Bugs List" <pgsql-bugs(at)postgresql(dot)org> |
|---|---|
| To: | pgsql-bugs(at)postgresql(dot)org |
| Subject: | BUG #1091: Localization in EUC_TW Can't decode Big5 0xFA40--0xFEF0. |
| Date: | 2004-03-04 02:08:47 |
| Message-ID: | 20040304020847.E10A2CF4D3A@www.postgresql.com |
| Lists: | pgsql-bugs |
The following bug has been logged online:
Bug reference: 1091
Logged by: yychen
Email address: yychen(at)mail(dot)clhs(dot)tyc(dot)edu(dot)tw
PostgreSQL version: 7.4
Operating system: MS-WIN2000(Run With TAIWAN Big5)
Description: Localization in EUC_TW Can't decode Big5 0xFA40--0xFEF0.
Details:
In localization (database):
When I save a string containing Big5 characters in the range 0xFA40-0xFEF0
to a database (encoded as EUC_TW or UNICODE) and then read it back,
PostgreSQL can't decode these characters.
According to: ftp://ftp.ora.com/pub/examples/nutshell/ujip/doc/cjk.inf.
3.3.4: BIG FIVE
Big Five is the encoding system used on machines that support
MS-DOS or Windows, and also for Macintosh (such as the Chinese
Language Kit or the fully-localized operating system).
Two-byte Standard Characters    Encoding Ranges
^^^^^^^^^^^^^^^^^^^^^^^^^^^^    ^^^^^^^^^^^^^^^
first byte range                0xA1-0xFE
second byte ranges              0x40-0x7E, 0xA1-0xFE

One-byte Characters             Encoding Range
^^^^^^^^^^^^^^^^^^^             ^^^^^^^^^^^^^^
ASCII                           0x21-0x7E
The encoding used on Macintosh is quite similar to the above,
but has a slightly shortened two-byte range (second byte range up to
0xFC only) plus additional one-byte code points, namely 0x80
(backslash), 0xFD ("copyright" symbol: "c" in a circle), 0xFE
("trademark" symbol: "TM" as a superscript), and 0xFF ("ellipsis"
symbol: three dots).
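
As a rough illustration of the ranges quoted above, here is a minimal Python sketch that checks whether a two-byte code falls inside the standard Big Five ranges. The function names `split_code` and `is_big5_two_byte` are made up for this example and are not part of PostgreSQL. The sketch only tests byte ranges; it does not say whether a code point is assigned or whether a conversion to EUC_TW exists, and it ignores the Macintosh variant.

```python
def split_code(code):
    """Split a 16-bit code such as 0xFA40 into (first byte, second byte)."""
    return code >> 8, code & 0xFF


def is_big5_two_byte(code):
    """Return True if both bytes lie in the quoted two-byte ranges.

    first byte:  0xA1-0xFE
    second byte: 0x40-0x7E or 0xA1-0xFE
    """
    first, second = split_code(code)
    first_ok = 0xA1 <= first <= 0xFE
    second_ok = 0x40 <= second <= 0x7E or 0xA1 <= second <= 0xFE
    return first_ok and second_ok


# Both ends of the range named in the report pass the range check.
for code in (0xFA40, 0xFEF0):
    print(hex(code), is_big5_two_byte(code))
```

By this check, both 0xFA40 and 0xFEF0 are structurally valid Big Five two-byte sequences, which is consistent with the report's expectation that they should be decodable.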