From: | Han Parker <parker(dot)han(at)outlook(dot)com> |
---|---|
To: | Tatsuo Ishii <ishii(at)sraoss(dot)co(dot)jp>, "tgl(at)sss(dot)pgh(dot)pa(dot)us" <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | "pgsql-general(at)postgresql(dot)org" <pgsql-general(at)postgresql(dot)org> |
Subject: | Re: Re: May "PostgreSQL server side GB18030 character set support" reconsidered? |
Date: | 2020-10-06 03:13:06 |
Message-ID: | ME2PR01MB2532B5E0BA83A3AD5C2B80988A0D0@ME2PR01MB2532.ausprd01.prod.outlook.com |
Lists: | pgsql-general |
________________________________
From: Tatsuo Ishii <ishii(at)sraoss(dot)co(dot)jp>
Sent: October 6, 2020 2:15
To: tgl(at)sss(dot)pgh(dot)pa(dot)us <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: parker(dot)han(at)outlook(dot)com <parker(dot)han(at)outlook(dot)com>; pgsql-general(at)postgresql(dot)org <pgsql-general(at)postgresql(dot)org>
Subject: Re: Re: May "PostgreSQL server side GB18030 character set support" reconsidered?
> Hmm ... interesting idea, basically invent our own modified version
> of GB18030 (or SJIS?) for backend-internal storage. But I'm not
> sure how to make it work without enlarging the string, which'd defeat
> the OP's argument. It looks to me like the second-byte code space is
> already pretty full in both encodings.
>But as he already admitted, actually GB18030 is 4 byte encoding, rather
>than 2 bytes. So maybe we could find a way to map original GB18030 to
>ASCII-safe GB18030 using 4 bytes.
>
>As for SJIS, no big demand for the encoding in Japan these days. So I
>think we can leave it as it is.
>
>Best regards,
>--
>Tatsuo Ishii
>SRA OSS, Inc. Japan
>English: http://www.sraoss.co.jp/index_en.php
>Japanese:http://www.sraoss.co.jp
So the key is a simple ASCII-safe mapping algorithm for GB18030 (perhaps abbreviated "GB18030as", for GB18030_ascii_safe?) that does not break ASCII safety while still saving a lot of storage (the ASCII-safe GB2312 already contains the 6763 most frequently used characters).
In fact, it was GBK, designed by Microsoft, that broke ASCII safety around 1995, as Win95 became popular. GB18030 later inherited the problem because it had to be compatible with GBK.
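To make the "ASCII-safe" distinction concrete, here is a minimal Python sketch. It assumes that "ASCII-safe" means no byte of a multibyte character falls below 0x80; the helper name `is_ascii_safe` is hypothetical, and the sample characters are my own picks, using Python's built-in `gb2312`, `gbk`, and `gb18030` codecs.

```python
def is_ascii_safe(text: str, encoding: str) -> bool:
    """Return True if no non-ASCII character in `text` encodes to a byte below 0x80."""
    for ch in text:
        if ord(ch) < 0x80:
            continue  # plain ASCII is a single byte in the encodings discussed here
        if any(b < 0x80 for b in ch.encode(encoding)):
            return False  # a byte of this multibyte character collides with ASCII
    return True


# GB2312 (EUC-CN): both bytes of a two-byte character are in 0xA1-0xFE.
print("中".encode("gb2312").hex(), is_ascii_safe("中", "gb2312"))

# GBK: the trail byte may fall in 0x40-0x7E, which overlaps ASCII.
print("丂".encode("gbk").hex(), is_ascii_safe("丂", "gbk"))

# GB18030 four-byte form: the 2nd and 4th bytes are ASCII digits (0x30-0x39).
print("\U00020000".encode("gb18030").hex(), is_ascii_safe("\U00020000", "gb18030"))
```

Any mapping such as the "GB18030as" idea above would need every multibyte sequence it produces to pass a check of this kind.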
Thanks.
I will try to find out whether the community of GB18030 standard maintainers has any opinions on "a simple ASCII-safe mapping algorithm for GB18030".