Re: [PATCHES] UNICODE characters above 0x10000

From:	Tatsuo Ishii <t-ishii(at)sra(dot)co(dot)jp>
To:	oliver(at)opencloud(dot)com
Cc:	tgl(at)sss(dot)pgh(dot)pa(dot)us, db(at)zigo(dot)dhs(dot)org, john(at)geeknet(dot)com(dot)au, pgsql-hackers(at)postgresql(dot)org, pgsql-patches(at)postgresql(dot)org
Subject:	Re: [PATCHES] UNICODE characters above 0x10000
Date:	2004-08-08 02:17:59
Message-ID:	20040808.111759.02304017.t-ishii@sra.co.jp
Lists:	pgsql-hackers pgsql-patches

> Tom Lane wrote:
>
> > If I understood what I was reading, this would take several things:
> > * Remove the "special UTF-8 check" in pg_verifymbstr;
> > * Extend pg_utf2wchar_with_len and pg_utf_mblen to handle the 4-byte case;
> > * Set maxmblen to 4 in the pg_wchar_table[] entry for UTF-8.
> >
> > Are there any other places that would have to change? Would this break
> > anything? The testing aspect is what's bothering me at the moment.
>
> Does this change what client_encoding = UNICODE might produce? The JDBC
> driver will need some tweaking to handle this -- Java uses UTF-16
> internally and I think some supplementary character (?) scheme for
> values above 0xffff as of JDK 1.5.
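
The "4-byte case" in Tom Lane's list above can be illustrated with a minimal sketch. This is not the PostgreSQL source: `utf8_seq_len` and `utf8_decode` are hypothetical names chosen for illustration, and the sketch checks only the lead byte, the continuation bytes, and the 0x10FFFF upper bound.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not pg_utf_mblen): length in bytes of a UTF-8
 * sequence, judged from its lead byte; 0 if the byte cannot start one. */
static int utf8_seq_len(unsigned char lead)
{
    if ((lead & 0x80) == 0x00) return 1;   /* 0xxxxxxx */
    if ((lead & 0xE0) == 0xC0) return 2;   /* 110xxxxx */
    if ((lead & 0xF0) == 0xE0) return 3;   /* 1110xxxx */
    if ((lead & 0xF8) == 0xF0) return 4;   /* 11110xxx: above 0xFFFF */
    return 0;
}

/* Hypothetical helper (not pg_utf2wchar_with_len): decode one sequence
 * from s (avail bytes readable) into *cp. Returns the number of bytes
 * consumed, or 0 on error. Overlong forms and surrogate code points are
 * NOT rejected here; a real verifier would need those checks too. */
static int utf8_decode(const unsigned char *s, size_t avail, uint32_t *cp)
{
    static const unsigned char lead_mask[5] = {0, 0x7F, 0x1F, 0x0F, 0x07};
    int len, i;
    uint32_t c;

    if (avail == 0)
        return 0;
    len = utf8_seq_len(s[0]);
    if (len == 0 || (size_t) len > avail)
        return 0;

    c = s[0] & lead_mask[len];
    for (i = 1; i < len; i++)
    {
        if ((s[i] & 0xC0) != 0x80)         /* must be 10xxxxxx */
            return 0;
        c = (c << 6) | (uint32_t) (s[i] & 0x3F);
    }
    if (c > 0x10FFFF)
        return 0;
    *cp = c;
    return len;
}
```

With these definitions, the bytes F0 90 80 80 decode to 0x10000, the first code point that needs four bytes.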

Java doesn't handle UCS characters above 0xffff? I didn't know that. As long as
the data goes in and out only through JDBC, it shouldn't be a problem. However,
if other APIs put in such data, you will get into trouble...
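
For reference, UTF-16 represents a code point above 0xffff as a pair of 16-bit surrogate units. The arithmetic can be sketched as follows; `utf16_encode` is a hypothetical name used only for illustration and is not taken from Java or the JDBC driver.

```c
#include <stdint.h>

/* Hypothetical helper: encode one code point as UTF-16.
 * Writes 1 or 2 units into out and returns how many were written,
 * or 0 if cp is not a valid Unicode scalar value. */
static int utf16_encode(uint32_t cp, uint16_t out[2])
{
    if (cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
        return 0;                                 /* out of range or a lone surrogate */
    if (cp < 0x10000)
    {
        out[0] = (uint16_t) cp;                   /* fits in a single unit */
        return 1;
    }
    cp -= 0x10000;                                /* 20 bits remain */
    out[0] = (uint16_t) (0xD800 + (cp >> 10));    /* high surrogate: top 10 bits */
    out[1] = (uint16_t) (0xDC00 + (cp & 0x3FF));  /* low surrogate: bottom 10 bits */
    return 2;
}
```

For example, 0x10000 becomes the pair D800 DC00, so a string holding such a character is one unit longer than its character count suggests.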
--
Tatsuo Ishii

In response to

Re: [PATCHES] UNICODE characters above 0x10000 at 2004-08-08 00:14:33 from Oliver Jowett

Responses

Re: [PATCHES] UNICODE characters above 0x10000 at 2004-08-08 02:35:54 from Oliver Jowett
