Re: UNICODE characters above 0x10000

From:	Dennis Bjorklund <db(at)zigo(dot)dhs(dot)org>
To:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc:	John Hansen <john(at)geeknet(dot)com(dot)au>, Hackers <pgsql-hackers(at)postgresql(dot)org>, Patches <pgsql-patches(at)postgresql(dot)org>
Subject:	Re: UNICODE characters above 0x10000
Date:	2004-08-07 07:01:37
Message-ID:	Pine.LNX.4.44.0408070851550.9559-100000@zigo.dhs.org
Lists:	pgsql-hackers pgsql-patches

On Sat, 7 Aug 2004, Tom Lane wrote:

> question at hand is whether we can support 32-bit characters or not ---
> and if not, what's the next bug to fix?

True, and that's hard to answer outright. One could do some simple
testing, make sure regexps work, and then treat anything else that turns
out not to work as a bug to be fixed when it is found.

The alternative is to inspect all code paths that involve strings, not fun
at all :-)

My previous mail talked about UTF-8 translation. Not all characters that
can be formed using UTF-8 are assigned by the Unicode org. However, the
part that interprets the Unicode strings is in the OS, so different OSes
can give different results. So I think pg should just accept even 6-byte
UTF-8 sequences, even if some characters are not currently assigned.
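
To make the 6-byte case concrete, here is a minimal sketch in Python of a
permissive decoder that follows the original UTF-8 layout (RFC 2279), where
the lead byte alone determines a sequence length of 1 to 6 bytes. It is not
PostgreSQL's code; the names utf8_seq_len and decode_permissive are made up
for illustration. It deliberately does not check whether a code point is
assigned, and it does not reject overlong encodings, which a real validator
would probably want to do. Note also that RFC 3629 restricts UTF-8 to 4-byte
sequences (code points up to U+10FFFF), so this is more lenient than the
current standard.

```python
def utf8_seq_len(lead):
    """Sequence length implied by a lead byte (RFC 2279 layout), 0 if invalid."""
    if lead < 0x80:
        return 1              # 0xxxxxxx: plain ASCII
    if 0xC0 <= lead <= 0xDF:
        return 2              # 110xxxxx
    if 0xE0 <= lead <= 0xEF:
        return 3              # 1110xxxx
    if 0xF0 <= lead <= 0xF7:
        return 4              # 11110xxx: needed for characters above 0xFFFF
    if 0xF8 <= lead <= 0xFB:
        return 5              # 111110xx
    if 0xFC <= lead <= 0xFD:
        return 6              # 1111110x
    return 0                  # continuation byte (0x80-0xBF) or 0xFE/0xFF


def decode_permissive(data):
    """Decode bytes into a list of code points, accepting 1- to 6-byte forms.

    Assumption: unassigned code points are accepted as-is; only the byte
    structure (lead byte plus continuation bytes) is validated.
    """
    code_points = []
    i = 0
    while i < len(data):
        lead = data[i]
        n = utf8_seq_len(lead)
        if n == 0 or i + n > len(data):
            raise ValueError("invalid or truncated sequence at byte %d" % i)
        # Keep only the payload bits of the lead byte.
        cp = lead if n == 1 else lead & (0x7F >> n)
        for b in data[i + 1:i + n]:
            if b & 0xC0 != 0x80:
                raise ValueError("bad continuation byte in sequence at byte %d" % i)
            cp = (cp << 6) | (b & 0x3F)   # each continuation byte adds 6 bits
        code_points.append(cp)
        i += n
    return code_points
```

For example, decode_permissive(b"\xf0\x90\x80\x80") yields the single code
point 0x10000, the first one that no longer fits in 16 bits, and the largest
6-byte sequence, b"\xfd\xbf\xbf\xbf\xbf\xbf", yields 0x7FFFFFFF.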

--
/Dennis Björklund

In response to

Re: UNICODE characters above 0x10000 at 2004-08-07 06:49:06 from Tom Lane
