
Re: UNICODE characters above 0x10000

From:	Christopher Kings-Lynne <chriskl(at)familyhealth(dot)com(dot)au>
To:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc:	John Hansen <john(at)geeknet(dot)com(dot)au>, Hackers <pgsql-hackers(at)postgresql(dot)org>, Patches <pgsql-patches(at)postgresql(dot)org>
Subject:	Re: UNICODE characters above 0x10000
Date:	2004-08-07 10:47:07
Message-ID:	4114B32B.8080509@familyhealth.com.au
Lists:	pgsql-hackers pgsql-patches

> Now it's entirely possible that the underlying support is a few bricks
> shy of a load --- for instance I see that pg_utf_mblen thinks there are
> no UTF8 codes longer than 3 bytes whereas your code goes to 4. I'm not
> an expert on this stuff, so I don't know what the UTF8 spec actually
> says. But I do think you are fixing the code at the wrong level.

Surely there are UTF-8 codes that are at least 3 bytes. I have a
_vague_ recollection that you have to keep escaping and escaping to get
up to like 4 bytes for some Asian code points?

Chris
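
For reference, the 3-byte versus 4-byte question raised in this exchange can be checked with a short sketch. It derives the sequence length from the lead byte under the RFC 3629 rules, where code points up to U+FFFF take at most 3 bytes and code points from U+10000 to U+10FFFF take 4. The function name `utf8_sequence_length` is hypothetical and chosen for illustration; this is not PostgreSQL's `pg_utf_mblen`, only an approximation of what such a length routine has to decide.

```python
def utf8_sequence_length(lead_byte: int) -> int:
    """Return the length in bytes of the UTF-8 sequence that starts with lead_byte.

    Hypothetical helper for illustration; follows the RFC 3629 lead-byte ranges.
    """
    if 0x00 <= lead_byte <= 0x7F:
        return 1  # ASCII, U+0000..U+007F
    if 0xC2 <= lead_byte <= 0xDF:
        return 2  # U+0080..U+07FF
    if 0xE0 <= lead_byte <= 0xEF:
        return 3  # U+0800..U+FFFF (rest of the Basic Multilingual Plane)
    if 0xF0 <= lead_byte <= 0xF4:
        return 4  # U+10000..U+10FFFF (supplementary planes)
    # 0x80..0xBF are continuation bytes; 0xC0, 0xC1 and 0xF5..0xFF never
    # appear in well-formed UTF-8.
    raise ValueError(f"0x{lead_byte:02X} is not a valid UTF-8 lead byte")


if __name__ == "__main__":
    # One sample per length class; U+20000 is a CJK ideograph outside the BMP.
    for ch in ["A", "\u00e9", "\u20ac", "\U00010000", "\U00020000"]:
        encoded = ch.encode("utf-8")
        print(f"U+{ord(ch):04X}: {utf8_sequence_length(encoded[0])} byte(s), {encoded.hex()}")
```

Most commonly used CJK characters sit in the Basic Multilingual Plane and encode to 3 bytes; the 4-byte form is only needed for code points at or above U+10000, which is the range the subject line refers to.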

In response to

Re: UNICODE characters above 0x10000 at 2004-08-07 05:06:30 from Tom Lane

