Re: UNICODE characters above 0x10000

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	"John Hansen" <john(at)geeknet(dot)com(dot)au>
Cc:	"Hackers" <pgsql-hackers(at)postgresql(dot)org>, "Patches" <pgsql-patches(at)postgresql(dot)org>
Subject:	Re: UNICODE characters above 0x10000
Date:	2004-08-07 05:06:30
Message-ID:	26451.1091855190@sss.pgh.pa.us
Lists:	pgsql-hackers pgsql-patches

"John Hansen" <john(at)geeknet(dot)com(dot)au> writes:
> My apologies for not reading the code properly.

> Attached patch using pg_utf_mblen() instead of an indexed table.
> It now also does bounds checks.

I think you missed my point. If we don't need this limitation, the
correct patch is simply to delete the whole check (i.e., delete lines
827-836 of wchar.c, and for that matter we'd then not need the encoding
local variable). What's really at stake here is whether anything else
breaks if we do that. What else, if anything, assumes that UTF
characters are not more than 2 bytes?

Now it's entirely possible that the underlying support is a few bricks
shy of a load --- for instance I see that pg_utf_mblen thinks there are
no UTF8 codes longer than 3 bytes whereas your code goes to 4. I'm not
an expert on this stuff, so I don't know what the UTF8 spec actually
says. But I do think you are fixing the code at the wrong level.
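
For reference, a minimal sketch of how the byte length of a UTF-8 sequence
can be read off its lead byte, assuming the RFC 3629 definition (at most
4 bytes per character, which is what code points at or above 0x10000
need). The name utf8_seq_len is hypothetical; this is not the
pg_utf_mblen code in wchar.c, only an illustration of the 3-byte versus
4-byte question.

```c
/*
 * Hypothetical helper (not PostgreSQL's pg_utf_mblen): classify a UTF-8
 * lead byte by its high bits and return the sequence length in bytes.
 * Returns 0 for bytes that cannot start a sequence (continuation bytes
 * 10xxxxxx and 11111xxx patterns).
 *
 * Assumption: RFC 3629 UTF-8, i.e. at most 4 bytes per character.
 * This only looks at the bit pattern; it does not reject overlong
 * forms (0xC0, 0xC1) or lead bytes beyond U+10FFFF (0xF5..0xF7).
 */
int
utf8_seq_len(unsigned char lead)
{
	if (lead < 0x80)
		return 1;				/* 0xxxxxxx: ASCII */
	if ((lead & 0xE0) == 0xC0)
		return 2;				/* 110xxxxx */
	if ((lead & 0xF0) == 0xE0)
		return 3;				/* 1110xxxx */
	if ((lead & 0xF8) == 0xF0)
		return 4;				/* 11110xxx: code points >= 0x10000 */
	return 0;					/* not a valid lead byte */
}
```

Under that assumption, U+10000 is encoded as the four bytes F0 90 80 80,
so a length function that stops at 3 would misread its lead byte.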

regards, tom lane

In response to

Re: UNICODE characters above 0x10000 at 2004-08-07 03:04:21 from John Hansen

Responses

Re: UNICODE characters above 0x10000 at 2004-08-07 05:55:44 from Oliver Elphick
Re: UNICODE characters above 0x10000 at 2004-08-07 06:27:31 from Dennis Bjorklund
Re: UNICODE characters above 0x10000 at 2004-08-07 10:47:07 from Christopher Kings-Lynne
