Re: [rfc] unicode escapes for extended strings

From:	Andrew Dunstan <andrew(at)dunslane(dot)net>
To:	Marko Kreen <markokr(at)gmail(dot)com>
Cc:	"tomas(at)tuxteam(dot)de" <tomas(at)tuxteam(dot)de>, Peter Eisentraut <peter_e(at)gmx(dot)net>, Postgres Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: [rfc] unicode escapes for extended strings
Date:	2009-09-25 12:37:50
Message-ID:	4ABCB99E.6@dunslane.net
Lists:	pgsql-hackers

Marko Kreen wrote:
> On 9/25/09, tomas(at)tuxteam(dot)de <tomas(at)tuxteam(dot)de> wrote:
>
>> On Thu, Sep 24, 2009 at 09:42:32PM +0300, Peter Eisentraut wrote:
>> > Good idea. This could also check for other invalid things like
>> > byte-order marks in UTF-8.
>>
>> But watch out. Microsoft apps do like to insert a BOM at the beginning
>> of the text. Not that I think it's a good idea, but the Unicode folks
>> seem to think it's OK [1] :-(
>>
>
> As a BOM does not actively break transport layers, it's less clear-cut
> whether to reject it. It could be said that a BOM at the start of a string
> is OK. A BOM in the middle of a string is more rejectable. But it will
> only confuse some high-level character counters, not low-level encoders.
>
>

It seems pretty clear from the URL that Tomas posted that we should not
treat a BOM specially at all, but simply treat it as another Unicode
character.
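
To make the "just another character" reading concrete, here is a minimal
Python sketch (not PostgreSQL code; the helper names decode_plain and
count_boms are made up for illustration). It assumes the BOM in question is
U+FEFF encoded in UTF-8 as the bytes EF BB BF. Python's plain "utf-8" codec
keeps it as an ordinary code point, while the separate "utf-8-sig" codec is
the one that strips it, and only at the very start of the input.

```python
# Sketch only: shows U+FEFF surviving a plain UTF-8 decode as a normal
# character, whether it appears at the start or in the middle of a string.

BOM = "\ufeff"  # U+FEFF, encoded in UTF-8 as EF BB BF


def decode_plain(data: bytes) -> str:
    """Decode UTF-8 without any special handling of a BOM (hypothetical helper)."""
    return data.decode("utf-8")


def count_boms(text: str) -> int:
    """Count U+FEFF occurrences anywhere in the text (hypothetical helper)."""
    return text.count(BOM)


leading = b"\xef\xbb\xbfabc"   # BOM at the start
middle = b"ab\xef\xbb\xbfc"    # BOM in the middle

for raw in (leading, middle):
    text = decode_plain(raw)
    # The encoder/decoder round-trips the bytes unchanged; only a
    # character count is affected by the extra code point.
    print(len(text), count_boms(text), text.encode("utf-8") == raw)

# The BOM-stripping behaviour is opt-in and applies to a leading BOM only.
print(leading.decode("utf-8-sig") == "abc")
print(middle.decode("utf-8-sig") == "ab" + BOM + "c")
```

This matches Marko's observation above: the low-level encoding round-trips
either way, and the only visible effect is that a character counter sees one
more code point than a reader would expect.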

cheers

andrew

In response to

Re: [rfc] unicode escapes for extended strings at 2009-09-25 09:27:41 from Marko Kreen
