From: | Andrew Dunstan <andrew(at)dunslane(dot)net> |
---|---|
To: | Marko Kreen <markokr(at)gmail(dot)com> |
Cc: | "tomas(at)tuxteam(dot)de" <tomas(at)tuxteam(dot)de>, Peter Eisentraut <peter_e(at)gmx(dot)net>, Postgres Hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: [rfc] unicode escapes for extended strings |
Date: | 2009-09-25 12:37:50 |
Message-ID: | 4ABCB99E.6@dunslane.net |
Lists: | pgsql-hackers |
Marko Kreen wrote:
> On 9/25/09, tomas(at)tuxteam(dot)de <tomas(at)tuxteam(dot)de> wrote:
>
>> On Thu, Sep 24, 2009 at 09:42:32PM +0300, Peter Eisentraut wrote:
>> > Good idea. This could also check for other invalid things like
>> > byte-order marks in UTF-8.
>>
>> But watch out. Microsoft apps do like to insert a BOM at the beginning
>> of the text. Not that I think it's a good idea, but the Unicode folks
>> seem to think it's OK [1] :-(
>>
>
> As a BOM does not actively break transport layers, it's less clear-cut
> whether to reject it. It could be said that a BOM at the start of a string
> is OK. A BOM in the middle of a string is more rejectable. But it will
> only confuse some high-level character counters, not low-level encoders.
>
>
It seems pretty clear from the URL that Tomas posted that we should not
treat a BOM specially at all, and just treat it as another Unicode char.
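
To illustrate what "just another Unicode char" would mean in practice, here is a
minimal sketch in Python (not PostgreSQL code; the helper name utf8_roundtrip is
made up for this example). It assumes the BOM in question is U+FEFF, which
encodes to the bytes EF BB BF in UTF-8, and shows that a plain UTF-8 codec
passes it through unchanged whether it is at the start or in the middle of a
string. Python's separate "utf-8-sig" codec is shown only as a contrast: that
is what special-casing a leading BOM looks like.

```python
# U+FEFF: the byte-order mark (also ZERO WIDTH NO-BREAK SPACE).
BOM = "\ufeff"


def utf8_roundtrip(text: str) -> str:
    """Encode to UTF-8 and decode again with no BOM special-casing.

    Hypothetical helper for illustration only.
    """
    return text.encode("utf-8").decode("utf-8")


leading = BOM + "abc"      # BOM at the start of the string
middle = "ab" + BOM + "c"  # BOM in the middle of the string

# The plain codec encodes U+FEFF like any other code point.
bom_bytes = BOM.encode("utf-8")

# Both strings survive the round trip unchanged.
leading_ok = utf8_roundtrip(leading) == leading
middle_ok = utf8_roundtrip(middle) == middle

# Contrast: "utf-8-sig" strips one leading BOM on decode, i.e. it
# treats the BOM specially instead of as ordinary data.
stripped = leading.encode("utf-8").decode("utf-8-sig")

# A naive code-point count includes the BOM, which is the
# "high-level character counter" confusion mentioned above.
middle_length = len(middle)
```

With the plain codec nothing is rejected or removed; only the code-point count
(4 for "ab" + BOM + "c") reflects that the BOM is there.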
cheers
andrew