From: | "Martin Gainty" <mgainty(at)hotmail(dot)com> |
---|---|
To: | <pgsql-general(at)postgresql(dot)org> |
Subject: | Re: pgsql cannot read utf8 files moved from windows correctly! |
Date: | 2007-12-23 16:00:52 |
Message-ID: | BAY108-DAV1359B088D767B65289388BAE580@phx.gbl |
Lists: | pgsql-general |
It seems the use of a BOM in UTF-8 is permitted but discouraged:
http://unicode.org/faq/utf_bom.html#BOM
For reference, the byte signatures are:
EF BB BF is the UTF-8 BOM
FF FE is UTF-16 little-endian
FE FF is UTF-16 big-endian
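A minimal sketch of checking for these signatures and stripping them before loading a file, assuming Python is available. It also tests for EF BB BF, the UTF-8 form of the BOM. The helper names strip_bom and to_clean_text are made up for illustration, and UTF-32 is deliberately not handled (its little-endian BOM also starts with FF FE).

```python
import codecs

# Byte-order marks from the standard library's codecs module.
BOMS = [
    (codecs.BOM_UTF8, "utf-8"),          # EF BB BF
    (codecs.BOM_UTF16_LE, "utf-16-le"),  # FF FE
    (codecs.BOM_UTF16_BE, "utf-16-be"),  # FE FF
]


def strip_bom(data: bytes):
    """Return (encoding_guess, data_without_bom).

    encoding_guess is None when the data does not start with a known BOM.
    UTF-32 is not considered here (an assumption of this sketch).
    """
    for bom, name in BOMS:
        if data.startswith(bom):
            return name, data[len(bom):]
    return None, data


def to_clean_text(data: bytes) -> str:
    """Decode the data, drop a leading BOM, and turn CRLF into LF.

    Falls back to UTF-8 when no BOM is present (an assumption).
    """
    encoding, body = strip_bom(data)
    text = body.decode(encoding or "utf-8")
    return text.replace("\r\n", "\n")
```

Running to_clean_text over the raw bytes of a file saved on Windows should give text with no leading signature and Unix line endings, which can then be written back out as plain UTF-8 before loading.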
Please verify.
Thanks,
Martin
----- Original Message -----
From: "Trevor Talbot" <quension(at)gmail(dot)com>
To: <pgsql-general(at)postgresql(dot)org>
Sent: Sunday, December 23, 2007 10:39 AM
Subject: Re: [GENERAL] pgsql cannot read utf8 files moved from windows correctly!
> On 12/20/07, Martijn van Oosterhout <kleptog(at)svana(dot)org> wrote:
> > On Tue, Dec 18, 2007 at 02:53:16PM +0800, bookman bookman wrote:
>
> > > I know that every line of utf8 files is started with "fffe" or "feff"
> > > and ended with "\r\n" in windows but not in linux,so the character
> > > "1" has a space before it in the error line.
>
> > Err, no. In UTF-16 files it is common to begin the *file* with that
> > character, but UTF-8 doesn't have that character anywhere, it's
> > illegal. Just stripping them out should be fine.
>
> A BOM is perfectly legal in UTF-8, and it's commonly used as a
> signature to indicate the text is UTF-8 instead of another encoding.
> But yes, it is at the beginning of the file only.
>
> http://unicode.org/faq/utf_bom.html#29