Re: Using psql -f to load a UTF8 file

From: Roger Leigh <rleigh(at)codelibre(dot)net>
To: pgsql-general(at)postgresql(dot)org
Subject: Re: Using psql -f to load a UTF8 file
Date: 2012-09-21 09:40:45
Message-ID: 20120921094044.GA18133@codelibre.net
Lists: pgsql-general

On Fri, Sep 21, 2012 at 09:21:36AM +0800, Craig Ringer wrote:
> On 09/20/2012 11:44 PM, Leif Biberg Kristensen wrote:
> > On Thursday 20 September 2012 at 16:56:16, Alan Millington wrote:
> >>psql". But how am I supposed to remove the byte order mark from a UTF8
> >>file? I thought that the whole point of the byte order mark was to tell
> >>programs what the file encoding is. Other programs, such as Python, rely
> >>on this.
> >
> >http://en.wikipedia.org/wiki/Byte_order_mark
> >
> >While the Byte Order Mark is important for UTF-16, it's totally irrelevant to
> >the UTF-8 encoding.
>
> I strongly disagree. The BOM provides a useful and standard way to
> differentiate UTF-8 encoded text files from the random pile of
> encodings that any given file could be.

Use of the BOM in UTF-8 causes a host of display and interoperability
problems, and is considered by many to be a broken practice. It's
also pointless, since there are no byte ordering issues with UTF-8.
Best not to use it at all. In any case, the BOM byte sequence
(EF BB BF) does not unambiguously identify UTF-8; the same bytes are
equally valid in 8-bit charsets such as ISO-8859-1, so an external
means of specifying the encoding is preferable and more robust.
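
To illustrate both points, here is a minimal Python sketch. It shows that
the BOM bytes also decode cleanly as ISO-8859-1, and one possible way to
strip a leading BOM before handing a file to psql -f. The helper name
strip_utf8_bom and the sample SQL are made up for illustration; they are
not from the original thread.

```python
import codecs


def strip_utf8_bom(data: bytes) -> bytes:
    """Return data without a leading UTF-8 BOM (EF BB BF), if present."""
    if data.startswith(codecs.BOM_UTF8):
        return data[len(codecs.BOM_UTF8):]
    return data


# The same three bytes decode without error in an 8-bit charset,
# so their presence alone does not prove that a file is UTF-8.
bom_as_latin1 = codecs.BOM_UTF8.decode("latin-1")

# Hypothetical file contents: a BOM followed by UTF-8 encoded SQL.
sql = codecs.BOM_UTF8 + "SELECT 'é';\n".encode("utf-8")
cleaned = strip_utf8_bom(sql)
```

The cleaned bytes can then be written back out and loaded with psql -f,
with the encoding stated externally, for example through the
PGCLIENTENCODING environment variable or SET client_encoding.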

Regards,
Roger

--
.''`. Roger Leigh
: :' : Debian GNU/Linux http://people.debian.org/~rleigh/
`. `' schroot and sbuild http://alioth.debian.org/projects/buildd-tools
`- GPG Public Key F33D 281D 470A B443 6756 147C 07B3 C8BC 4083 E800
