Re: UTF8 with BOM support in psql

From:	Itagaki Takahiro <itagaki(dot)takahiro(at)oss(dot)ntt(dot)co(dot)jp>
To:	Peter Eisentraut <peter_e(at)gmx(dot)net>
Cc:	pgsql-hackers(at)postgresql(dot)org
Subject:	Re: UTF8 with BOM support in psql
Date:	2009-11-17 00:31:51
Message-ID:	20091117093151.14F2.52131E4D@oss.ntt.co.jp
Lists:	pgsql-hackers

Peter Eisentraut <peter_e(at)gmx(dot)net> wrote:

> OK, I think the consensus here is:
> - Eat BOM at beginning of file (as you implemented)
> - Only when client encoding is UTF-8 --> please fix that

Are those two conditions ANDed together? If so, this patch will be useless.
Please remember that \encoding or SET client_encoding appears
*after* the BOM at the beginning of the file. I would agree if the condition is
"Eat BOM at beginning of file and <<set client encoding to UTF-8>>",
as in PEP 263, "Defining Python Source Code Encodings":
http://www.python.org/dev/peps/pep-0263/
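
To make the proposed rule concrete, here is a minimal sketch in Python (not psql's actual C code, and not the patch itself). The function name eat_bom and the encoding labels are hypothetical, chosen only for illustration. It treats a leading UTF-8 BOM as an encoding declaration, in the spirit of PEP 263, instead of requiring the client encoding to already be UTF-8:

```python
import codecs

def eat_bom(data, client_encoding):
    """Return (remaining_bytes, encoding_to_use) for the start of a script file.

    Hypothetical helper: a leading UTF-8 BOM is removed and switches the
    encoding to UTF8, because any \\encoding or SET client_encoding command
    in the file can only appear after the BOM.
    """
    if data.startswith(codecs.BOM_UTF8):
        # BOM present: drop its three bytes and assume the file is UTF-8.
        return data[len(codecs.BOM_UTF8):], "UTF8"
    # No BOM: leave the data and the current client encoding untouched.
    return data, client_encoding
```

Under the AND reading, a file that starts with a BOM while the client encoding is still, say, LATIN1 would keep its BOM, which is the case the patch is meant to handle.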

> I'm not sure if replacing a BOM by three spaces is a good way to
> implement "eating", because it might throw off a column indicator
> somewhere, say, but I couldn't reproduce a problem. Note that the
> U+FEFF character is defined as *zero-width* non-breaking space.

I assumed psql discards whitespace automatically, but I see it is
more robust to remove the BOM bytes explicitly. I'll fix it.
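
The difference between the two approaches can be illustrated with a small Python sketch (again, not psql code; blank_bom and strip_bom are hypothetical names). Overwriting the BOM with three spaces shifts the first token to byte offset 3, while removing the bytes keeps it at offset 0, which matches the zero-width nature of U+FEFF:

```python
import codecs

BOM = codecs.BOM_UTF8  # the three bytes EF BB BF

def blank_bom(line):
    """Overwrite a leading BOM with three spaces (the original approach)."""
    if line.startswith(BOM):
        return b"   " + line[len(BOM):]
    return line

def strip_bom(line):
    """Remove a leading BOM entirely (the more robust approach)."""
    if line.startswith(BOM):
        return line[len(BOM):]
    return line

line = BOM + b"SELECT 1;"
col_blank = blank_bom(line).index(b"SELECT")  # offset after blanking
col_strip = strip_bom(line).index(b"SELECT")  # offset after stripping
```

Any code that reports positions within the line would see different offsets depending on which variant is used.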

Regards,
---
ITAGAKI Takahiro
NTT Open Source Software Center

In response to

Re: UTF8 with BOM support in psql at 2009-11-16 20:37:07 from Peter Eisentraut

Responses

Re: UTF8 with BOM support in psql at 2009-11-17 00:37:35 from Tom Lane
Re: UTF8 with BOM support in psql at 2009-11-17 00:51:26 from Andrew Dunstan
Re: UTF8 with BOM support in psql at 2009-11-17 17:03:10 from Peter Eisentraut
