Re: UTF8 with BOM support in psql

From:	Peter Eisentraut <peter_e(at)gmx(dot)net>
To:	Andrew Dunstan <andrew(at)dunslane(dot)net>
Cc:	Itagaki Takahiro <itagaki(dot)takahiro(at)oss(dot)ntt(dot)co(dot)jp>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org
Subject:	Re: UTF8 with BOM support in psql
Date:	2009-11-18 09:18:22
Message-ID:	1258535902.3497.6.camel@fsopti579.F-Secure.com
Lists:	pgsql-hackers

On tis, 2009-11-17 at 23:22 -0500, Andrew Dunstan wrote:
> Itagaki Takahiro wrote:
> > I don't want user to check the encoding of scripts before executing --
> > it is far from fail-safe.
>
> That's what we require in all other cases. Why should UTF8 be special?

But now we're back to the original problem. Certain editors insert BOMs
at the beginning of the file. And that is by any definition before the
embedded client encoding declaration. I think the only ways to solve
this are:

1) Ignore the BOM if a client encoding declaration of UTF8 appears in a
narrowly defined location near the beginning of the file (XML and
PEP-0263 style). For *example*, we could ignore the BOM if the file
starts with exactly "<BOM>\encoding UTF8\n". Would probably not work
well in practice.

2) Parse two alternative versions of the file, one with the BOM ignored
and one with the BOM not ignored, until you need to make a decision.
Hilariously complicated, but would perhaps solve the problem.

3) Give up, do nothing.
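
To make option 1 concrete, here is a minimal sketch in Python, not psql code. It assumes the script is available as raw bytes and that the only accepted declaration is the exact line "\encoding UTF8\n" immediately after the BOM; the names strip_declared_bom and DECLARATION are hypothetical.

```python
import codecs

# The three-byte UTF-8 byte order mark: b"\xef\xbb\xbf".
UTF8_BOM = codecs.BOM_UTF8

# Assumption for illustration: the declaration must match this line exactly.
DECLARATION = b"\\encoding UTF8\n"


def strip_declared_bom(script: bytes) -> bytes:
    """Drop a leading UTF-8 BOM only if the declaration follows it directly.

    Any other input, including a BOM followed by something else, is
    returned unchanged, so no guess is made about the file's encoding.
    """
    if script.startswith(UTF8_BOM + DECLARATION):
        return script[len(UTF8_BOM):]
    return script
```

The exact-match test is what keeps the rule narrow: a file with a BOM but no declaration, or with the declaration anywhere else, is left untouched, which also shows why this would probably be too strict in practice.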
