Re: psql blows up on BOM character sequence

From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: Jim Nasby <jim(at)nasby(dot)net>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Merlin Moncure <mmoncure(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: psql blows up on BOM character sequence
Date: 2014-03-24 18:59:50
Message-ID: 533080A6.3000408@dunslane.net
Lists: pgsql-hackers


On 03/24/2014 02:50 PM, Jim Nasby wrote:
> On 3/22/14, 11:26 AM, Jim Nasby wrote:
>> On 3/21/14, 4:54 PM, Tom Lane wrote:
>>> Merlin Moncure <mmoncure(at)gmail(dot)com> writes:
>>>> There is no way for psql to handle that case though unless you'd strip
>>>> *all* BOMs encountered. Compounding this problem is that there's no
>>>> practical way AFAIK to send multiple files to psql via a single command
>>>> line invocation. If you pass multiple -f arguments all but one is
>>>> ignored.
>>>
>>> Well, that seems like a solvable but rather independent problem.
>>> I guess one issue is how you'd define the meaning of --single ...
>>> one transaction per run, or one per file?
>>
>> Well, if you're catting multiple files into psql -1, you'd get all
>> the files in one transaction, right? So I'd say that's what should
>> happen.
>
> It occurs to me that we're going about this the wrong way...
>
> The error here isn't being generated by psql; it's generated by the
> backend. In the context of a statement (and not, say, a COPY command).
>
> So instead of trying to handle this on the psql side[1], I think we
> need to handle it in the backend; specifically in the parser. Is there
> an easy way to get the parser to ignore the BOM character in the
> context of commands (but not in strings)?
>
> [1]: Obviously, BOM could still screw up a psql command like \d. We'd
> want to address that as well; but I suspect backends are the more
> common scenario.

But what about COPY files? I don't see why there is less of a case for
eating a leading BOM for a COPY file (or COPY stdin for that matter,
given that it can come from \copy) than for an SQL file.

I suspect trying to do this in the parser will be quite messy.
This needs to happen before the input is converted to the server
encoding, I think.
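
To illustrate the ordering point, here is a minimal sketch in Python rather than the actual psql or backend C code. It assumes the input arrives as raw bytes in a UTF-8 client encoding; the names strip_leading_bom and read_client_input are hypothetical and do not correspond to any existing PostgreSQL function. The BOM is removed from the raw bytes first, and only then is the text decoded, so the same step would serve an SQL file and a COPY data file alike.

```python
import codecs


def strip_leading_bom(raw: bytes) -> bytes:
    """Drop a single UTF-8 BOM at the very start of the input, if present.

    A BOM sequence that appears later in the data is left untouched.
    """
    if raw.startswith(codecs.BOM_UTF8):
        return raw[len(codecs.BOM_UTF8):]
    return raw


def read_client_input(raw: bytes, client_encoding: str = "utf-8") -> str:
    """Hypothetical helper: strip the BOM on raw bytes, then decode.

    The strip is applied only when the client encoding is UTF-8 (an
    assumption of this sketch), because the same three bytes could be
    legitimate data in another encoding.
    """
    if codecs.lookup(client_encoding).name == "utf-8":
        raw = strip_leading_bom(raw)
    return raw.decode(client_encoding)


# The same step handles an SQL script and a COPY data file.
sql_text = read_client_input(codecs.BOM_UTF8 + b"SELECT 1;\n")
copy_text = read_client_input(codecs.BOM_UTF8 + b"1\tfoo\n2\tbar\n")
```

Only a BOM at the very start is eaten, so concatenated files with a BOM in the middle would still need separate handling.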

cheers

andrew
