From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
---|---|
To: | Heikki Linnakangas <hlinnaka(at)iki(dot)fi> |
Cc: | James Cloos <cloos(at)jhcloos(dot)com>, pgsql-bugs(at)lists(dot)postgresql(dot)org |
Subject: | Re: pg should ignore u+200b zero width space |
Date: | 2020-11-03 14:52:47 |
Message-ID: | 919181.1604415167@sss.pgh.pa.us |
Lists: | pgsql-bugs |
Heikki Linnakangas <hlinnaka(at)iki(dot)fi> writes:
> On 03/11/2020 15:41, James Cloos wrote:
>> pg should treat a no break space after whitespace as just more
>> whitespace.
> Hmm. I'm not sure if changing the behavior is a good idea, but a hint in
> the error message would be nice. Something like:
The difficulty with doing anything in this space --- whether it be
ignoring, throwing an error, or whatever --- is that it makes the
lexer's behavior encoding-sensitive and potentially locale-sensitive.
That's problematic for all sorts of reasons. One of the worst is
that frontend programs such as psql and ecpg also have SQL lexers,
and there'd be no way to keep their behavior in precise sync with
the backend's. (They might not even be running in the same encoding,
never mind locale.) It might even be possible to build security
holes around that; recall the fun we've had with trying to lock
down quoting rules in encodings where backslash can be part of a
multibyte character :-(.
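
To make the encoding dependence concrete, here is a minimal Python sketch. It is only an illustration: the variable names are made up, and it does not reproduce how the backend, psql, or ecpg lexers actually work. It shows that the "same" space character is a different byte sequence in different encodings, that a byte-oriented whitespace test does not recognize any of them, and that in Shift_JIS a multibyte character can end in the byte that ASCII uses for backslash.

```python
# Illustration only: all names here are hypothetical, not PostgreSQL code.

ZWSP = "\u200b"  # ZERO WIDTH SPACE
NBSP = "\u00a0"  # NO-BREAK SPACE

# The same code point is a different byte sequence in each encoding.
zwsp_utf8 = ZWSP.encode("utf-8")      # three bytes: E2 80 8B
nbsp_utf8 = NBSP.encode("utf-8")      # two bytes: C2 A0
nbsp_latin1 = NBSP.encode("latin-1")  # one byte: A0

# A scanner that reads those UTF-8 bytes as Latin-1 sees three
# unrelated characters rather than one zero-width space.
as_latin1 = zwsp_utf8.decode("latin-1")

# A byte-level whitespace test (ASCII only) recognizes none of them.
bytewise_space = [b.isspace() for b in (zwsp_utf8, nbsp_utf8, nbsp_latin1)]

# Shift_JIS: KATAKANA LETTER SO (U+30BD) is the byte pair 83 5C, and
# 5C is also the ASCII backslash. A lexer that is unaware of the
# encoding would see a backslash inside this character.
so_sjis = "\u30bd".encode("shift_jis")
has_backslash_byte = b"\\" in so_sjis

print(zwsp_utf8, nbsp_utf8, nbsp_latin1)
print(repr(as_latin1), bytewise_space)
print(so_sjis, has_backslash_byte)
```

Any rule of the form "treat this character as whitespace" therefore has to know which encoding the input is in before it can even find the character, and two lexers working in different encodings could disagree about where a token ends.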
Perhaps it'd be all right to confine the change in behavior to
just modifying the error text in cases where we were going to
throw an error anyway. But I think this is much harder than
it sounds to do in a valid, safe way.
regards, tom lane