Joe Conway wrote:
> David Wheeler wrote:
> > My understanding is that the nul character is legal in a byte sequence,
> > but if it's not properly escaped, it'll be parsed as the end of the
> > statement. Unfortunately, I think that it's a very tough problem to solve.
>
> No question wrt '\0' bytes -- they would have to be escaped when casting from
> bytea to text.
>
> The harder issue is that there are apparently many other multiple byte
> sequences that, while valid in an ASCII encoding, are not valid in one or more
> multibyte encodings. See this thread:
>
> http://archives.postgresql.org/pgsql-hackers/2002-04/msg00236.php
>
> This is why currently all "non printable characters" are escaped (which I
> think is all bytes > 127). Text on the other hand is already known to be valid
> for a particular encoding, so it doesn't need escaping.
>
> I'm not sure what happens when the backend encoding and client encoding don't
> match -- I'd guess there is some probability of invalid byte sequences in that
> case too.

I think there has been some discussion of changing the frontend/backend
protocol so that bytes above 127 no longer need to be escaped. I believe
the escaping is currently only required when the frontend and the backend
use different encodings.
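
For illustration only, here is a minimal Python sketch of the kind of
escaping discussed in this thread: the NUL byte, the backslash, and every
byte outside the printable ASCII range are written as backslash escapes, so
the result is plain ASCII and is safe in any client or backend encoding. The
function names escape_bytea and unescape_bytea are hypothetical and are not
part of PostgreSQL or any driver, and the exact set of escaped bytes is an
assumption based on the description above.

```python
def escape_bytea(data: bytes) -> str:
    """Hypothetical helper: render raw bytes as printable ASCII text."""
    out = []
    for b in data:
        if b == 0x5C:
            # A literal backslash is doubled so it cannot start an escape.
            out.append("\\\\")
        elif b < 0x20 or b > 0x7E:
            # NUL, control bytes, DEL, and all bytes > 127 become \ooo (octal).
            out.append("\\%03o" % b)
        else:
            # Printable ASCII passes through unchanged.
            out.append(chr(b))
    return "".join(out)


def unescape_bytea(text: str) -> bytes:
    """Inverse of escape_bytea; assumes well-formed input produced by it."""
    out = bytearray()
    i = 0
    while i < len(text):
        ch = text[i]
        if ch != "\\":
            out.append(ord(ch))
            i += 1
        elif text[i + 1:i + 2] == "\\":
            # Doubled backslash decodes to a single backslash byte.
            out.append(0x5C)
            i += 2
        else:
            # Three octal digits follow the backslash.
            out.append(int(text[i + 1:i + 4], 8))
            i += 4
    return bytes(out)
```

Because the escaped form contains only printable ASCII, it cannot form an
invalid multibyte sequence, which is why this approach sidesteps the
encoding problem at the cost of a larger representation.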
--
Bruce Momjian | http://candle.pha.pa.us
pgman(at)candle(dot)pha(dot)pa(dot)us | (610) 359-1001
+ If your life is a hard drive, | 13 Roberts Road
+ Christ can be your backup. | Newtown Square, Pennsylvania 19073