Guillaume Cottenceau <gc(at)mnc(dot)ch> writes:
> My reasoning was that if the first byte of this two-byte
> sequence is 0x00 then the rule that 0xEF is the first byte of a
> three-byte sequence doesn't apply, since 0xEF is the second byte
> in the sequence.
Looking at the source code, it's clear that the message reports just the
first byte of the offending sequence; the 00 is redundant and probably
shouldn't be in the message.
There seem to be two possibilities: either the data contains a valid
3-byte UTF8 character that cannot be converted to LATIN1, or the alleged
UTF8 data isn't really UTF8 at all.
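
A quick way to see both cases (again an illustrative Python sketch; the
byte values here are made up, not taken from the report):

    # Case 1: perfectly valid UTF8, but the character has no LATIN1 equivalent.
    data = b"\xef\xbc\x81"            # U+FF01 FULLWIDTH EXCLAMATION MARK
    ch = data.decode("utf-8")         # decodes fine: the UTF8 is valid
    try:
        ch.encode("latin-1")
    except UnicodeEncodeError as e:
        print("valid UTF8, unconvertible to LATIN1:", e)

    # Case 2: not UTF8 at all, e.g. the bytes are really LATIN1 "i-umlaut" + "n".
    data = b"\xefn"                   # 0xEF not followed by continuation bytes
    try:
        data.decode("utf-8")
    except UnicodeDecodeError as e:
        print("not really UTF8:", e)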
regards, tom lane