From: | Jeff Davis <jdavis(at)laika(dot)com> |
---|---|
To: | Albe Laurenz <all(at)adv(dot)magwien(dot)gv(dot)at> |
Cc: | pgsql-general(at)postgresql(dot)org |
Subject: | Re: strange encoding behavior |
Date: | 2006-10-23 16:16:18 |
Message-ID: | 1161620178.18892.6.camel@dogma.v10.wvs |
Lists: | pgsql-general |
On Mon, 2006-10-23 at 10:26 +0200, Albe Laurenz wrote:
> Jeff Davis wrote:
> > I have a UTF8 encoded database. I can do
> >
> > => SELECT '\xb9'::text;
> >
> > But that seems to be the only way to get an invalid utf8 byte sequence
> > into a text type.
> [...]
> > So, if I were to sum this up in a single question, why does cstring
> > not accept invalid utf8 sequences? And if it doesn't, why are they
> > allowed in any text type?
>
> I would say that it should be impossible to get invalid UTF-8 bytes
> into a text on an UTF-8 database, and my opinion is that it is a bug or
> oversight if a typecast allows you to do so.
That wouldn't help me, but it seems like more consistent behavior.
> The program you are talking about that needs to be able to store
> arbitrary bytes in a text column should be changed - maybe it is enough
> to change the data type of the database column from 'text' to 'bytea'.
>
The problem is that all the bytes in the quoted string are converted to
a cstring first, and that conversion rejects invalid UTF8 sequences. So
even if the column is of type bytea, the query text itself can't contain
the bytes I want to store.
The only way bytea would work is to use PQexecParams, setting the
parameter type to bytea and the format to binary. I agree that's the
more robust way to write the application, but unfortunately that's not
how it was written.
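
To illustrate why the raw byte is rejected before the target type even
matters, here is a minimal Python sketch. It is not PostgreSQL's code:
Python's strict UTF-8 decoder stands in for the server's check on the
query text, and the helper names is_valid_utf8 and bytea_octal_escape
are made up for this example. It also shows that a backslash-plus-octal
rendering of the same byte is plain ASCII and so passes the same check.
How many backslashes such an escape needs inside a SQL string literal
depends on the server's string-literal settings, so treat the output as
the escape text only, not as a ready-made literal.

```python
def is_valid_utf8(data: bytes) -> bool:
    """Return True if data decodes as strict UTF-8.

    Assumption: Python's decoder is only a stand-in for the server-side
    validation of the query string; the two may differ in edge cases.
    """
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


def bytea_octal_escape(data: bytes) -> str:
    """Render every byte as a backslash followed by three octal digits.

    Hypothetical helper for illustration; the result is ASCII only.
    """
    return "".join(f"\\{b:03o}" for b in data)


raw = b"\xb9"                      # the byte from the SELECT above
print(is_valid_utf8(raw))          # False: 0xb9 is a lone continuation byte

escaped = bytea_octal_escape(raw)
print(escaped)                     # \271
print(is_valid_utf8(escaped.encode("ascii")))  # True: plain ASCII
```

The raw byte fails validation no matter which column type it is headed
for, which matches the behavior described above; only a representation
that is itself valid in the client encoding, or a binary parameter sent
outside the query text, gets past that step.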
Regards,
Jeff Davis