From: | Paul Ramsey <pramsey(at)refractions(dot)net> |
---|---|
To: | PostgreSQL <pgsql-general(at)postgresql(dot)org> |
Subject: | Re: 8.0, UTF8, and CLIENT_ENCODING |
Date: | 2007-05-17 23:55:51 |
Message-ID: | D84BEF92-179D-4197-A686-FA80DA8B7961@refractions.net |
Lists: | pgsql-general |
Thanks all for the information. Summary is:
- 8.0 wasn't very strict, and allowed the illegal values in, instead
of mapping them over into UTF-8 space
- the values can be stripped with iconv -c
- 8.2 should be more strict
I'm in the midst of my upgrade to 8.2 now; hopefully the LATIN1->UTF8
conversion will map the odd characters cleanly into UTF-8 space.
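
The points above can be sketched in Python. This is only a minimal illustration, not the conversion PostgreSQL itself performs: the function names are hypothetical, and `errors="ignore"` merely approximates what `iconv -c` does when it drops unconvertible bytes.

```python
def is_valid_utf8(data: bytes) -> bool:
    """Return True if the bytes decode cleanly as UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


def strip_invalid_utf8(data: bytes) -> bytes:
    """Drop byte sequences that are not valid UTF-8 (similar in spirit to iconv -c)."""
    return data.decode("utf-8", errors="ignore").encode("utf-8")


def latin1_to_utf8(data: bytes) -> bytes:
    """Reinterpret the bytes as LATIN1 and re-encode them as UTF-8."""
    return data.decode("latin-1").encode("utf-8")


# "cafe" with an accented e, as a LATIN1 client would send it:
# the last character is the single byte 0xE9, which is > 127.
raw = b"caf\xe9"

print(is_valid_utf8(raw))        # the lone high byte is not legal UTF-8
print(strip_invalid_utf8(raw))   # stripping loses the character entirely
print(latin1_to_utf8(raw))       # proper conversion yields a 2-byte sequence
```

Stripping makes the data loadable but throws the character away, whereas a correct LATIN1->UTF8 conversion keeps it as the two-byte sequence 0xC3 0xA9.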
On 17-May-07, at 3:25 PM, Michael Glaesemann wrote:
>
> On May 17, 2007, at 16:47 , PFC wrote:
>
>>> and put that in the form. Instead of being mapped to 2-byte UTF8
>>> high-bit equivalents, they are going into the database directly
>>> as one-byte values > 127. That is, as illegal UTF8 values.
>>
>> Sometimes you also get HTML entities in the mix. Who knows.
>> All my web forms are UTF-8 back to back, it just works. Was I
>> lucky ?
>> Normally postgres rejects illegal UTF8 values, you wouldn't be
>> able to insert them...
>
> 8.0 and earlier weren't quite as strict as they should have been. See
> the note at the end of the migration instructions in the release
> notes for 8.1[1]. That may have been part of the issue here.
>
> Michael Glaesemann
> grzm seespotcode net
>
> [1](http://www.postgresql.org/docs/8.2/interactive/release-8-1.html#AEN80196)