From: | Andreas Kalsch <andreaskalsch(at)gmx(dot)de> |
---|---|
To: | pgsql-general(at)postgresql(dot)org |
Subject: | Re: character 0xe29986 of encoding "UTF8" has no equivalent in "LATIN2" |
Date: | 2009-08-04 13:02:28 |
Message-ID: | 4A783164.9040804@gmx.de |
Lists: | pgsql-general |
Alban,
Here is what I do to simplify the data chain:
HTTP encoding > PHP string encoding > client connection > server: everything
is UTF8, plus a check for invalid bytes in PHP (or on the server).
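The invalid-byte check mentioned above could look roughly like the following in Python. This is only a sketch: the function name `is_valid_utf8` is hypothetical, and it assumes the raw request bytes are available before any decoding has happened.

```python
def is_valid_utf8(raw: bytes) -> bool:
    """Return True if raw is a well-formed UTF-8 byte sequence."""
    try:
        # Strict decoding raises UnicodeDecodeError on any invalid byte.
        raw.decode("utf-8", errors="strict")
    except UnicodeDecodeError:
        return False
    return True
```

Input that fails the check can be rejected before it ever reaches the database connection.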
What I tested inside Postgres was passing a 3-byte UTF8 character
to this function, and I got an error. This is a character I will
not filter out if some Unicode artist enters it. It is an
international website, and the simplification is just for indexing.
But I think that this will not solve the problem, and I will have to use
Python or Perl to get it done.
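One way the Python route might look is sketched below. It does not go through LATIN2 at all: it assumes that "simplification" means stripping accents via Unicode decomposition, and that characters without a simpler form should pass through unchanged instead of raising an error. The function name `simplify_for_index` is hypothetical.

```python
import unicodedata


def simplify_for_index(text: str) -> str:
    """Return an accent-stripped copy of text for use in a search index."""
    # NFKD splits accented letters into a base letter plus combining marks.
    decomposed = unicodedata.normalize("NFKD", text)
    # Drop the combining marks; every other character is kept as it is,
    # so characters with no simpler equivalent never cause an error.
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))
```

With this approach the 3-byte character 0xe29986 (U+2646) is simply left alone, while a word such as "Čeština" is reduced to "Cestina". Letters that have no Unicode decomposition are also left unchanged, so a real index would probably need an extra mapping table for those.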
Alban Hertroys wrote:
> On 4 Aug 2009, at 24:57, Andreas Kalsch wrote:
>
>>> I think the real problem is: Where do you lose the original encoding
>>> the users input their data with? If you specify that encoding on the
>>> connection and send it to a database that can handle UTF-8 then you
>>> shouldn't be getting any conversion problems in the first place.
>> Nowhere - I will validate input data on the client side (PHP or
>> Python) and send it to the server. Of course the only encoding I will
>> use on any side is UTF8. I just wanted to use this Latin thing for
>> simplification of characters.
>
> Yes you are. How could your users input invalid characters in the
> first place if that were not the case? You're not suggesting they
> managed to enter characters in an encoding for which they weren't
> valid on their own systems, are you?[1]
>
> You say your client is using PHP or Python, which suggests it's a
> website. That means the input goes like this: web browser -> website
> -> database. All three of those steps use some encoding and you can
> take them into account. That should prevent this problem altogether.
>
> You have control over which encoding your client and the database use,
> and the web browser tells what encoding it used in the POST request so
> you can pass that along to the database when storing data or convert
> it in your client.
>
> [1] There exists of course a small group of people who enjoy posting
> raw byte data to a web-form, but would it matter whether they'd get an
> error about their encoding or not? They do not intend to enter valid
> data after all ;)
>
> Alban Hertroys
>
> --
> If you can't see the forest for the trees,
> cut the trees and you'll see there is no forest.