From: | "Pavel Stehule" <pavel(dot)stehule(at)gmail(dot)com> |
---|---|
To: | "Andrew Dunstan" <andrew(at)dunslane(dot)net> |
Cc: | "PostgreSQL-development Hackers" <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: proposal: UTF8 to_ascii function |
Date: | 2008-08-11 13:00:27 |
Message-ID: | 162867790808110600l565f8463yae0b26bdf7a97bd@mail.gmail.com |
Lists: | pgsql-hackers |
2008/8/11 Andrew Dunstan <andrew(at)dunslane(dot)net>:
>
>
> Pavel Stehule wrote:
>>
>> Hello,
>>
>> The combination of the to_ascii and convert_to functions is broken now.
>> The problem is in the convert_to function: it doesn't support 8-bit
>> output encodings.
>>
>> Current workaround:
>>
>> CREATE FUNCTION to_ascii(bytea, name)
>> RETURNS text AS 'to_ascii_encname' LANGUAGE internal;
>>
>> SELECT to_ascii(convert_to('Příliš žlutý kůň', 'latin2'),'latin2');
>>
>> I don't expect per-column collation in 8.4, so we need a workable
>> to_ascii function.
>>
>> I propose a function to_ascii(text, name) that internally converts the
>> text from UTF8 encoding when necessary.
>>
>>
>>
>
> convert_to is not broken. It returns a bytea, and it is up to you to
> de-escape it if you get the text representation.
One note: convert_to is correct. But we have to be able to use to_ascii
without the decode functions. It has the same behavior: it converts from
bytea to text. Text in an "incorrect" encoding is de facto bytea. So the
correct to_ascii function prototypes are:
to_ascii(text);
to_ascii(bytea, integer);
to_ascii(bytea, name);
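
To illustrate the "bytea in, text out" idea behind the (bytea, name) prototype, here is a minimal sketch in Python rather than PostgreSQL code. The Python to_ascii helper is hypothetical and only models the semantics: it strips diacritics via Unicode decomposition, which is an assumption of this sketch and not how the server's to_ascii works internally.

```python
import unicodedata


def to_ascii(data: bytes, encoding: str) -> str:
    """Hypothetical model of to_ascii(bytea, name): raw bytes in, ASCII text out."""
    # Interpret the raw bytes in the encoding the caller names.
    text = data.decode(encoding)
    # Split accented letters into base letter + combining mark, then drop the marks.
    decomposed = unicodedata.normalize("NFKD", text)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # Replace anything still outside ASCII with a space (arbitrary choice for this sketch).
    return "".join(ch if ord(ch) < 128 else " " for ch in stripped)


# Rough analogue of convert_to('...', 'latin2') followed by to_ascii(..., 'latin2').
latin2_bytes = "Příliš žlutý kůň".encode("iso8859_2")
print(to_ascii(latin2_bytes, "iso8859_2"))
```

The point of the sketch is only that the Latin-2 encoded value is handled as bytes plus an encoding name, and becomes text again only once it is plain ASCII.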
Regards
Pavel Stehule
>
> We are surely not going to go back to a situation where we have functions
> returning text in any encoding other than the database encoding. That
> becomes a vehicle for storing wrongly encoded data in the database, and we
> have just gone through the exercise of plugging those holes. I privately
> predicted when we did this work that it might motivate people who had been
> abusing convert_to to get proper support for multiple encodings done. That
> is the right way to go, not re-opening holes we have just very deliberately
> plugged.
>
>
>
> cheers
>
> andrew
>