| From: | Andrew - Supernews <andrew+nonews(at)supernews(dot)com> |
|---|---|
| To: | pgsql-hackers(at)postgresql(dot)org |
| Subject: | Re: Bug in UTF8-Validation Code? |
| Date: | 2007-04-05 01:35:19 |
| Message-ID: | slrnf18kin.2i67.andrew+nonews@atlantis.supernews.net |
| Lists: | pgsql-hackers |
On 2007-04-05, Tatsuo Ishii <ishii(at)postgresql(dot)org> wrote:
>> Andrew - Supernews <andrew+nonews(at)supernews(dot)com> writes:
>> > Thinking about this made me realize that there's another, ahem, elephant
>> > in the room here: convert().
>> > By definition convert() returns text strings which are not valid in the
>> > server encoding. How can this be addressed?
>>
>> Remove convert(). Or at least redefine it as dealing in bytea not text.
>
> That would break some important use cases.
>
> 1) A user has a UTF-8 database which contains data in various
> languages. Each language has its own table. He wants to sort a SELECT
> result by using ORDER BY. Since a locale cannot handle multiple
> languages, he uses the C locale and does the SELECT something like this:
>
> SELECT * FROM french_table ORDER BY convert(t, 'LATIN1');
> SELECT * FROM japanese_table ORDER BY convert(t, 'EUC_JP');
That works without change if convert(text,text) returns bytea.
>
> 2) A user has a UTF-8 database but unfortunately his OS's UTF-8 locale
> is broken. He decides to use the C locale and wants to sort the result
> from SELECT like this:
>
> SELECT * FROM japanese_table ORDER BY convert(t, 'EUC_JP');
That also works without change if convert(text,text) returns bytea.
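
To illustrate why both cases keep working, here is a rough Python sketch (not PostgreSQL code). It assumes that bytea values compare byte by byte, which is modeled here with Python `bytes` objects. The helper names `order_by_converted` and `is_valid_utf8` and the sample rows are made up for the example. It also shows the original problem: the converted value is generally not valid UTF-8, so it is not a valid text string in a UTF-8 database.

```python
# Illustrative sketch only; hypothetical helpers, not PostgreSQL internals.
# Assumption: ORDER BY on a bytea value is a plain byte-wise comparison,
# which is what sorting Python bytes objects does.

def order_by_converted(rows, encoding):
    """Model 'ORDER BY convert(t, encoding)' with convert() returning bytes."""
    return sorted(rows, key=lambda t: t.encode(encoding))

def is_valid_utf8(data):
    """Return True if the byte string is well-formed UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False

# Sample data invented for the example.
french_rows = ["zèbre", "été", "abc", "Zoo"]
japanese_rows = ["漢", "カ", "あ"]

french_sorted = order_by_converted(french_rows, "latin-1")
japanese_sorted = order_by_converted(japanese_rows, "euc_jp")

print(french_sorted)
print(japanese_sorted)

# The LATIN1 form of a non-ASCII string is not valid UTF-8, which is
# why returning it as text in a UTF-8 database is a problem.
print(is_valid_utf8("été".encode("latin-1")))
```

The sort key is the converted byte string in both cases, so the ordering does not depend on the converted value being treated as text.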
--
Andrew, Supernews
http://www.supernews.com - individual and corporate NNTP services