From: | Andrew Dunstan <andrew(at)dunslane(dot)net> |
---|---|
To: | Tatsuo Ishii <ishii(at)postgresql(dot)org> |
Cc: | tgl(at)sss(dot)pgh(dot)pa(dot)us, laurenz(dot)albe(at)wien(dot)gv(dot)at, pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: invalidly encoded strings |
Date: | 2007-09-11 02:54:20 |
Message-ID: | 46E6035C.8010708@dunslane.net |
Lists: | pgsql-hackers pgsql-patches |
Tatsuo Ishii wrote:
>> BTW, it strikes me that there is another hole that we need to plug in
>> this area, and that's the convert() function. Being able to create
>> a value of type text that is not in the database encoding is simply
>> broken. Perhaps we could make it work on bytea instead (providing
>> a cast from text to bytea but not vice versa), or maybe we should just
>> forbid the whole thing if the database encoding isn't SQL_ASCII.
>>
>
> Please don't do that. It will break a useful use case of convert().
>
> A user has a database encoded in UTF-8. He has English, French,
> Chinese and Japanese data in tables. To sort the tables in the
> language order, he would do something like this:
>
> SELECT * FROM japanese_table ORDER BY convert(japanese_text using utf8_to_euc_jp);
>
> Without using convert(), he will get random order of data. This is
> because Kanji characters are in random order in UTF-8, while Kanji
> characters are reasonably ordered in EUC_JP.
>
>
Tatsuo-san,

Would this case not be at least as well met by an operator supplying the
required ordering? The operator, of course, would not carry the danger of
supplying values that are invalid in the database encoding. Admittedly,
the user might need several operators for the case you describe.
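
To illustrate the ordering difference under discussion, here is a minimal sketch in Python rather than PostgreSQL. It is not how such an operator would actually be implemented in the backend. The two sample kanji and the helper name `sort_by_encoding` are made up for this example. The point is only that the converted bytes serve as a comparison key and are never stored or returned as a text value.

```python
# Hypothetical sample data: two kanji whose relative order differs between
# Unicode code points and JIS X 0208, the character set behind EUC_JP.
words = ["一", "亜"]


def sort_by_encoding(items, encoding):
    """Sort strings by the raw bytes of their representation in `encoding`.

    The encoded bytes are used only as a sort key; the returned list still
    contains the original, validly encoded strings.
    """
    return sorted(items, key=lambda s: s.encode(encoding))


# Byte order in UTF-8 follows Unicode code point order.
utf8_order = sort_by_encoding(words, "utf-8")

# Byte order in EUC_JP follows the JIS X 0208 layout.
eucjp_order = sort_by_encoding(words, "euc_jp")

print(utf8_order)
print(eucjp_order)
```

The two lists come out in opposite orders, which is the effect the quoted ORDER BY convert(...) query relies on. A comparison operator could get the same ordering without ever exposing a text value in a foreign encoding.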

I'm not sure we are going to be able to catch every path by which
invalid data can get into the database in one release. I suspect we
might need two or three goes at this. (I'm just wondering whether the
routines that return cstrings are a possible vector.)

cheers

andrew
From | Date | Subject | |
---|---|---|---|
Next Message | Andrew Dunstan | 2007-09-11 02:55:57 | Re: invalidly encoded strings |
Previous Message | Tatsuo Ishii | 2007-09-11 02:53:06 | Re: invalidly encoded strings |