| From: | Martijn van Oosterhout <kleptog(at)svana(dot)org> |
|---|---|
| To: | Tatsuo Ishii <ishii(at)postgresql(dot)org> |
| Cc: | alvherre(at)commandprompt(dot)com, tgl(at)sss(dot)pgh(dot)pa(dot)us, pgsql(at)markdilger(dot)com, all(at)adv(dot)magwien(dot)gv(dot)at, pgsql-hackers(at)postgresql(dot)org |
| Subject: | Re: Bug in UTF8-Validation Code? |
| Date: | 2007-04-05 11:58:48 |
| Message-ID: | 20070405115848.GB17587@svana.org |
| Lists: | pgsql-hackers |
On Thu, Apr 05, 2007 at 09:34:25AM +0900, Tatsuo Ishii wrote:
> I'm not sure what kind of use case for unicode_char() you are thinking
> about. Anyway if you want a "code point" from a character, we could
> easily add such functions to all backend encodings we currently
> support. Probably it would look like:
I think the problem is that most encodings do not have the concept of a
code point anyway, so implementing it for them is fairly useless.
> Example outputs are:
>
> ASCII - 41
> ISO 10646 - U+0041
> ISO 10646 - U+29E3D
> ISO 8859-1 - a5
> JIS X 0208 - 4141
In every case other than Unicode you're doing the same thing as
encode/decode. Since we already have those functions, there's no need
for chr/ascii to duplicate them. In the case of UTF-8, however, mapping
between a character and its code point is something encode/decode does
not do, hence the proposal to simply extend chr/ascii to do that.
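To illustrate the distinction, here is a rough sketch in Python rather than in backend code. The helper names encoded_hex and code_point are made up for this example and are not PostgreSQL functions. For ASCII and ISO 8859-1 the "code" in the example outputs above is just the hex of the encoded byte, while for UTF-8 the encoded bytes and the code point are different values:

```python
# Hypothetical helpers for illustration only; not PostgreSQL's chr/ascii.

def encoded_hex(ch: str, encoding: str) -> str:
    """Hex dump of the character's bytes in the given encoding
    (roughly what an encode-to-hex of the stored bytes would show)."""
    return ch.encode(encoding).hex()


def code_point(ch: str) -> str:
    """Unicode code point of a single character, in U+XXXX notation."""
    return f"U+{ord(ch):04X}"


if __name__ == "__main__":
    yen = "\u00a5"        # the character ISO 8859-1 stores as byte a5
    han = "\U00029E3D"    # the U+29E3D character from the example outputs

    # Single-byte encodings: the hex dump already is the "code".
    print(encoded_hex("A", "ascii"))        # 41
    print(encoded_hex(yen, "iso-8859-1"))   # a5

    # UTF-8: the bytes and the code point are not the same thing.
    print(encoded_hex(han, "utf-8"))        # f0a9b8bd
    print(code_point(han))                  # U+29E3D
```

The last two lines are the case the proposal is about: recovering U+29E3D from the bytes f0 a9 b8 bd requires decoding the UTF-8 sequence, which a plain hex dump does not do.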
Have a nice day,
--
Martijn van Oosterhout <kleptog(at)svana(dot)org> http://svana.org/kleptog/
> From each according to his ability. To each according to his ability to litigate.