From: | "Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov> |
---|---|
To: | "Greg Stark" <stark(at)enterprisedb(dot)com>, "Peter Eisentraut" <peter_e(at)gmx(dot)net> |
Cc: | "Alvaro Herrera" <alvherre(at)commandprompt(dot)com>, "Andrew Dunstan" <andrew(at)dunslane(dot)net>, "- -" <crossroads0000(at)googlemail(dot)com>, <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Unicode support |
Date: | 2009-04-14 17:12:11 |
Message-ID: | 49E47D9B.EE98.0025.0@wicourts.gov |
Lists: | pgsql-hackers |
Greg Stark <stark(at)enterprisedb(dot)com> wrote:
> Peter Eisentraut <peter_e(at)gmx(dot)net> wrote:
>> SELECT U&'\00E9', char_length(U&'\00E9');
>> ?column? | char_length
>> ----------+-------------
>> é | 1
>> (1 row)
>>
>> SELECT U&'\0065\0301', char_length(U&'\0065\0301');
>> ?column? | char_length
>> ----------+-------------
>> é | 2
>> (1 row)
>
> What's really at issue is "what is a string?". That is, is it a
> sequence of characters or a sequence of code points?
Doesn't the SQL standard refer to them as "character string literals"?
The function is called character_length or char_length.
I'm curious -- can every multi-code-point character be normalized to a
single-code-point character?
-Kevin
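
For readers following the normalization question, here is a minimal sketch outside the database, using Python's standard `unicodedata` module rather than anything in PostgreSQL. The helper names `code_points` and `nfc_length` are made up for illustration. It counts code points before and after NFC composition for the two strings from Peter's example, plus a third string (`q` followed by U+0301) chosen on the assumption that it is a fair test case, because Unicode defines no precomposed "q with acute".

```python
import unicodedata


def code_points(s):
    """Return the code points of s as a list of 'U+XXXX' labels."""
    return [f"U+{ord(ch):04X}" for ch in s]


def nfc_length(s):
    """Number of code points in s after NFC (canonical composition)."""
    return len(unicodedata.normalize("NFC", s))


samples = {
    "precomposed e-acute": "\u00e9",      # U&'\00E9' in the thread
    "e + combining acute": "e\u0301",     # U&'\0065\0301' in the thread
    "q + combining acute": "q\u0301",     # illustrative: no precomposed form exists
}

for label, s in samples.items():
    composed = unicodedata.normalize("NFC", s)
    print(f"{label}: raw={code_points(s)} len={len(s)} "
          f"nfc={code_points(composed)} len={len(composed)}")
```

Under these assumptions, the decomposed `e` + U+0301 composes to the single code point U+00E9, while `q` + U+0301 stays at two code points after NFC, so counting code points after normalization would still not match a count of user-perceived characters in every case.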