From: | Robert Haas <robertmhaas(at)gmail(dot)com> |
---|---|
To: | Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)oss(dot)ntt(dot)co(dot)jp> |
Cc: | pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: [v9.2] make_greater_string() does not return a string in some cases |
Date: | 2011-10-18 03:45:05 |
Message-ID: | CA+TgmoYETjFMP2hFzWwCxEi2OQKA+NP5CY-DMPnasxNCgX+2rg@mail.gmail.com |
Lists: | pgsql-bugs pgsql-hackers |
On Wed, Oct 12, 2011 at 11:45 PM, Kyotaro HORIGUCHI
<horiguchi(dot)kyotaro(at)oss(dot)ntt(dot)co(dot)jp> wrote:
> Hello, the work is finished.
>
> Version 4 of the patch is attached to this message.

I went through this in a bit more detail tonight and am cleaning it
up. But I'm a bit confused, looking at pg_utf8_increment() in detail:

- Why does the second byte need special handling for 0xED and 0xF4?
AFAICT, UTF-8 requires all legal strings to have a second byte between
0x80 and 0xBF, just as in byte positions 3 and 4, so these bytes would
be invalid in this position anyway.

- In the first byte, we don't increment if the current value for that
byte is 0x7F, 0xDF, 0xEF, or 0xF4. But why is the limit 0xF4 rather
than 0xF7? I see there's a comparable restriction in pg_utf8_islegal(),
but I don't understand why.

- Perhaps on the same point, the comments claim that we will fail for
code points U+0007F, U+007FF, U+0FFFF, and U+10FFFF. But IIUC, a
4-byte Unicode character can encode values up to U+1FFFFF, so why is
it U+10FFFF rather than U+1FFFFF?
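
For reference on the three questions above, the byte ranges that RFC 3629
allows can be sketched in C. This is only an illustration:
utf8_seq_is_legal() is a hypothetical helper written for this note, not
PostgreSQL's pg_utf8_islegal() or pg_utf8_increment(), and it assumes the
caller already knows the sequence length. It shows that the second byte is
narrower than 0x80-0xBF after the lead bytes 0xE0, 0xED, 0xF0, and 0xF4, and
that 0xF4 is the highest legal lead byte because RFC 3629 caps UTF-8 at
U+10FFFF.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical helper (not PostgreSQL code): returns true if the len bytes
 * at s form one well-formed UTF-8 sequence under RFC 3629.
 */
static bool
utf8_seq_is_legal(const unsigned char *s, size_t len)
{
    unsigned char lo = 0x80;    /* default lower bound for byte 2 */
    unsigned char hi = 0xBF;    /* default upper bound for byte 2 */
    size_t      i;

    if (len == 1)
        return s[0] <= 0x7F;
    if (len < 2 || len > 4)
        return false;

    /* The lead byte must match the sequence length. */
    if (len == 2 && (s[0] < 0xC2 || s[0] > 0xDF))
        return false;           /* 0xC0 and 0xC1 would be overlong */
    if (len == 3 && (s[0] < 0xE0 || s[0] > 0xEF))
        return false;
    if (len == 4 && (s[0] < 0xF0 || s[0] > 0xF4))
        return false;           /* 0xF5..0xF7 would exceed U+10FFFF */

    /* Some lead bytes narrow the legal range of the second byte. */
    switch (s[0])
    {
        case 0xE0: lo = 0xA0; break;    /* reject overlong 3-byte forms */
        case 0xED: hi = 0x9F; break;    /* reject surrogates U+D800..U+DFFF */
        case 0xF0: lo = 0x90; break;    /* reject overlong 4-byte forms */
        case 0xF4: hi = 0x8F; break;    /* reject values above U+10FFFF */
        default: break;
    }
    if (s[1] < lo || s[1] > hi)
        return false;

    /* Bytes 3 and 4 are plain continuation bytes. */
    for (i = 2; i < len; i++)
    {
        if (s[i] < 0x80 || s[i] > 0xBF)
            return false;
    }
    return true;
}
```

Under these rules F4 8F BF BF (U+10FFFF) is accepted, while F4 90 80 80 and
any sequence with a lead byte of 0xF5 through 0xF7 is rejected, even though
the 4-byte bit pattern could in principle hold values up to U+1FFFFF.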
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company