From: | Robert Haas <robertmhaas(at)gmail(dot)com> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | Alvaro Herrera <alvherre(at)commandprompt(dot)com>, Joseph Adams <joeyadams3(dot)14159(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: patch: utf8_to_unicode (trivial) |
Date: | 2010-08-13 18:11:27 |
Message-ID: | AANLkTimw2HhW3z8GL2WJzOAHYWN4KKoxvKgO2Kk-QEUN@mail.gmail.com |
Lists: | pgsql-hackers |
On Fri, Aug 13, 2010 at 1:50 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Alvaro Herrera <alvherre(at)commandprompt(dot)com> writes:
>> Excerpts from Robert Haas's message of Fri Aug 13 12:50:13 -0400 2010:
>>> Oh, hey, look at that. Any thoughts on what to do about the fact that our
>>> two existing copies of utf2ucs() don't match? (one tests against 0xf8
>>> where the other against 0xf0)
>
>> I'm not sure why it's masking 0xf8 instead of 0xf0.
>
> Because it wants to verify that this is in fact a 4-byte UTF8 code.
> Compare the code (and comments) for pg_utf_mblen.
>
> AFAICS the version in mbprint.c is flat out wrong, and the only reason
> nobody's noticed is that it should never get passed a more-than-4-byte
> sequence anyway.
Should we fix it, then, and if so, how far back should we back-patch?
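
For reference, here is a minimal sketch of why the two masks behave differently. The function names are hypothetical and this is not the actual code from either copy of utf2ucs(); it only isolates the lead-byte test under discussion. A 4-byte UTF-8 lead byte has the bit pattern 11110xxx, so checking the top five bits accepts only 0xF0..0xF7, while checking just the top four bits also accepts 0xF8..0xFF.

```c
#include <stdbool.h>

/*
 * Hypothetical helpers for illustration only; they are not taken from
 * the PostgreSQL source tree.
 */

/* Strict test: the top five bits must be 11110, so only 0xF0..0xF7 match. */
static bool
is_4byte_lead_strict(unsigned char c)
{
	return (c & 0xf8) == 0xf0;
}

/* Loose test: only the top four bits are checked, so 0xF0..0xFF all match. */
static bool
is_4byte_lead_loose(unsigned char c)
{
	return (c & 0xf0) == 0xf0;
}
```

With these definitions, a byte such as 0xF8 (which would begin a longer-than-4-byte sequence) passes the loose test but fails the strict one, which is consistent with the mismatch going unnoticed as long as no such byte is ever passed in.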
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company