From: | Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com> |
---|---|
To: | Daniel Verite <daniel(at)manitou-mail(dot)org> |
Cc: | pgsql-hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Unicode normalization SQL functions |
Date: | 2020-01-28 20:21:18 |
Message-ID: | 43f13518-010a-8319-8013-f319522ea719@2ndquadrant.com |
Lists: | pgsql-hackers |
On 2020-01-28 10:48, Daniel Verite wrote:
> I found a bug in unicode_is_normalized_quickcheck() which is
> triggered when the last codepoint of the string is beyond
> U+10000. On encountering it, it does:
> + if (is_supplementary_codepoint(ch))
> + p++;
> When ch is the last codepoint, it makes p point to
> the ending zero, but the subsequent p++ done by
> the for loop makes it miss the exit and go into over-reading.
>
> But anyway, what's the reason for skipping the codepoint
> following a codepoint outside of the BMP?

You're right, this didn't make any sense. Here is a new patch set with
that fixed.
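
To make the over-read in the quoted report concrete, here is a minimal standalone sketch. It is not the patch code: the function walk_stops_at_terminator(), its parameters, and this definition of is_supplementary_codepoint() are assumptions made for illustration. It walks a zero-terminated array of code points by index (with a bound check, so the demonstration itself never reads past the buffer) and reports whether the loop ended on the terminator or stepped over it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the helper named in the quoted hunk. */
static bool
is_supplementary_codepoint(uint32_t ch)
{
	return ch >= 0x10000 && ch <= 0x10FFFF;
}

/*
 * Walk input[0..len-1], where input[len] is the terminating zero.
 *
 * With extra_skip = true the loop body mimics the quoted hunk: after a
 * supplementary code point the index is advanced once in the body and
 * once more by the for loop.  The "i <= len" bound is only there so that
 * this sketch never dereferences past the terminator; a pointer-based
 * loop that tests only "*p" has no such guard.
 *
 * Returns true if the walk ended exactly on the terminator, false if the
 * index moved past it.  *examined receives the number of code points the
 * loop body actually looked at.
 */
static bool
walk_stops_at_terminator(const uint32_t *input, size_t len,
						 bool extra_skip, size_t *examined)
{
	size_t		i;

	assert(input[len] == 0);
	*examined = 0;

	for (i = 0; i <= len && input[i] != 0; i++)
	{
		(*examined)++;
		if (extra_skip && is_supplementary_codepoint(input[i]))
			i++;				/* the extra step from the quoted hunk */
	}

	return i == len;
}
```

For an input such as { 0x61, 0x1F600, 0 } with len = 2, the extra step moves the index from the last code point onto the terminator, and the loop increment then moves it past the terminator, so the function returns false. When the supplementary code point is not last, as in { 0x1F600, 0x61, 0 }, the walk does end on the terminator, but the code point that follows it is never examined.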
--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Attachment | Content-Type | Size |
---|---|---|
v3-0001-Add-support-for-other-normal-forms-to-Unicode-nor.patch | text/plain | 370.0 KB |
v3-0002-Add-SQL-functions-for-Unicode-normalization.patch | text/plain | 1.1 MB |