| From: | Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com> |
|---|---|
| To: | Daniel Verite <daniel(at)manitou-mail(dot)org> |
| Cc: | pgsql-hackers <pgsql-hackers(at)postgresql(dot)org> |
| Subject: | Re: Unicode normalization SQL functions |
| Date: | 2020-01-28 20:21:18 |
| Message-ID: | 43f13518-010a-8319-8013-f319522ea719@2ndquadrant.com |
| Lists: | pgsql-hackers |
On 2020-01-28 10:48, Daniel Verite wrote:
> I found a bug in unicode_is_normalized_quickcheck() which is
> triggered when the last codepoint of the string is beyond
> U+10000. On encountering it, it does:
> +    if (is_supplementary_codepoint(ch))
> +        p++;
> When ch is the last codepoint, it makes p point to
> the ending zero, but the subsequent p++ done by
> the for loop makes it miss the exit and go into over-reading.
>
> But anyway, what's the reason for skipping the codepoint
> following a codepoint outside of the BMP?
You're right, this didn't make any sense. Here is a new patch set with
that fixed.
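
For readers following along, the over-read described above can be sketched roughly as follows. This is not the code from the patch: `walk_steps_past_terminator`, its `len` parameter, and the body of `is_supplementary_codepoint` are assumptions made for illustration, and the bound check exists only so the demo never reads past its own array.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the helper named in the quoted hunk. */
static bool
is_supplementary_codepoint(uint32_t ch)
{
    return ch >= 0x10000 && ch <= 0x10FFFF;
}

/*
 * Walk a zero-terminated codepoint array whose terminator sits at input[len].
 * When skip_after_supplementary is true, the extra increment from the quoted
 * hunk is applied. Returns true if the index ends up beyond the terminator,
 * which is the point where an unguarded loop would start over-reading.
 */
static bool
walk_steps_past_terminator(const uint32_t *input, size_t len,
                           bool skip_after_supplementary)
{
    size_t i;

    assert(input != NULL);

    /* The i <= len guard is demo-only; the loop in the report had no such bound. */
    for (i = 0; i <= len && input[i] != 0; i++)
    {
        if (skip_after_supplementary && is_supplementary_codepoint(input[i]))
            i++;    /* lands on the terminator if this was the last codepoint */
    }

    return i > len;
}
```

With an input such as `{0x61, 0x1F600, 0}` and `len = 2`, the skipping variant ends with the index past the terminator, while the variant without the extra increment stops exactly on it.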
--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
| Attachment | Content-Type | Size |
|---|---|---|
| v3-0001-Add-support-for-other-normal-forms-to-Unicode-nor.patch | text/plain | 370.0 KB |
| v3-0002-Add-SQL-functions-for-Unicode-normalization.patch | text/plain | 1.1 MB |