| From: | Andreas Karlsson <andreas(at)proxel(dot)se> |
|---|---|
| To: | Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>, Daniel Verite <daniel(at)manitou-mail(dot)org> |
| Cc: | pgsql-hackers <pgsql-hackers(at)postgresql(dot)org> |
| Subject: | Re: Unicode normalization SQL functions |
| Date: | 2020-02-13 00:23:41 |
| Message-ID: | 26150b35-240f-941c-e5a7-24f2d489b316@proxel.se |
| Lists: | pgsql-hackers |
On 1/28/20 9:21 PM, Peter Eisentraut wrote:
> You're right, this didn't make any sense. Here is a new patch set with
> that fixed.
Thanks for this patch. This is a feature which has been on my personal
todo list for a while and something which I have wished to have a couple
of times.
I took a quick look at the patch and here is some feedback:
A possible concern is the increased binary size from the new tables for the
quick check, but personally I think they are worth it.
A potential optimization would be to merge utf8_to_unicode() and
pg_utf_mblen() into one function in unicode_normalize_func(), since
utf8_to_unicode() already knows the length of the character. Probably not
worth it though.
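To illustrate what I mean by merging the two, here is a rough standalone sketch. The name utf8_decode_with_len() is made up for this mail and is not in the patch; like utf8_to_unicode(), it assumes the input has already been validated as UTF-8.

```c
#include <stdint.h>

/*
 * Hypothetical helper (not part of the patch): decode one UTF-8 sequence
 * and report its byte length in *len, so the caller does not need a
 * separate length lookup. Assumes the input is already valid UTF-8.
 */
static uint32_t
utf8_decode_with_len(const unsigned char *c, int *len)
{
	if ((c[0] & 0x80) == 0)
	{
		/* 1-byte sequence (ASCII) */
		*len = 1;
		return (uint32_t) c[0];
	}
	else if ((c[0] & 0xe0) == 0xc0)
	{
		/* 2-byte sequence */
		*len = 2;
		return (uint32_t) (((c[0] & 0x1f) << 6) |
						   (c[1] & 0x3f));
	}
	else if ((c[0] & 0xf0) == 0xe0)
	{
		/* 3-byte sequence */
		*len = 3;
		return (uint32_t) (((c[0] & 0x0f) << 12) |
						   ((c[1] & 0x3f) << 6) |
						   (c[2] & 0x3f));
	}
	else if ((c[0] & 0xf8) == 0xf0)
	{
		/* 4-byte sequence */
		*len = 4;
		return (uint32_t) (((c[0] & 0x07) << 18) |
						   ((c[1] & 0x3f) << 12) |
						   ((c[2] & 0x3f) << 6) |
						   (c[3] & 0x3f));
	}

	/* Not a valid lead byte: skip one byte and signal an error value. */
	*len = 1;
	return 0xffffffff;
}
```

The caller would then advance its input pointer by the returned length instead of asking for it separately.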
It feels a bit wasteful to measure output_size in
unicode_is_normalized(), since unicode_normalize() already knows
the length of the buffer; it just does not return it.
A potential optimization for the normalized case would be to stop the
quick check at the first "maybe" and normalize only from that point on. If
I can find the time I might try this out and benchmark it.
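Roughly, the idea would look like the sketch below. This is untested against the patch and everything in it is a stand-in: the enum values are assumed, toy_quickcheck() is a fake lookup that just flags the combining diacritical marks block, and normalized_prefix_len() is a name I made up. A real version would also have to track combining-class ordering and probably back up to the last starter before the "maybe", so treat this only as an outline.

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-in for the patch's quick-check result type; values are assumed. */
typedef enum
{
	UNICODE_NORM_QC_NO = 0,
	UNICODE_NORM_QC_YES = 1,
	UNICODE_NORM_QC_MAYBE = -1
} UnicodeNormalizationQC;

/*
 * Toy lookup, NOT the real Unicode data: treat U+0300..U+036F as MAYBE
 * and everything else as YES, just so the sketch can be exercised.
 */
static UnicodeNormalizationQC
toy_quickcheck(uint32_t codepoint)
{
	if (codepoint >= 0x0300 && codepoint <= 0x036F)
		return UNICODE_NORM_QC_MAYBE;
	return UNICODE_NORM_QC_YES;
}

/*
 * Hypothetical helper: return how many leading code points answer YES.
 * If the result equals n, the quick check passed for the whole input;
 * otherwise the caller would only need to run the full normalization
 * on the tail starting at (or shortly before) the returned index.
 */
static size_t
normalized_prefix_len(const uint32_t *input, size_t n)
{
	size_t		i;

	for (i = 0; i < n; i++)
	{
		if (toy_quickcheck(input[i]) != UNICODE_NORM_QC_YES)
			break;				/* first MAYBE or NO: stop scanning here */
	}
	return i;
}
```

Whether this is a win depends on how often a "maybe" shows up early in typical input, which is what I would want to benchmark.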
Nitpick: "split/\s*;\s*/, $line" in generate-unicode_normprops_table.pl
should be "split /\s*;\s*/, $line".
What about using else if in the code below for clarity?
+ if (check == UNICODE_NORM_QC_NO)
+ return UNICODE_NORM_QC_NO;
+ if (check == UNICODE_NORM_QC_MAYBE)
+ result = UNICODE_NORM_QC_MAYBE;
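That is, something along these lines. This is a standalone sketch so it compiles on its own: the enum values are my assumption and combine_quickcheck() is a made-up wrapper around the two branches, not the function in the patch.

```c
#include <stddef.h>

/* Stand-in for the patch's quick-check result type; values are assumed. */
typedef enum
{
	UNICODE_NORM_QC_NO = 0,
	UNICODE_NORM_QC_YES = 1,
	UNICODE_NORM_QC_MAYBE = -1
} UnicodeNormalizationQC;

/*
 * Hypothetical wrapper: fold per-character quick-check results into one
 * overall answer. NO wins immediately, MAYBE is remembered, and YES is
 * the default.
 */
static UnicodeNormalizationQC
combine_quickcheck(const UnicodeNormalizationQC *checks, size_t n)
{
	UnicodeNormalizationQC result = UNICODE_NORM_QC_YES;
	size_t		i;

	for (i = 0; i < n; i++)
	{
		UnicodeNormalizationQC check = checks[i];

		if (check == UNICODE_NORM_QC_NO)
			return UNICODE_NORM_QC_NO;
		else if (check == UNICODE_NORM_QC_MAYBE)
			result = UNICODE_NORM_QC_MAYBE;
	}
	return result;
}
```

The behavior is the same as with two separate ifs; the else if just makes it obvious that the two cases are mutually exclusive.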
Remove extra space in the line below.
+ else if (quickcheck == UNICODE_NORM_QC_NO )
Andreas