From: | Andreas Karlsson <andreas(at)proxel(dot)se> |
---|---|
To: | Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>, Daniel Verite <daniel(at)manitou-mail(dot)org> |
Cc: | pgsql-hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Unicode normalization SQL functions |
Date: | 2020-02-13 00:23:41 |
Message-ID: | 26150b35-240f-941c-e5a7-24f2d489b316@proxel.se |
Lists: | pgsql-hackers |
On 1/28/20 9:21 PM, Peter Eisentraut wrote:
> You're right, this didn't make any sense. Here is a new patch set with
> that fixed.
Thanks for this patch. This is a feature which has been on my personal
todo list for a while and something which I have wished to have a couple
of times.
I took a quick look at the patch and here is some feedback:
A possible concern is increased binary size from the new tables for the
quickcheck but personally I think they are worth it.
A potential optimization would be to merge utf8_to_unicode() and
pg_utf_mblen() into one function in unicode_normalize_func(), since
utf8_to_unicode() already knows the length of the character. Probably
not worth it though.
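To illustrate the idea, here is a minimal sketch of a combined decoder
that returns the code point and reports the byte length in one pass. The
name utf8_decode_with_len() is hypothetical and not part of the patch,
and the sketch assumes the input is already valid UTF-8.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper (not from the patch): decode one UTF-8 sequence
 * starting at s and store its length in bytes in *len.
 * Assumes the input is valid UTF-8; continuation bytes are not checked.
 */
static uint32_t
utf8_decode_with_len(const unsigned char *s, int *len)
{
    assert(s != NULL && len != NULL);

    if ((s[0] & 0x80) == 0)
    {
        *len = 1;
        return s[0];
    }
    else if ((s[0] & 0xe0) == 0xc0)
    {
        *len = 2;
        return ((uint32_t) (s[0] & 0x1f) << 6) |
            (uint32_t) (s[1] & 0x3f);
    }
    else if ((s[0] & 0xf0) == 0xe0)
    {
        *len = 3;
        return ((uint32_t) (s[0] & 0x0f) << 12) |
            ((uint32_t) (s[1] & 0x3f) << 6) |
            (uint32_t) (s[2] & 0x3f);
    }
    else if ((s[0] & 0xf8) == 0xf0)
    {
        *len = 4;
        return ((uint32_t) (s[0] & 0x07) << 18) |
            ((uint32_t) (s[1] & 0x3f) << 12) |
            ((uint32_t) (s[2] & 0x3f) << 6) |
            (uint32_t) (s[3] & 0x3f);
    }

    /* Invalid lead byte: skip one byte and signal an error value. */
    *len = 1;
    return 0xffffffff;
}
```

A caller would then advance its input pointer by *len instead of calling
a separate length function on the same bytes.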
It feels a bit wasteful to measure output_size in
unicode_is_normalized() since unicode_normalize() already knows the
length of the buffer; it just does not return it.
A potential optimization for the normalized case would be to abort the
quick check on the first MAYBE result and normalize only from that point
on. If I can find the time I might try this out and benchmark it.
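A rough sketch of the first half of that idea: find the position where
the quick check first stops returning a definite yes. Everything here is
an assumption made for illustration. toy_quickcheck() is a stand-in for
the generated quick-check table, its values are toy data, and the
combining-class ordering test that a real quick check also performs is
left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef enum
{
    TOY_QC_YES,
    TOY_QC_NO,
    TOY_QC_MAYBE
} toy_qc;

/* Toy stand-in for the generated quick-check table; illustration only. */
static toy_qc
toy_quickcheck(uint32_t cp)
{
    if (cp == 0x0340)                   /* treated as NO in this toy table */
        return TOY_QC_NO;
    if (cp >= 0x0300 && cp <= 0x0304)   /* treated as MAYBE in this toy table */
        return TOY_QC_MAYBE;
    return TOY_QC_YES;
}

/*
 * Return the index of the first code point that is not a definite YES,
 * or -1 if every code point passes. A caller could keep the prefix
 * before that index and run the full normalization on the rest.
 */
static int
first_uncertain_index(const uint32_t *input, int len)
{
    int         i;

    assert(input != NULL || len == 0);

    for (i = 0; i < len; i++)
    {
        if (toy_quickcheck(input[i]) != TOY_QC_YES)
            return i;
    }
    return -1;
}
```

In practice the restart point would probably have to be moved back to
the preceding starter (a code point with combining class 0), since a
combining mark can compose with the character before it.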
Nitpick: "split/\s*;\s*/, $line" in generate-unicode_normprops_table.pl
should be "split /\s*;\s*/, $line".
What about using else if in the code below for clarity?
+ if (check == UNICODE_NORM_QC_NO)
+ return UNICODE_NORM_QC_NO;
+ if (check == UNICODE_NORM_QC_MAYBE)
+ result = UNICODE_NORM_QC_MAYBE;
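For reference, the else-if form would look roughly like the sketch
below. Only the two if branches come from the patch; the enum
definition, the UNICODE_NORM_QC_YES constant, and the surrounding
combine_checks() function are assumptions added so the fragment compiles
on its own.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in enum; the values and the YES member are assumptions. */
typedef enum
{
    UNICODE_NORM_QC_NO,
    UNICODE_NORM_QC_YES,
    UNICODE_NORM_QC_MAYBE
} toy_norm_qc;

/* Hypothetical wrapper: fold per-character results into one answer. */
static toy_norm_qc
combine_checks(const toy_norm_qc *checks, int n)
{
    toy_norm_qc result = UNICODE_NORM_QC_YES;
    int         i;

    assert(checks != NULL || n == 0);

    for (i = 0; i < n; i++)
    {
        toy_norm_qc check = checks[i];

        /* The two cases are mutually exclusive, so else if reads clearer. */
        if (check == UNICODE_NORM_QC_NO)
            return UNICODE_NORM_QC_NO;
        else if (check == UNICODE_NORM_QC_MAYBE)
            result = UNICODE_NORM_QC_MAYBE;
    }
    return result;
}
```

The behavior is the same as with two separate if statements, because
the first branch returns; the else only makes the exclusivity explicit.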
Remove extra space in the line below.
+ else if (quickcheck == UNICODE_NORM_QC_NO )
Andreas