From: | John Naylor <john(dot)naylor(at)enterprisedb(dot)com> |
---|---|
To: | Michael Paquier <michael(at)paquier(dot)xyz> |
Cc: | Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Daniel Verite <daniel(at)manitou-mail(dot)org> |
Subject: | Re: speed up unicode decomposition and recomposition |
Date: | 2020-10-21 22:45:44 |
Message-ID: | CAFBsxsFwu0w4TNK+UnCfnzsi2nGKkG3LBKXCPCjDJBXg6iZpjw@mail.gmail.com |
Lists: | pgsql-hackers |
Attached v3 addressing review points below:
On Thu, Oct 15, 2020 at 11:32 PM Michael Paquier <michael(at)paquier(dot)xyz>
wrote:
> + # Then the second
> + return -1 if $a2 < $b2;
> + return 1 if $a2 > $b2;
> Should say "second code point" here?
>
Done. I also changed the tiebreaker to the composed code point. Previously it
was the index into DecompMain[], which is equivalent only if the list is in
order (it is, but we don't want correctness to depend on that), and which was
not very clear.
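To illustrate the ordering described above, here is a minimal sketch in C rather than the Perl of the actual generator script. The struct and function names (recomp_entry, recomp_entry_cmp, sort_recomp_entries) are hypothetical and not taken from the patch. It compares by first code point, then second code point, and finally by the composed code point, so the result does not depend on the input order.

```c
#include <stdint.h>
#include <stdlib.h>

/*
 * Hypothetical entry (not from the patch): a two-code-point sequence
 * and the code point it composes to.
 */
typedef struct
{
    uint32_t first;     /* first code point of the decomposition */
    uint32_t second;    /* second code point of the decomposition */
    uint32_t composed;  /* composed code point, used as the tiebreaker */
} recomp_entry;

/* qsort comparator: first code point, then second, then composed. */
static int
recomp_entry_cmp(const void *pa, const void *pb)
{
    const recomp_entry *a = pa;
    const recomp_entry *b = pb;

    if (a->first != b->first)
        return (a->first < b->first) ? -1 : 1;

    /* Then the second code point */
    if (a->second != b->second)
        return (a->second < b->second) ? -1 : 1;

    /* Tiebreaker: the composed code point itself, not an array index */
    if (a->composed != b->composed)
        return (a->composed < b->composed) ? -1 : 1;

    return 0;
}

/* Sort an array of entries in place using the comparator above. */
static void
sort_recomp_entries(recomp_entry *entries, size_t n)
{
    qsort(entries, n, sizeof(recomp_entry), recomp_entry_cmp);
}
```

Because the tiebreaker is a value stored in the entry itself, the sorted order is the same regardless of how the entries were arranged beforehand.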
> + hashkey = pg_hton64(((uint64) start << 32) | (uint64) code);
> + h = recompinfo.hash(&hashkey);
> This choice should be documented, and most likely we should have
> comments on the perl and C sides to keep track of the relationship
> between the two.
>
Done.
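For readers following along, a standalone sketch of the key layout in the quoted line: the starter goes in the high 32 bits, the combining code point in the low 32 bits, and the result is stored most significant byte first. Presumably the point of the fixed byte order is that the hash function sees the same 8 bytes on any platform, matching what the Perl side hashed. pack_recomp_key is a hypothetical helper written without PostgreSQL headers; it stands in for the pg_hton64() call and is not code from the patch.

```c
#include <stdint.h>

/*
 * Hypothetical helper (not from the patch): pack two code points into
 * an 8-byte key in network (big-endian) byte order, independent of the
 * host's endianness.
 */
static void
pack_recomp_key(uint32_t start, uint32_t code, unsigned char out[8])
{
    /* starter in the high half, second code point in the low half */
    uint64_t key = ((uint64_t) start << 32) | (uint64_t) code;

    /* emit the most significant byte first */
    for (int i = 0; i < 8; i++)
        out[i] = (unsigned char) (key >> (56 - 8 * i));
}
```

For U+0045 followed by U+032D (the pair from the comment example further down), this yields the bytes 00 00 00 45 00 00 03 2D.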
> <separate headers>
Done.
Other cosmetic changes:
- format recomp array comments like /* U+0045+032D -> U+1E18 */
- make sure to comment #endif's that are far from the #if
- small whitespace fixes
Note: for the new header I simply adapted from unicode_norm_table.h the
choice of "There is deliberately not an #ifndef PG_UNICODE_NORM_TABLE_H
here", although I must confess I'm not sure what the purpose of that is, in
this case.
--
John Naylor
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Attachment | Content-Type | Size |
---|---|---|
v3-0001-Speed-up-unicode-decomposition.patch | application/octet-stream | 107.2 KB |
v3-0002-Speed-up-unicode-recomposition.patch | application/octet-stream | 54.0 KB |