From: | Amit Khandekar <amitdkhan(dot)pg(at)gmail(dot)com> |
---|---|
To: | John Naylor <john(dot)naylor(at)enterprisedb(dot)com> |
Cc: | Vladimir Sitnikov <sitnikov(dot)vladimir(at)gmail(dot)com>, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: speed up verifying UTF-8 |
Date: | 2021-07-19 05:23:22 |
Message-ID: | CAJ3gD9c=dhu3D2tYkPyZ-vEwt5RqUUgGJWfcQ7jFrVOiQW3SqQ@mail.gmail.com |
Lists: | pgsql-hackers |
On Sat, 17 Jul 2021 at 04:48, John Naylor <john(dot)naylor(at)enterprisedb(dot)com> wrote:
> v17-0001 is the same as v14. 0002 is a stripped-down implementation of Amit's
> chunk idea for multibyte, and it's pretty good on x86. On Power8, not so
> much. 0003 and 0004 are shot-in-the-dark guesses to improve it on Power8,
> with some success, but end up making x86 weirdly slow, so I'm afraid that
> could happen on other platforms as well.
Thanks for trying the chunk approach. I tested your v17 versions on
Arm64. For the Chinese characters, v17-0002 gave some improvement over
v14. But for all the other character sets, there was around 10%
degradation w.r.t. v14. I thought maybe the pg_hton64() call and memcpy()
for each multibyte character might be the culprit, so I tried iterating over
all the characters in the chunk within the same pg_utf8_verify_one()
function by left-shifting the bits. But that made the numbers worse, so
I gave up on that idea.
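To make the "iterate within the chunk by left-shifting" idea concrete, here is a minimal sketch. It is not the code from any of the patches: the names load_be64(), verify_one_in_chunk() and utf8_verify_chunked() are made up for illustration, and the byte-wise load merely stands in for the memcpy() plus byte swap. It also treats NUL as ordinary ASCII, which may differ from how PostgreSQL handles it.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define IS_CONT(b) (((b) & 0xC0) == 0x80)

/* Hypothetical helper: read 8 bytes so that p[0] ends up in the top byte. */
static uint64_t
load_be64(const unsigned char *p)
{
    uint64_t v = 0;

    for (int i = 0; i < 8; i++)
        v = (v << 8) | p[i];
    return v;
}

/*
 * Check the character that starts at the top byte of the chunk.
 * Returns its length (1..4), or 0 if the sequence is invalid.
 * The caller guarantees that the top 4 bytes are meaningful.
 */
static int
verify_one_in_chunk(uint64_t chunk)
{
    unsigned b0 = (unsigned) (chunk >> 56) & 0xFF;
    unsigned b1 = (unsigned) (chunk >> 48) & 0xFF;
    unsigned b2 = (unsigned) (chunk >> 40) & 0xFF;
    unsigned b3 = (unsigned) (chunk >> 32) & 0xFF;

    if (b0 < 0x80)
        return 1;
    if (b0 >= 0xC2 && b0 <= 0xDF)
        return IS_CONT(b1) ? 2 : 0;
    if (b0 >= 0xE0 && b0 <= 0xEF)
    {
        if (!IS_CONT(b1) || !IS_CONT(b2))
            return 0;
        if (b0 == 0xE0 && b1 < 0xA0)    /* overlong encoding */
            return 0;
        if (b0 == 0xED && b1 > 0x9F)    /* UTF-16 surrogate range */
            return 0;
        return 3;
    }
    if (b0 >= 0xF0 && b0 <= 0xF4)
    {
        if (!IS_CONT(b1) || !IS_CONT(b2) || !IS_CONT(b3))
            return 0;
        if (b0 == 0xF0 && b1 < 0x90)    /* overlong encoding */
            return 0;
        if (b0 == 0xF4 && b1 > 0x8F)    /* beyond U+10FFFF */
            return 0;
        return 4;
    }
    return 0;                   /* stray continuation byte, 0xC0/0xC1, 0xF5+ */
}

static bool
utf8_verify_chunked(const unsigned char *s, size_t len)
{
    unsigned char tail[8] = {0};
    size_t      pos = 0;
    size_t      used;
    size_t      rest;
    uint64_t    chunk;

    while (len - pos >= 8)
    {
        chunk = load_be64(s + pos);
        used = 0;
        /* While used <= 4, at least 4 real bytes remain at the top. */
        while (used <= 4)
        {
            int         l = verify_one_in_chunk(chunk);

            if (l == 0)
                return false;
            chunk <<= 8 * l;    /* bring the next character to the top */
            used += l;
        }
        pos += used;            /* next chunk starts at a character boundary */
    }

    /* Tail: zero padding makes a truncated sequence fail the checks. */
    rest = len - pos;
    memcpy(tail, s + pos, rest);
    chunk = load_be64(tail);
    for (used = 0; used < rest;)
    {
        int         l = verify_one_in_chunk(chunk);

        if (l == 0)
            return false;
        chunk <<= 8 * l;
        used += l;
    }
    return true;
}
```

The sketch only shows the control flow: one load per chunk, then shifts instead of a fresh load per character. It says nothing about which variant is faster on a given platform.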
Here are the numbers on Arm64:

HEAD:
 chinese | mixed | ascii | mixed16 | mixed8
---------+-------+-------+---------+--------
    1781 |  1095 |   628 |     944 |   1151

v14:
 chinese | mixed | ascii | mixed16 | mixed8
---------+-------+-------+---------+--------
     852 |   484 |   144 |     584 |    971

v17-0001+2:
 chinese | mixed | ascii | mixed16 | mixed8
---------+-------+-------+---------+--------
     731 |   520 |   152 |     645 |   1118
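
For reference, the relative change of v17-0001+2 against v14 can be worked out from the two tables above. The figures come to roughly 14% faster for chinese, and between about 6% and 15% slower for the other inputs. The small sketch below just does that arithmetic. The function names pct_change() and print_v17_vs_v14() are made up for this illustration.

```c
#include <stdio.h>

/* Percent change of val relative to base; positive means slower. */
static double
pct_change(double base, double val)
{
    return (val - base) / base * 100.0;
}

/* Print the v17-0001+2 vs. v14 change for each input, using the numbers above. */
static void
print_v17_vs_v14(void)
{
    static const char *const names[] = {"chinese", "mixed", "ascii", "mixed16", "mixed8"};
    static const double v14[] = {852, 484, 144, 584, 971};
    static const double v17[] = {731, 520, 152, 645, 1118};

    for (int i = 0; i < 5; i++)
        printf("%-8s %+6.1f%%\n", names[i], pct_change(v14[i], v17[i]));
}
```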
Haven't looked at your v18 patch set yet.