From: | Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com> |
---|---|
To: | Joel Jacobson <joel(at)compiler(dot)org> |
Cc: | Dagfinn Ilmari Mannsåker <ilmari(at)ilmari(dot)org>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Optimize numeric multiplication for one and two base-NBASE digit multiplicands. |
Date: | 2024-07-05 15:41:33 |
Message-ID: | CAEZATCWY_h7jTzsQnZY3bChNF85W4KLYV-rRgy=cc4QAUXEUdg@mail.gmail.com |
Lists: | pgsql-hackers |
On Fri, 5 Jul 2024 at 12:56, Joel Jacobson <joel(at)compiler(dot)org> wrote:
>
> Interesting you got so bad bench results for v6-mul_var_int64.patch
> for var1ndigits=4, that patch is actually the winner on AMD Ryzen 9 7950X3D.
Interesting.
> On Intel Core i9-14900K the winner is v6-optimize-numeric-mul_var-small-var1-arbitrary-var2.patch.
That must be random noise, since
v6-optimize-numeric-mul_var-small-var1-arbitrary-var2.patch doesn't
invoke mul_var_small() for 4-digit inputs.
> On Apple M3 Max, HEAD is the winner.
Importantly, mul_var_int64() is around 1.25x slower there, and it was
even worse on my machine.
Attached is a v7 mul_var_small() patch adding 4-digit support. For me,
this gives a nice speedup (roughly 15% less time than HEAD, whereas
v6-mul_var_int64.patch takes about 46% longer than HEAD):
SELECT SUM(var1*var2) FROM bench_mul_var_var1ndigits_4;
Time: 5617.150 ms (00:05.617) -- HEAD
Time: 8203.081 ms (00:08.203) -- v6-mul_var_int64.patch
Time: 4750.212 ms (00:04.750) -- v7-mul_var_small.patch
The other advantage, of course, is that it doesn't require 128-bit
integer support.
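To illustrate why a small-var1 path needs no 128-bit arithmetic, here is a minimal sketch. It is not the code from the attached patch; the function name mul_small_sketch and its signature are hypothetical. It assumes base-10000 digits stored most significant first in int16_t arrays, as in numeric.c. With at most 4 digits in var1, each result digit is a sum of at most 4 products below 10^8 plus a small carry, so a 32-bit unsigned accumulator is sufficient.

```c
#include <assert.h>
#include <stdint.h>

#define NBASE 10000

/*
 * Hypothetical sketch: multiply var1 (n1 digits, 1 <= n1 <= 4) by var2
 * (n2 digits, n2 >= 1).  Digits are base-NBASE, most significant first.
 * result must have room for n1 + n2 digits.
 */
static void
mul_small_sketch(const int16_t *var1, int n1,
                 const int16_t *var2, int n2,
                 int16_t *result)
{
    uint32_t    carry = 0;
    int         res_ndigits = n1 + n2;

    assert(n1 >= 1 && n1 <= 4 && n2 >= 1);

    /* Work from the least significant result digit upwards. */
    for (int k = res_ndigits - 1; k >= 1; k--)
    {
        /*
         * Result digit k collects every var1[i] * var2[j] with
         * i + j + 1 == k.  At most 4 products of less than 10^8 each,
         * plus a carry below 40004, so this cannot overflow 32 bits.
         */
        uint32_t    term = carry;

        for (int i = 0; i < n1; i++)
        {
            int         j = k - 1 - i;

            if (j >= 0 && j < n2)
                term += (uint32_t) var1[i] * (uint32_t) var2[j];
        }
        result[k] = (int16_t) (term % NBASE);
        carry = term / NBASE;
    }

    /* The final carry is the most significant digit (possibly zero). */
    result[0] = (int16_t) carry;
}
```

For example, multiplying the digits {1, 2} by {3, 4, 5} yields {0, 3, 10, 13, 10}, that is, 10002 * 300040005 = 3001000130010. A real implementation would presumably unroll the inner loop for each supported var1 length and also handle weight, sign and rounding, all of which this sketch omits.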
Regards,
Dean
Attachment | Content-Type | Size |
---|---|---|
v7-optimize-numeric-mul_var-small-var1-arbitrary-var2.patch | text/x-patch | 8.2 KB |