From: | "Joel Jacobson" <joel(at)compiler(dot)org> |
---|---|
To: | "Dean Rasheed" <dean(dot)a(dot)rasheed(at)gmail(dot)com> |
Cc: | Dagfinn Ilmari Mannsåker <ilmari(at)ilmari(dot)org>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Optimize numeric multiplication for one and two base-NBASE digit multiplicands. |
Date: | 2024-07-04 07:38:44 |
Message-ID: | 407a0741-659f-4767-a534-7afbb90e343e@app.fastmail.com |
Lists: | pgsql-hackers |
On Wed, Jul 3, 2024, at 13:17, Dean Rasheed wrote:
> Anyway, here are both patches for comparison. I'll stop hacking for a
> while and let you see what you make of these.
>
> Regards,
> Dean
>
> Attachments:
> * v5-optimize-numeric-mul_var-small-var1-arbitrary-var2.patch
> * v5-add-mul_var_int.patch

I've now benchmarked the patches on all my machines; see the attached
bench_mul_var.sql for details.

Summary of benchmark results:

         cpu          | var1ndigits |                           winner
----------------------+-------------+-------------------------------------------------------------
 AMD Ryzen 9 7950X3D  |           1 | v5-optimize-numeric-mul_var-small-var1-arbitrary-var2.patch
 AMD Ryzen 9 7950X3D  |           2 | v5-optimize-numeric-mul_var-small-var1-arbitrary-var2.patch
 AMD Ryzen 9 7950X3D  |           3 | v5-optimize-numeric-mul_var-small-var1-arbitrary-var2.patch
 Apple M3 Max         |           1 | v5-add-mul_var_int.patch
 Apple M3 Max         |           2 | v5-add-mul_var_int.patch
 Apple M3 Max         |           3 | v5-optimize-numeric-mul_var-small-var1-arbitrary-var2.patch
 Intel Core i9-14900K |           1 | v5-add-mul_var_int.patch
 Intel Core i9-14900K |           2 | v5-add-mul_var_int.patch
 Intel Core i9-14900K |           3 | v5-optimize-numeric-mul_var-small-var1-arbitrary-var2.patch
(9 rows)

Performance ratio against HEAD per CPU and var1ndigits:

         cpu          | var1ndigits |                           version                           | performance_ratio
----------------------+-------------+-------------------------------------------------------------+-------------------
 AMD Ryzen 9 7950X3D  |           1 | HEAD                                                        |              1.00
 AMD Ryzen 9 7950X3D  |           1 | v4-optimize-numeric-mul_var-small-var1-arbitrary-var2.patch |              1.11
 AMD Ryzen 9 7950X3D  |           1 | v5-add-mul_var_int.patch                                    |              1.07
 AMD Ryzen 9 7950X3D  |           1 | v5-optimize-numeric-mul_var-small-var1-arbitrary-var2.patch |              1.12
 AMD Ryzen 9 7950X3D  |           2 | HEAD                                                        |              1.00
 AMD Ryzen 9 7950X3D  |           2 | v4-optimize-numeric-mul_var-small-var1-arbitrary-var2.patch |              1.10
 AMD Ryzen 9 7950X3D  |           2 | v5-add-mul_var_int.patch                                    |              1.11
 AMD Ryzen 9 7950X3D  |           2 | v5-optimize-numeric-mul_var-small-var1-arbitrary-var2.patch |              1.13
 AMD Ryzen 9 7950X3D  |           3 | HEAD                                                        |              1.00
 AMD Ryzen 9 7950X3D  |           3 | v4-optimize-numeric-mul_var-small-var1-arbitrary-var2.patch |              1.10
 AMD Ryzen 9 7950X3D  |           3 | v5-add-mul_var_int.patch                                    |              0.98
 AMD Ryzen 9 7950X3D  |           3 | v5-optimize-numeric-mul_var-small-var1-arbitrary-var2.patch |              1.15
 Apple M3 Max         |           1 | HEAD                                                        |              1.00
 Apple M3 Max         |           1 | v4-optimize-numeric-mul_var-small-var1-arbitrary-var2.patch |              1.07
 Apple M3 Max         |           1 | v5-add-mul_var_int.patch                                    |              1.08
 Apple M3 Max         |           1 | v5-optimize-numeric-mul_var-small-var1-arbitrary-var2.patch |              1.07
 Apple M3 Max         |           2 | HEAD                                                        |              1.00
 Apple M3 Max         |           2 | v4-optimize-numeric-mul_var-small-var1-arbitrary-var2.patch |              1.09
 Apple M3 Max         |           2 | v5-add-mul_var_int.patch                                    |              1.21
 Apple M3 Max         |           2 | v5-optimize-numeric-mul_var-small-var1-arbitrary-var2.patch |              1.10
 Apple M3 Max         |           3 | HEAD                                                        |              1.00
 Apple M3 Max         |           3 | v4-optimize-numeric-mul_var-small-var1-arbitrary-var2.patch |              1.09
 Apple M3 Max         |           3 | v5-add-mul_var_int.patch                                    |              0.99
 Apple M3 Max         |           3 | v5-optimize-numeric-mul_var-small-var1-arbitrary-var2.patch |              1.09
 Intel Core i9-14900K |           1 | HEAD                                                        |              1.00
 Intel Core i9-14900K |           1 | v4-optimize-numeric-mul_var-small-var1-arbitrary-var2.patch |              1.05
 Intel Core i9-14900K |           1 | v5-add-mul_var_int.patch                                    |              1.07
 Intel Core i9-14900K |           1 | v5-optimize-numeric-mul_var-small-var1-arbitrary-var2.patch |              1.06
 Intel Core i9-14900K |           2 | HEAD                                                        |              1.00
 Intel Core i9-14900K |           2 | v4-optimize-numeric-mul_var-small-var1-arbitrary-var2.patch |              1.06
 Intel Core i9-14900K |           2 | v5-add-mul_var_int.patch                                    |              1.08
 Intel Core i9-14900K |           2 | v5-optimize-numeric-mul_var-small-var1-arbitrary-var2.patch |              1.06
 Intel Core i9-14900K |           3 | HEAD                                                        |              1.00
 Intel Core i9-14900K |           3 | v4-optimize-numeric-mul_var-small-var1-arbitrary-var2.patch |              1.04
 Intel Core i9-14900K |           3 | v5-add-mul_var_int.patch                                    |              1.00
 Intel Core i9-14900K |           3 | v5-optimize-numeric-mul_var-small-var1-arbitrary-var2.patch |              1.04
(36 rows)

The queries used to produce the tables above are in the attached
bench_csv_queries.txt.
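
For readers who want to cross-check the two tables against each other, here is a minimal Python sketch. It is not the contents of bench_csv_queries.txt. It assumes that the "winner" is simply whichever of the two v5 patches has the higher rounded performance ratio for a given CPU and var1ndigits; the real queries run against bench.csv and may use a different rule. The names `MUL_VAR_INT`, `OPTIMIZE`, `RATIOS` and `pick_winners` are hypothetical, and the numbers are copied from the ratio table.

```python
# Hypothetical sketch: re-derive the summary table from the rounded ratios.
# Assumption: winner = the v5 patch with the highest ratio against HEAD.

MUL_VAR_INT = "v5-add-mul_var_int.patch"
OPTIMIZE = "v5-optimize-numeric-mul_var-small-var1-arbitrary-var2.patch"

# (cpu, var1ndigits) -> {patch: performance ratio against HEAD}
RATIOS = {
    ("AMD Ryzen 9 7950X3D", 1): {MUL_VAR_INT: 1.07, OPTIMIZE: 1.12},
    ("AMD Ryzen 9 7950X3D", 2): {MUL_VAR_INT: 1.11, OPTIMIZE: 1.13},
    ("AMD Ryzen 9 7950X3D", 3): {MUL_VAR_INT: 0.98, OPTIMIZE: 1.15},
    ("Apple M3 Max", 1): {MUL_VAR_INT: 1.08, OPTIMIZE: 1.07},
    ("Apple M3 Max", 2): {MUL_VAR_INT: 1.21, OPTIMIZE: 1.10},
    ("Apple M3 Max", 3): {MUL_VAR_INT: 0.99, OPTIMIZE: 1.09},
    ("Intel Core i9-14900K", 1): {MUL_VAR_INT: 1.07, OPTIMIZE: 1.06},
    ("Intel Core i9-14900K", 2): {MUL_VAR_INT: 1.08, OPTIMIZE: 1.06},
    ("Intel Core i9-14900K", 3): {MUL_VAR_INT: 1.00, OPTIMIZE: 1.04},
}


def pick_winners(ratios):
    """Return the patch with the highest ratio for each (cpu, var1ndigits)."""
    return {key: max(by_patch, key=by_patch.get) for key, by_patch in ratios.items()}


if __name__ == "__main__":
    for (cpu, ndigits), patch in pick_winners(RATIOS).items():
        print(f"{cpu:<20} | {ndigits} | {patch}")
```

Under that assumption the nine winners match the summary table. Note that the ratios are rounded to two decimals, so near-ties (for example 1.08 vs 1.07) should not be over-interpreted.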
/Joel
Attachment | Content-Type | Size |
---|---|---|
bench_csv_queries.txt | text/plain | 1016 bytes |
bench.csv | text/csv | 11.9 KB |
bench_mul_var.sql | application/octet-stream | 9.8 KB |