Re: Optimize numeric multiplication for one and two base-NBASE digit multiplicands.

From: "Joel Jacobson" <joel(at)compiler(dot)org>
To: "Dean Rasheed" <dean(dot)a(dot)rasheed(at)gmail(dot)com>
Cc: Dagfinn Ilmari Mannsåker <ilmari(at)ilmari(dot)org>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Optimize numeric multiplication for one and two base-NBASE digit multiplicands.
Date: 2024-07-04 07:38:44
Message-ID: 407a0741-659f-4767-a534-7afbb90e343e@app.fastmail.com
Lists: pgsql-hackers

On Wed, Jul 3, 2024, at 13:17, Dean Rasheed wrote:
> Anyway, here are both patches for comparison. I'll stop hacking for a
> while and let you see what you make of these.
>
> Regards,
> Dean
>
> Attachments:
> * v5-optimize-numeric-mul_var-small-var1-arbitrary-var2.patch
> * v5-add-mul_var_int.patch

I've now benchmarked the patches on all my machines;
see the attached bench_mul_var.sql for details.

Summary of benchmark results:

         cpu          | var1ndigits |                           winner
----------------------+-------------+-------------------------------------------------------------
 AMD Ryzen 9 7950X3D  |           1 | v5-optimize-numeric-mul_var-small-var1-arbitrary-var2.patch
 AMD Ryzen 9 7950X3D  |           2 | v5-optimize-numeric-mul_var-small-var1-arbitrary-var2.patch
 AMD Ryzen 9 7950X3D  |           3 | v5-optimize-numeric-mul_var-small-var1-arbitrary-var2.patch
 Apple M3 Max         |           1 | v5-add-mul_var_int.patch
 Apple M3 Max         |           2 | v5-add-mul_var_int.patch
 Apple M3 Max         |           3 | v5-optimize-numeric-mul_var-small-var1-arbitrary-var2.patch
 Intel Core i9-14900K |           1 | v5-add-mul_var_int.patch
 Intel Core i9-14900K |           2 | v5-add-mul_var_int.patch
 Intel Core i9-14900K |           3 | v5-optimize-numeric-mul_var-small-var1-arbitrary-var2.patch
(9 rows)

Performance ratio against HEAD per CPU and var1ndigits:

         cpu          | var1ndigits |                           version                           | performance_ratio
----------------------+-------------+-------------------------------------------------------------+-------------------
 AMD Ryzen 9 7950X3D  |           1 | HEAD                                                        |              1.00
 AMD Ryzen 9 7950X3D  |           1 | v4-optimize-numeric-mul_var-small-var1-arbitrary-var2.patch |              1.11
 AMD Ryzen 9 7950X3D  |           1 | v5-add-mul_var_int.patch                                    |              1.07
 AMD Ryzen 9 7950X3D  |           1 | v5-optimize-numeric-mul_var-small-var1-arbitrary-var2.patch |              1.12
 AMD Ryzen 9 7950X3D  |           2 | HEAD                                                        |              1.00
 AMD Ryzen 9 7950X3D  |           2 | v4-optimize-numeric-mul_var-small-var1-arbitrary-var2.patch |              1.10
 AMD Ryzen 9 7950X3D  |           2 | v5-add-mul_var_int.patch                                    |              1.11
 AMD Ryzen 9 7950X3D  |           2 | v5-optimize-numeric-mul_var-small-var1-arbitrary-var2.patch |              1.13
 AMD Ryzen 9 7950X3D  |           3 | HEAD                                                        |              1.00
 AMD Ryzen 9 7950X3D  |           3 | v4-optimize-numeric-mul_var-small-var1-arbitrary-var2.patch |              1.10
 AMD Ryzen 9 7950X3D  |           3 | v5-add-mul_var_int.patch                                    |              0.98
 AMD Ryzen 9 7950X3D  |           3 | v5-optimize-numeric-mul_var-small-var1-arbitrary-var2.patch |              1.15
 Apple M3 Max         |           1 | HEAD                                                        |              1.00
 Apple M3 Max         |           1 | v4-optimize-numeric-mul_var-small-var1-arbitrary-var2.patch |              1.07
 Apple M3 Max         |           1 | v5-add-mul_var_int.patch                                    |              1.08
 Apple M3 Max         |           1 | v5-optimize-numeric-mul_var-small-var1-arbitrary-var2.patch |              1.07
 Apple M3 Max         |           2 | HEAD                                                        |              1.00
 Apple M3 Max         |           2 | v4-optimize-numeric-mul_var-small-var1-arbitrary-var2.patch |              1.09
 Apple M3 Max         |           2 | v5-add-mul_var_int.patch                                    |              1.21
 Apple M3 Max         |           2 | v5-optimize-numeric-mul_var-small-var1-arbitrary-var2.patch |              1.10
 Apple M3 Max         |           3 | HEAD                                                        |              1.00
 Apple M3 Max         |           3 | v4-optimize-numeric-mul_var-small-var1-arbitrary-var2.patch |              1.09
 Apple M3 Max         |           3 | v5-add-mul_var_int.patch                                    |              0.99
 Apple M3 Max         |           3 | v5-optimize-numeric-mul_var-small-var1-arbitrary-var2.patch |              1.09
 Intel Core i9-14900K |           1 | HEAD                                                        |              1.00
 Intel Core i9-14900K |           1 | v4-optimize-numeric-mul_var-small-var1-arbitrary-var2.patch |              1.05
 Intel Core i9-14900K |           1 | v5-add-mul_var_int.patch                                    |              1.07
 Intel Core i9-14900K |           1 | v5-optimize-numeric-mul_var-small-var1-arbitrary-var2.patch |              1.06
 Intel Core i9-14900K |           2 | HEAD                                                        |              1.00
 Intel Core i9-14900K |           2 | v4-optimize-numeric-mul_var-small-var1-arbitrary-var2.patch |              1.06
 Intel Core i9-14900K |           2 | v5-add-mul_var_int.patch                                    |              1.08
 Intel Core i9-14900K |           2 | v5-optimize-numeric-mul_var-small-var1-arbitrary-var2.patch |              1.06
 Intel Core i9-14900K |           3 | HEAD                                                        |              1.00
 Intel Core i9-14900K |           3 | v4-optimize-numeric-mul_var-small-var1-arbitrary-var2.patch |              1.04
 Intel Core i9-14900K |           3 | v5-add-mul_var_int.patch                                    |              1.00
 Intel Core i9-14900K |           3 | v5-optimize-numeric-mul_var-small-var1-arbitrary-var2.patch |              1.04
(36 rows)

The queries used to produce the above are in the attached bench_csv_queries.txt.
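
For readers without the attachments, here is a minimal sketch of how tables like the two above could be derived from raw timings. It is not the contents of bench_csv_queries.txt: the row layout, the sample timings, the names patch-x and patch-y, and the definition of the ratio as HEAD time divided by patched time (so that values above 1.00 mean faster than HEAD) are all assumptions made for illustration. The "higher is better" reading is at least consistent with the tables, where each winner has the highest ratio in its group.

```python
# Hypothetical (cpu, var1ndigits, version, seconds) rows; NOT data from bench.csv.
TIMINGS = [
    ("cpu-a", 1, "HEAD", 3.0),
    ("cpu-a", 1, "patch-x", 2.0),
    ("cpu-a", 1, "patch-y", 2.5),
    ("cpu-a", 2, "HEAD", 3.0),
    ("cpu-a", 2, "patch-x", 3.0),
    ("cpu-a", 2, "patch-y", 2.4),
]


def performance_ratios(rows):
    """Map (cpu, var1ndigits, version) to HEAD time / version time.

    Assumed definition: a ratio above 1.00 means faster than HEAD.
    """
    head = {(cpu, nd): secs for cpu, nd, ver, secs in rows if ver == "HEAD"}
    return {
        (cpu, nd, ver): round(head[(cpu, nd)] / secs, 2)
        for cpu, nd, ver, secs in rows
    }


def winners(ratios):
    """Pick the non-HEAD version with the highest ratio per (cpu, var1ndigits)."""
    best = {}
    for (cpu, nd, ver), ratio in ratios.items():
        if ver == "HEAD":
            continue
        if (cpu, nd) not in best or ratio > best[(cpu, nd)][1]:
            best[(cpu, nd)] = (ver, ratio)
    return {key: ver for key, (ver, _) in best.items()}


if __name__ == "__main__":
    ratios = performance_ratios(TIMINGS)
    for key in sorted(ratios):
        print(key, f"{ratios[key]:.2f}")
    print(winners(ratios))
```

On a tie the sketch keeps the first version it sees; a real query would need an explicit tie-break rule or would have to compare unrounded ratios.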

/Joel

Attachment             Content-Type              Size
bench_csv_queries.txt  text/plain                1016 bytes
bench.csv              text/csv                  11.9 KB
bench_mul_var.sql      application/octet-stream  9.8 KB
