From: | "Joel Jacobson" <joel(at)compiler(dot)org> |
---|---|
To: | "Dean Rasheed" <dean(dot)a(dot)rasheed(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Optimising numeric division |
Date: | 2024-08-23 19:21:23 |
Message-ID: | a781c4c1-2b65-49d7-b751-80abf09cf5e6@app.fastmail.com |
Lists: | pgsql-hackers |
On Fri, Aug 23, 2024, at 15:49, Dean Rasheed wrote:
> Currently numeric.c has 2 separate functions that implement numeric
> division, div_var() and div_var_fast(). Since div_var_fast() is only
> approximate, it is only used in the transcendental functions, where a
> slightly inaccurate result is OK. In all other cases, div_var() is
> used.
...
> The attached patch attempts to resolve those issues by replacing
> div_var() and div_var_fast() with a single function intended to be
> faster than both the originals.
...
> In addition, like mul_var(), div_var() now does the computation in
> base NBASE^2, using 64-bit integers for the working dividend array,
> which halves the number of iterations in the outer loop and reduces
> the frequency of carry-propagation passes.
...
> In the testing I've done so far, this is up to around 20 times faster
> than the old version of div_var() when "exact" is true (it computes
> each of the queries above in around 1.6 seconds), and it's up to
> around 2-3 times faster than div_var_fast() when "exact" is false.
>
> In addition, I found that it's significantly faster than
> div_var_int64() for 3 and 4 digit divisors, so I've removed that as
> well.
...
> Overall, this reduces numeric.c by around 300 lines, and means that
> we'll only have one function to maintain.
Impressive simplifications and optimizations.
Very happy to see reuse of the NBASE^2 trick.
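For anyone following along who hasn't seen the trick before, here is a minimal sketch of the idea, not code from the patch. It assumes NBASE = 10000 as in numeric.c; the helper names pack_nbase_sqr() and div_small() are hypothetical and exist only for illustration. Two base-NBASE digits are packed into one base-NBASE^2 digit held in a 64-bit integer, so a loop over the digits runs about half as many times:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NBASE      10000            /* 4 decimal digits per digit, as in numeric.c */
#define NBASE_SQR  (NBASE * NBASE)  /* 10^8: one packed digit holds two NBASE digits */

/*
 * Hypothetical helper: pack ndigits base-NBASE digits (most significant
 * first) into base-NBASE_SQR digits.  "out" must have room for
 * (ndigits + 1) / 2 entries.  Returns the number of packed digits.
 */
static size_t
pack_nbase_sqr(const int16_t *digits, size_t ndigits, int64_t *out)
{
    size_t i = 0;
    size_t n = 0;

    /* With an odd count, the leading digit stands alone. */
    if (ndigits % 2 == 1)
        out[n++] = digits[i++];

    for (; i < ndigits; i += 2)
        out[n++] = (int64_t) digits[i] * NBASE + digits[i + 1];

    return n;
}

/*
 * Hypothetical helper: divide a packed number in place by a small positive
 * divisor (assumed small enough that rem * NBASE_SQR fits in 64 bits).
 * Returns the remainder.  One iteration handles two NBASE digits.
 */
static int64_t
div_small(int64_t *num, size_t n, int64_t divisor)
{
    int64_t rem = 0;

    assert(divisor > 0);

    for (size_t i = 0; i < n; i++)
    {
        int64_t cur = rem * NBASE_SQR + num[i];

        num[i] = cur / divisor;
        rem = cur % divisor;
    }
    return rem;
}
```

The real div_var() in the patch is of course more involved (multi-digit divisors, quotient estimation, deferred carry propagation); this only shows why the wider base shortens the outer loop.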
> Patch 0001 is the patch itself. 0002 is just for testing purposes --
> it exposes both old functions and the new one as SQL-callable
> functions with all their arguments, so they can be compared.
>
> I'm also attaching the performance test script I used, and the output
> it produced.
I've had an initial look at the code and it looks straightforward,
since most of the complicated parts of the changed code are just a
change from NBASE to NBASE_SQR.
I think the comments are of high quality.
I've run perf_test.sql on my three machines, without any errors.
Output files attached.
Regards,
Joel
Attachment | Content-Type | Size |
---|---|---|
perf_test-M3 Max.out | application/octet-stream | 53.0 KB |
perf_test-Intel Core i9-14900K.out | application/octet-stream | 48.1 KB |
perf_test-AMD Ryzen 9 7950X3D.out | application/octet-stream | 48.1 KB |