Re: Optimize numeric multiplication for one and two base-NBASE digit multiplicands.

From: Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>
To: Joel Jacobson <joel(at)compiler(dot)org>
Cc: Dagfinn Ilmari Mannsåker <ilmari(at)ilmari(dot)org>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Optimize numeric multiplication for one and two base-NBASE digit multiplicands.
Date: 2024-07-06 09:34:18
Message-ID: CAEZATCUmrq4pgJdQNQendiPkmyM=qdnZv1g-vtoMyrOyCi6xdA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Fri, 5 Jul 2024 at 18:37, Joel Jacobson <joel(at)compiler(dot)org> wrote:
>
> On Fri, Jul 5, 2024, at 18:42, Joel Jacobson wrote:
> > Very nice, v7-optimize-numeric-mul_var-small-var1-arbitrary-var2.patch
> > is now the winner on all my CPUs:
>
> I thought it would be interesting to also measure the isolated effect
> on just numeric_mul() without the query overhead.
>
> Impressive speed-up, between 25% - 81%.
>

Cool. I think we should go with the mul_var_small() patch then, since
it's more generally applicable.

I also did some testing with much larger var2 values, and saw similar
speed-ups. One high-level function that benefits from that is
factorial(), which accepts inputs up to 32177, and so uses both the
1-digit and 2-digit code with very large var2 values. I doubt anyone
actually uses it with such large inputs, but it's interesting
nonetheless:

SELECT factorial(32177);
Time: 923.117 ms -- HEAD
Time: 534.375 ms -- mul_var_small() patch

I did one more round of (mostly cosmetic) copy-editing. Aside from
improving some of the comments, it occurred to me that there's no need
to pass rscale to mul_var_small(), or for it to call round_var(),
since it's always computing the exact result. That shaves off a few
more cycles.

Additionally, I didn't like how res_weight and res_ndigits were being
set 1 higher than they needed to be. That makes sense in mul_var()
because it may round the result, causing a non-zero carry to propagate
into the next digit up, but it's just confusing in mul_var_small(). So
I've reduced those by 1, which makes the look much more logical. To be
clear, this doesn't change how many digits we're calculating. But now
res_ndigits is actually the number of digits being calculated, whereas
before, res_ndigits was 1 larger and we were calculating res_ndigits -
1 digits, which was confusing.

I think this is good to go, so unless there are any further comments,
I plan to commit it soon.

Possible future work would be to try extending it to larger var1
values. I have a feeling that might work quite well for 5 or 6 digits,
but at some point, we'll start seeing diminishing returns, and the
code bloat won't be worth it.

Regards,
Dean

Attachment Content-Type Size
v8-optimize-numeric-mul_var-small-var1-arbitrary-var2.patch text/x-patch 7.9 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Pantelis Theodosiou 2024-07-06 09:51:20 Re: Should we document how column DEFAULT expressions work?
Previous Message Richard Guo 2024-07-06 09:32:41 Re: Should consider materializing the cheapest inner path in consider_parallel_nestloop()