Re: Optimize numeric multiplication for one and two base-NBASE digit multiplicands.

From: Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>
To: Joel Jacobson <joel(at)compiler(dot)org>
Cc: Dagfinn Ilmari Mannsåker <ilmari(at)ilmari(dot)org>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Optimize numeric multiplication for one and two base-NBASE digit multiplicands.
Date: 2024-07-04 18:43:10
Message-ID: CAEZATCXVB15_RfzROFiO_7Fh3voOuOe7FotbvON-6a+ManuGJg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, 3 Jul 2024 at 21:45, Joel Jacobson <joel(at)compiler(dot)org> wrote:
>
> > On Wed, Jul 3, 2024, at 20:57, Dean Rasheed wrote:
> >> I wouldn't expect it to ever be off by more than 1
> >
> > OK, so then the cases I found where it was off by 2 for the mul_var_int() patch
> > are unexpected?
>
> Sorry, I meant off by 2 for the mul_var_small() patch, these cases that I found:
>

Yeah, so that was another bug in mul_var_small(). If rscale is made
small enough, the result index for the digits computed before the main
loop overlaps the ones after, so it would overwrite digits already
computed.

Of course, that's fairly easy to fix, but at this point I think the
better solution is to only use mul_var_small() when an exact product
is requested. We would have to do that for mul_var_int() anyway,
because of its accuracy issues discussed earlier. I think this is a
reasonable thing to do because only functions like ln_var() and
exp_var() will ask mul_var() for a reduced-rscale result, and those
functions are likely to be dominated by computations involving larger
numbers, for which this patch wouldn't help anyway. Also those
functions are probably less widely used.

If we make that decision, a lot of the complexity in mul_var_small()
goes away, including all the conditional array accesses, making it
simpler and more efficient. v6 patch attached.

I also updated the mul_var_int() patch so that it is also only invoked
when an exact product is requested, and I noticed a couple of other
minor optimisations that could be made. Then I decided to try
implementing mul_var_int64(). This gives a pretty decent speedup for
3-digit inputs, but unfortunately it is much slower for 4-digit inputs
(for which most values will go through the 128-bit code path). I'm
attaching that too, just for information, but it's clearly not going
to be acceptable as-is.

Running your benchmark queries, I got these results:

SELECT SUM(var1*var2) FROM bench_mul_var_var1ndigits_1;
Time: 4520.874 ms (00:04.521) -- HEAD
Time: 3937.536 ms (00:03.938) -- v5-mul_var_int.patch
Time: 3919.495 ms (00:03.919) -- v5-mul_var_small.patch
Time: 3916.964 ms (00:03.917) -- v6-mul_var_int64.patch
Time: 3811.118 ms (00:03.811) -- v6-mul_var_small.patch

SELECT SUM(var1*var2) FROM bench_mul_var_var1ndigits_2;
Time: 4762.528 ms (00:04.763) -- HEAD
Time: 4075.546 ms (00:04.076) -- v5-mul_var_int.patch
Time: 4055.180 ms (00:04.055) -- v5-mul_var_small.patch
Time: 4037.866 ms (00:04.038) -- v6-mul_var_int64.patch
Time: 4018.488 ms (00:04.018) -- v6-mul_var_small.patch

SELECT SUM(var1*var2) FROM bench_mul_var_var1ndigits_3;
Time: 5387.514 ms (00:05.388) -- HEAD
Time: 5350.736 ms (00:05.351) -- v5-mul_var_int.patch
Time: 4648.449 ms (00:04.648) -- v5-mul_var_small.patch
Time: 4655.204 ms (00:04.655) -- v6-mul_var_int64.patch
Time: 4645.962 ms (00:04.646) -- v6-mul_var_small.patch

SELECT SUM(var1*var2) FROM bench_mul_var_var1ndigits_4;
Time: 5617.150 ms (00:05.617) -- HEAD
Time: 5505.913 ms (00:05.506) -- v5-mul_var_int.patch
Time: 5486.441 ms (00:05.486) -- v5-mul_var_small.patch
Time: 8203.081 ms (00:08.203) -- v6-mul_var_int64.patch
Time: 5598.909 ms (00:05.599) -- v6-mul_var_small.patch

So v6-mul_var_int64 improves on v5-mul_var_int in the 3-digit case,
but is terrible in the 4-digit case. None of the other patches touch
the 4-digit case, but it might be interesting to try mul_var_small()
with 4 digits.

Regards,
Dean

Attachment Content-Type Size
v6-optimize-numeric-mul_var-small-var1-arbitrary-var2.patch text/x-patch 6.3 KB
v6-add-mul_var_int64.patch text/x-patch 9.0 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2024-07-04 18:44:39 Re: PostgreSQL does not compile on macOS SDK 15.0
Previous Message Tom Lane 2024-07-04 18:37:50 Re: PostgreSQL does not compile on macOS SDK 15.0