Quick Links

Re: Optimize numeric multiplication for one and two base-NBASE digit multiplicands.

From:	"Joel Jacobson" <joel(at)compiler(dot)org>
To:	"Dean Rasheed" <dean(dot)a(dot)rasheed(at)gmail(dot)com>
Cc:	Dagfinn Ilmari Mannsåker <ilmari(at)ilmari(dot)org>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: Optimize numeric multiplication for one and two base-NBASE digit multiplicands.
Date:	2024-07-03 13:48:58
Message-ID:	018dd6e8-8f21-4b3b-8d80-2d128f6d00b4@app.fastmail.com
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-hackers

On Wed, Jul 3, 2024, at 13:17, Dean Rasheed wrote:
> On Tue, 2 Jul 2024 at 21:10, Joel Jacobson <joel(at)compiler(dot)org> wrote:
>>
>> I found the bug in the case 3 code,
>> and it turns out the same type of bug also exists in the case 2 code:
>>
>> case 2:
>> newdig = (int) var1digits[1] * var2digits[res_ndigits - 4];
>>
>> The problem here is that res_ndigits could become less than 4,
>
> Yes. It can't be less than 3 though (per an earlier test), so the case
> 2 code was correct.

Hmm, I don't see how the case 2 code can be correct?
If, like you say, res_ndigits can't be less than 3, that means it can be 3, right?
And if res_ndigits=3 then `var2digits[res_ndigits - 4]` would try to access `var2digits[-1]`.

> I've been hacking on this a bit and trying to tidy it up. Firstly, I
> moved it to a separate function, because it was starting to look messy
> having so much extra code in mul_var(). Then I added a bunch more
> comments to explain what's going on, and the limits of the various
> variables. Note that most of the boundary checks are actually
> unnecessary -- in particular all the ones in or after the main loop,
> provided you pull out the first 2 result digits from the main loop in
> the 3-digit case. That does seem to work very well, but...

Nice, I was starting to feel a bit uncomfortable with the level of increased complexity.

> I wasn't entirely happy with how messy that code is getting, so I
> tried a different approach. Similar to div_var_int(), I tried writing
> a mul_var_int() function instead. This can be used for 1 and 2 digit
> factors, and we could add a similar mul_var_int64() function on
> platforms with 128-bit integers. The code looks quite a lot neater, so
> it's probably less likely to contain bugs (though I have just written
> it in a hurry,so it might still have bugs). In testing, it seemed to
> give a decent speedup, but perhaps a little less than before. But
> that's to be balanced against having more maintainable code, and also
> a function that might be useful elsewhere in numeric.c.
>
> Anyway, here are both patches for comparison. I'll stop hacking for a
> while and let you see what you make of these.

I've tested both patches, and they produces the same output given the
same input as HEAD, when rscale is unmodified (full precision).

However, for a reduced rscale, there are some differences:

mul_var_small() seems more resilient to rscale reductions than mul_var_int().

The previous version we worked on, I've called "mul_var inlined" in the output below.

```
CREATE TABLE test_numeric_mul_patched (
var1 numeric,
var2 numeric,
rscale_adjustment int,
result numeric
);

DO $$
DECLARE
var1 numeric;
var2 numeric;
BEGIN
FOR i IN 1..1000 LOOP
RAISE NOTICE '%', i;
FOR var1ndigits IN 1..4 LOOP
FOR var2ndigits IN 1..4 LOOP
FOR var1dscale IN 0..(var1ndigits*4) LOOP
FOR var2dscale IN 0..(var2ndigits*4) LOOP
FOR rscale_adjustment IN 0..(var1dscale+var2dscale) LOOP
var1 := round(random(
format('1%s',repeat('0',(var1ndigits-1)*4-1))::numeric,
format('%s',repeat('9',var1ndigits*4))::numeric
) / 10::numeric^var1dscale, var1dscale);
var2 := round(random(
format('1%s',repeat('0',(var2ndigits-1)*4-1))::numeric,
format('%s',repeat('9',var2ndigits*4))::numeric
) / 10::numeric^var2dscale, var2dscale);
INSERT INTO test_numeric_mul_patched
(var1, var2, rscale_adjustment)
VALUES
(var1, var2, -rscale_adjustment);
END LOOP;
END LOOP;
END LOOP;
END LOOP;
END LOOP;
END LOOP;
END $$;

UPDATE test_numeric_mul_patched SET result = numeric_mul_head(var1, var2, rscale_adjustment);

SELECT
rscale_adjustment,
COUNT(*),
COUNT(*) FILTER (WHERE result IS DISTINCT FROM numeric_mul_patch_int(var1,var2,rscale_adjustment)) AS "mul_var_int",
COUNT(*) FILTER (WHERE result IS DISTINCT FROM numeric_mul_patch_small(var1,var2,rscale_adjustment)) AS "mul_var_small",
COUNT(*) FILTER (WHERE result IS DISTINCT FROM numeric_mul_patch_inline(var1,var2,rscale_adjustment)) AS "mul_var inlined"
FROM test_numeric_mul_patched
GROUP BY 1
ORDER BY 1;

rscale_adjustment | count | mul_var_int | mul_var_small | mul_var inlined
-------------------+---------+-------------+---------------+-----------------
-32 | 1000 | 0 | 0 | 0
-31 | 3000 | 0 | 0 | 0
-30 | 6000 | 0 | 0 | 0
-29 | 10000 | 0 | 0 | 0
-28 | 17000 | 0 | 0 | 0
-27 | 27000 | 0 | 0 | 0
-26 | 40000 | 0 | 1 | 0
-25 | 56000 | 1 | 11 | 0
-24 | 78000 | 316 | 119 | 1
-23 | 106000 | 498 | 1696 | 0
-22 | 140000 | 531 | 2480 | 1
-21 | 180000 | 591 | 3145 | 0
-20 | 230000 | 1956 | 5309 | 1
-19 | 290000 | 2189 | 5032 | 0
-18 | 360000 | 2314 | 4868 | 0
-17 | 440000 | 2503 | 4544 | 1
-16 | 533000 | 5201 | 3633 | 0
-15 | 631000 | 5621 | 3006 | 0
-14 | 734000 | 5907 | 2631 | 0
-13 | 842000 | 6268 | 2204 | 0
-12 | 957000 | 9558 | 778 | 0
-11 | 1071000 | 10597 | 489 | 0
-10 | 1184000 | 10765 | 193 | 0
-9 | 1296000 | 9452 | 0 | 0
-8 | 1408000 | 1142 | 0 | 0
-7 | 1512000 | 391 | 0 | 0
-6 | 1608000 | 235 | 0 | 0
-5 | 1696000 | 0 | 0 | 0
-4 | 1776000 | 0 | 0 | 0
-3 | 1840000 | 0 | 0 | 0
-2 | 1888000 | 0 | 0 | 0
-1 | 1920000 | 0 | 0 | 0
0 | 1936000 | 0 | 0 | 0
(33 rows)

SELECT
result - numeric_mul_patch_int(var1,var2,rscale_adjustment),
COUNT(*)
FROM test_numeric_mul_patched
GROUP BY 1
ORDER BY 1;

?column? | count
----------------+----------
0 | 24739964
0.000000000001 | 2170
0.00000000001 | 234
0.0000000001 | 18
0.000000001 | 4
0.00000001 | 8927
0.0000001 | 882
0.000001 | 90
0.00001 | 6
0.0001 | 21963
0.001 | 2174
0.01 | 214
0.1 | 18
1 | 39336
(14 rows)

SELECT
result - numeric_mul_patch_small(var1,var2,rscale_adjustment),
COUNT(*)
FROM test_numeric_mul_patched
GROUP BY 1
ORDER BY 1;

?column? | count
-------------------+----------
-1 | 1233
-0.01 | 9
-0.001 | 73
-0.0001 | 647
-0.000001 | 2
-0.0000001 | 9
-0.00000001 | 116
0.000000000000000 | 24775861
0.00000001 | 1035
0.00000002 | 2
0.0000001 | 96
0.000001 | 9
0.0001 | 8771
0.0002 | 3
0.001 | 952
0.01 | 69
0.1 | 10
1 | 27098
2 | 5
(19 rows)

SELECT
result - numeric_mul_patch_inline(var1,var2,rscale_adjustment),
COUNT(*)
FROM test_numeric_mul_patched
GROUP BY 1
ORDER BY 1;

?column? | count
----------+----------
-1 | 4
0 | 24815996
(2 rows)
```

I found these two interesting to look closer at:
```
0.00000002 | 2
0.0002 | 3

SELECT
*,
numeric_mul_patch_small(var1,var2,rscale_adjustment)
FROM test_numeric_mul_patched
WHERE result - numeric_mul_patch_small(var1,var2,rscale_adjustment) IN (0.00000002, 0.0002);

var1 | var2 | rscale_adjustment | result | numeric_mul_patch_small
-------------------+----------------+-------------------+------------+-------------------------
8952.12658563 | 0.902315486665 | -16 | 8077.6425 | 8077.6423
0.881715409579 | 0.843165739371 | -16 | 0.74343223 | 0.74343221
0.905322758954 | 0.756905996850 | -16 | 0.68524423 | 0.68524421
8464.043170546608 | 0.518100129611 | -20 | 4385.2219 | 4385.2217
5253.006296984449 | 0.989308019355 | -20 | 5196.8413 | 5196.8411
(5 rows)
```

What can be said about mul_var()'s contract with regards to rscale?
It's the number of decimal digits requested by the caller, and if not
requesting full precision, then the decimal digits might not be accurate,
but can something be said about how far off they can be?

The mul_var_int() patch only produces a difference that is exactly
1 less than the exact result, at the last non-zero decimal digit.

Could the difference be more than 1 at the last non-zero digit,
like in the five cases found above?

It would be nice if we could define mul_var()'s contract with regards to
rscale, in terms of what precision can be expected in the result.

Attaching the hacked together version with all the patches, used to do the testing above.

Regards,
Joel

Attachment	Content-Type	Size
test-mul-var-versions.patch.txt	text/plain	53.3 KB

In response to

Re: Optimize numeric multiplication for one and two base-NBASE digit multiplicands. at 2024-07-03 11:17:38 from Dean Rasheed

Responses

Re: Optimize numeric multiplication for one and two base-NBASE digit multiplicands. at 2024-07-03 18:57:48 from Dean Rasheed
Re: Optimize numeric multiplication for one and two base-NBASE digit multiplicands. at 2024-07-03 19:05:51 from Joel Jacobson

Browse pgsql-hackers by date

	From	Date	Subject
Next Message	Fujii Masao	2024-07-03 14:08:48	Add a GUC check hook to ensure summarize_wal cannot be enabled when wal_level is minimal
Previous Message	Daniel Gustafsson	2024-07-03 13:46:56	Re: Useless parameter 'cur_skey' in IndexScanOK