Re: Bug in numeric multiplication

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Bug in numeric multiplication
Date: 2015-11-17 23:57:43
Message-ID: 17848.1447804663@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

I wrote:
> I pushed this, but while looking at it, my attention was drawn to this
> bit down near the end of the loop:

> /*
> * The dividend digit we are about to replace might still be nonzero.
> * Fold it into the next digit position. We don't need to worry about
> * overflow here since this should nearly cancel with the subtraction
> * of the divisor.
> */
> div[qi + 1] += div[qi] * NBASE;

> In the first place, I'm not feeling especially convinced by that handwave
> about there being no risk of overflow. In the second place, this is
> indisputably failing to consider whether maxdiv might need to increase.
> If I add
> Assert(Abs(div[qi + 1]) <= (maxdiv * (NBASE-1)));
> right after this, the regression tests blow up. So it looks to me like
> we've got some more work to do.

After thinking some more about what this is doing, it seems to me that
we could avoid changing div[qi + 1] at all here, and instead deal with
leftover dividend digits by shoving them into the floating-point part of
the calculation. All that we really need to have happen as a consequence
of the above is to inject an additional "div[qi] * NBASE" term into future
calculations of fdividend. So we can do that out-of-band and avoid
mucking up the maxdiv invariant.

Also, I realized that this bit:

/*
* All the div[] digits except possibly div[qi] are now in the
* range 0..NBASE-1.
*/
maxdiv = Abs(newdig) / (NBASE - 1);
maxdiv = Max(maxdiv, 1);

is unduly conservative, because, while indeed div[qi] might be out of
range, there is no need to consider it in maxdiv anymore, because the new
maxdiv value only determines what will happen in future loop iterations
where the current div[qi] will be irrelevant. So we should just reset
maxdiv to 1 unconditionally here.

So that led me to the attached patch, which passes regression tests fine.
I'm not sure that it's provably okay though. The loose ends are to show
that div[qi] can't overflow an int during the divisor-subtraction step
and that "outercarry" remains bounded. Testing suggests that outercarry
can't exceed INT_MAX/NBASE, but I don't see how to prove that.

regards, tom lane

Attachment Content-Type Size
div_var_fast-fixes.patch text/x-diff 4.7 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Michael Paquier 2015-11-18 00:18:54 Re: [COMMITTERS] pgsql: Cause TestLib.pm to define $windows_os in all branches.
Previous Message Kouhei Kaigai 2015-11-17 23:51:01 Re: Foreign join pushdown vs EvalPlanQual