Re: Remove dependence on integer wrapping

From: Joseph Koshakow <koshy44(at)gmail(dot)com>
To: Nathan Bossart <nathandbossart(at)gmail(dot)com>
Cc: jian he <jian(dot)universality(at)gmail(dot)com>, Alexander Lakhin <exclusion(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Andres Freund <andres(at)anarazel(dot)de>
Subject: Re: Remove dependence on integer wrapping
Date: 2024-07-19 23:32:18
Message-ID: CAAvxfHfdPgtqFqsb=Nxa1oc6GsQ_U+k3rfneo-EjYG4bne_ksw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Fri, Jul 19, 2024 at 2:45 PM Nathan Bossart <nathandbossart(at)gmail(dot)com>
wrote:
>
> + /* dim[i] = 1 + upperIndx[i] - lowerIndx[i]; */
> + if (pg_add_s32_overflow(1, upperIndx[i], &dim[i]))
> + ereport(ERROR,
> + (errcode(ERRCODE_PROGRAM_LIMIT_EXCEEDED),
> + errmsg("array upper bound is too large: %d",
> + upperIndx[i])));
> + if (pg_sub_s32_overflow(dim[i], lowerIndx[i], &dim[i]))
> + ereport(ERROR,
> + (errcode(ERRCODE_PROGRAM_LIMIT_EXCEEDED),
> + errmsg("array size exceeds the maximum allowed
(%d)",
> + (int) MaxArraySize)));
>
> I think the problem with fixing it this way is that it prohibits more than
> is necessary.

My understanding is that 2147483647 (INT32_MAX) is not a valid upper
bound, which is what the first overflow check is checking. Any query of
the form
`INSERT INTO arroverflowtest(i[<lb>:2147483647]) VALUES ('{...}');`
will fail with an error of
`ERROR: array lower bound is too large: <lb>`

The reason is the following bounds check found in arrayutils.c

/*
* Verify sanity of proposed lower-bound values for an array
*
* The lower-bound values must not be so large as to cause overflow when
* calculating subscripts, e.g. lower bound 2147483640 with length 10
* must be disallowed. We actually insist that dims[i] + lb[i] be
* computable without overflow, meaning that an array with last
subscript
* equal to INT_MAX will be disallowed.
*
* It is assumed that the caller already called ArrayGetNItems, so that
* overflowed (negative) dims[] values have been eliminated.
*/
void
ArrayCheckBounds(int ndim, const int *dims, const int *lb)
{
(void) ArrayCheckBoundsSafe(ndim, dims, lb, NULL);
}

/*
* This entry point can return the error into an ErrorSaveContext
* instead of throwing an exception.
*/
bool
ArrayCheckBoundsSafe(int ndim, const int *dims, const int *lb,
struct Node *escontext)
{
int i;

for (i = 0; i < ndim; i++)
{
/* PG_USED_FOR_ASSERTS_ONLY prevents variable-isn't-read warnings */
int32 sum PG_USED_FOR_ASSERTS_ONLY;

if (pg_add_s32_overflow(dims[i], lb[i], &sum))
ereturn(escontext, false,
(errcode(ERRCODE_PROGRAM_LIMIT_EXCEEDED),
errmsg("array lower bound is too large: %d",
lb[i])));
}

return true;
}

Specifically "We actually insist that dims[i] + lb[i] be computable
without overflow, meaning that an array with last subscript equal to
INT32_MAX will be disallowed." If the upper bound is INT32_MAX,
then there's no lower bound where (lower_bound + size) won't overflow.

It might be possible to remove this restriction, but it's probably
easier to keep it.

> An easy way to deal with this problem is to first perform the calculation
> with everything cast to an int64. Before setting dim[i], you'd check that
> the result is in [PG_INT32_MIN, PG_INT32_MAX] and fail if needed.
>
> int64 newdim;
>
> ...
>
> newdim = (int64) 1 + (int64) upperIndx[i] - (int64) lowerIndx[i];
> if (unlikely(newdim < PG_INT32_MIN || newdim > PG_INT32_MAX))
> ereport(ERROR,
> ...
> dim[i] = (int32) newdim;

I've rebased my patches and updated 0002 with this approach if this is
still the approach you want to go with. I went with the array size too
large error for similar reasons as the previous version of the patch.

Since the patches have been renumbered, here's an overview of their
status:

- 0001 is reviewed and waiting for v18.
- 0002 is under review and a bug fix.
- 0003 needs review.

Thanks,
Joseph Koshakow

Attachment Content-Type Size
v13-0002-Remove-overflow-from-array_set_slice.patch text/x-patch 3.7 KB
v13-0001-Remove-dependence-on-integer-wrapping.patch text/x-patch 13.5 KB
v13-0003-Remove-dependence-on-integer-wrapping-for-jsonb.patch text/x-patch 2.4 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Paul George 2024-07-19 23:32:22 Re: behavior of GROUP BY with VOLATILE expressions
Previous Message Wen Yi 2024-07-19 23:26:46 Re:How can udf c function return table, not the rows?