Re: Remove dependence on integer wrapping

From: Joseph Koshakow <koshy44(at)gmail(dot)com>
To: Nathan Bossart <nathandbossart(at)gmail(dot)com>
Cc: Alexander Lakhin <exclusion(at)gmail(dot)com>, jian he <jian(dot)universality(at)gmail(dot)com>, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, Matthew Kim <matthewkmkim(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Andres Freund <andres(at)anarazel(dot)de>
Subject: Re: Remove dependence on integer wrapping
Date: 2024-08-17 21:52:48
Message-ID: CAAvxfHeWeETpiwxn11NvWkWd4o0curubg4cTXuk6yOGtixzAdg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

>>> SET temp_buffers TO 1000000000;
>>>
>>> CREATE TEMP TABLE t(i int PRIMARY KEY);
>>> INSERT INTO t VALUES(1);
>>>
>>> #4 0x00007f385cdd37f3 in __GI_abort () at ./stdlib/abort.c:79
>>> #5 0x00005620071c4f51 in __addvsi3 ()
>>> #6 0x0000562007143f3c in init_htab (hashp=0x562008facb20,
nelem=610070812) at dynahash.c:720
>>>
>>> (gdb) f 6
>>> #6 0x0000560915207f3c in init_htab (hashp=0x560916039930,
nelem=1000000000) at dynahash.c:720
>>> 720 hctl->high_mask = (nbuckets << 1) - 1;
>>> (gdb) p nbuckets
>>> $1 = 1073741824
>>
>> Here's what it looks like is happening:
>>
>> 1. When inserting into the table, we create a new dynamic hash table
>> and set `nelem` equal to `temp_buffers`, which is 1000000000.
>>
>> 2. `nbuckets` is then set to the the next highest power of 2 from
>> `nelem`, which is 1073741824.
>>
>> /*
>> * Allocate space for the next greater power of two number of
buckets,
>> * assuming a desired maximum load factor of 1.
>> */
>> nbuckets = next_pow2_int(nelem);
>>
>> 3. Shift `nbuckets` to the left by 1. This would equal 2147483648,
>> which is larger than `INT_MAX`, which causes an overflow.
>>
>> hctl->high_mask = (nbuckets << 1) - 1;
>>
>> The max value allowed for `temp_buffers` is `INT_MAX / 2` (1073741823),
>> So any value of `temp_buffers` in the range (536870912, 1073741823]
>> would cause this overflow. Without `-ftrapv`, `nbuckets` would wrap
>> around to -2147483648, which is likely to cause all sorts of havoc, I'm
>> just not sure what exactly.
>>
>> Also, `nbuckets = next_pow2_int(nelem);`, by itself is a bit sketchy
>> considering that `nelem` is a `long` and `nbuckets` is an `int`.
>> Potentially, the fix here is to just convert `nbuckets` to a `long`. >>
I plan on checking if that's feasible.
> Yeah, the minimum value that triggers the trap is 536870913 and the
maximum
> accepted is 1073741823.
>
> Without -ftrapv, hctl->high_mask is set to 2147483647 on my machine,
> when nbuckets is 1073741824, and the INSERT apparently succeeds.

> I've taken a look at this and my current proposal is to convert
> `nbuckets` to 64 bit integer which would prevent the overflow. I'm
> hoping to look into if this is feasible soon.

I've both figured out why the INSERT still succeeds and a simple
solution to this. After `nbuckets` wraps around to -2147483648, we
subtract 1 which causes it to wrap back around to 2147483647. Which
explains the result seen by Alexander.

By the way,

>> Also, `nbuckets = next_pow2_int(nelem);`, by itself is a bit sketchy
>> considering that `nelem` is a `long` and `nbuckets` is an `int`.

It turns out I was wrong about this, `next_pow2_int` will always return
a value that fits into an `int`.

hctl->high_mask = (nbuckets << 1) - 1;

This calculation is used to ultimately populate the field
`uint32 high_mask`. I'm not very familiar with this hash table
implementation and I'm not entirely sure what it would take to convert
this to a `uint64`, but from poking around it looks like it would have
a huge blast radius.

The largest possible (theoretical) value for `nbuckets` is
`1073741824`, the largest power of 2 that fits into an `int`. So, the
largest possible value for `nbuckets << 1` is `2147483648`. This can
fully fit in a `uint32`, so the simple fix for this case is to cast
`nbuckets` to a `uint32` before shifting. I've attached this fix,
Alexander if you have time I would appreciate if you were able to test
it.

I noticed another potential issue with next_pow2_int. The
implementation is in dynahash.c and is as follows

/* calculate first power of 2 >= num, bounded to what will fit in an
int */
static int
next_pow2_int(long num)
{
if (num > INT_MAX / 2)
num = INT_MAX / 2;
return 1 << my_log2(num);
}

I'm pretty sure that `INT_MAX / 2` is not a power of 2, as `INT_MAX`
is not a power of 2. It should be `num = INT_MAX / 2 + 1;` I've also
attached a patch with this fix.

Thanks,
Joseph Koshakow

Attachment Content-Type Size
v25-0003-Fix-next_pow2_int.patch text/x-patch 813 bytes
v25-0002-Remove-dependence-on-fwrapv-semantics-in-dynahas.patch text/x-patch 956 bytes
v25-0001-Remove-dependence-on-fwrapv-semantics-in-some-da.patch text/x-patch 6.7 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Thomas Munro 2024-08-17 23:12:22 Re: Restart pg_usleep when interrupted
Previous Message Wolfgang Walther 2024-08-17 21:43:37 Re: Building with meson on NixOS/nixpkgs