Re: Fix for edge case in date_bin() function

From: Moaaz Assali <ma5679(at)nyu(dot)edu>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Fix for edge case in date_bin() function
Date: 2024-02-28 16:11:00
Message-ID: CALkF+nsG9E39AQ-CJti9=GnD-S8NjGzQ=twKbVxuS+sygdU-2Q@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Yeah you are right about the integer overflow.

Consider this query: select date_bin('15 minutes'::interval, timestamp
'294276-12-30 10:24:00', timestamp '4000-12-20 23:00:00 BC');

It will return 294276-12-30 10:31:49.551616 when it should be 294276-12-30
10:15:00, which happens because source timestamp is close to INT64_MAX and
origin timestamp is negative, causing an integer overflow.
So, the subsequent calculations are wrong.

I fixed the integer overflow and original bug and added test cases in my 3
patches in this reply below:

v2-0001:
- Fixed both the original bug discussed and the integer overflow issues
(and used your suggestion to store the modulo result).
- Any timestamp used will output the correct date_bin() result.
- I have used INT64 -> UINT64 mapping in order to ensure no integer
overflows are possible.
- The only additional cost is 3 subtractions/addition to do the INT64 ->
UINT64 mapping for timestamp, origin and final result.
- It is probably possible to tackle the integer overflow issue without
casting but, from my attempts, it seemed much more convoluted and complex.
- Implemented the fix for both timestamp_bin() and timestamptz_bin().

v2-0002:
- Added multiple test cases in timestamp.sql for integer overflow by
testing with timestamps around INT64_MIN and INT64_MAX.
- Added test case for the original bug where source timestamp is a valid
binned date already and does not require the additional stride interval
subtraction.

v2-0003:
- Exactly the same as v2-0002 but for timestamptz.sql

Also, I would like to create a new patch on the 2024-03 commitfest, but
since I just created my account yesterday I get this error:
"The site you are trying to log in to (commitfest.postgresql.org) requires
a cool-off period between account creation and logging in. You have not
passed the cool off period yet."

How long is the cool off period, so that I can create a new patch in the
commitfest before submissions close after tomorrow.
Alternatively, is it possible for someone to open a new patch on my behalf
linking this email thread, so it can be added to the 2024-03 commitfest?

Best regards,
Moaaz Assali

On Tue, Feb 27, 2024 at 11:13 PM Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:

> I wrote:
> > Hmm, yeah. The "stride_usecs > 1" test seems like it's a partial
> > attempt to account for this that is probably redundant given the
> > additional condition. Also, can we avoid computing tm_diff %
> > stride_usecs twice? Maybe the compiler is smart enough to remove the
> > common subexpression, but it's a mighty expensive computation if not.
>
> I think we could do it like this:
>
> tm_diff = timestamp - origin;
> tm_modulo = tm_diff % stride_usecs;
> tm_delta = tm_diff - tm_modulo;
> /* We want to round towards -infinity when tm_diff is negative */
> if (tm_modulo < 0)
> tm_delta -= stride_usecs;
>
> Excluding tm_modulo == 0 from the correction is what fixes the
> problem.
>
> > I'm also itching a bit over whether there are integer-overflow
> > hazards here. Maybe the range of timestamp is constrained enough
> > that there aren't, but I didn't look hard.
>
> Hmm, it's not. For instance this triggers the overflow check in
> timestamp_mi:
>
> regression=# select '294000-01-01'::timestamp - '4714-11-24 00:00:00 BC';
> ERROR: interval out of range
> regression=# \errverbose
> ERROR: 22008: interval out of range
> LOCATION: timestamp_mi, timestamp.c:2832
>
> So we ought to guard the subtraction that produces tm_diff similarly.
> I suspect it's also possible to overflow int64 while computing
> stride_usecs.
>
> regards, tom lane
>

Attachment Content-Type Size
v2-0002-Regression-test-in-timestamp.sql-for-date_bin.patch application/octet-stream 4.6 KB
v2-0001-Fix-for-edge-cases-and-integer-overflow-in-date_bin.patch application/octet-stream 5.5 KB
v2-0003-Regression-test-in-timestamptz.sql-for-date_bin.patch application/octet-stream 4.8 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Vladimir Sitnikov 2024-02-28 16:51:00 Re: When extended query protocol ends?
Previous Message Alvaro Herrera 2024-02-28 16:07:13 pgsql: Improve performance of subsystems on top of SLRU