From: | Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com> |
---|---|
To: | Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com> |
Cc: | PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com> |
Subject: | Re: BRIN minmax multi - incorrect distance for infinite timestamp/date |
Date: | 2023-10-13 10:44:38 |
Message-ID: | 14c205c6-fbf4-d2f2-3ea8-a7d5c5d63180@enterprisedb.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On 10/13/23 11:21, Dean Rasheed wrote:
> On Thu, 12 Oct 2023 at 23:43, Tomas Vondra
> <tomas(dot)vondra(at)enterprisedb(dot)com> wrote:
>>
>> Ashutosh Bapat reported me off-list a possible issue in how BRIN
>> minmax-multi calculate distance for infinite timestamp/date values.
>>
>> The current code does this:
>>
>> if (TIMESTAMP_NOT_FINITE(dt1) || TIMESTAMP_NOT_FINITE(dt2))
>> PG_RETURN_FLOAT8(0);
>>
>
> Yes indeed, that looks wrong. I noticed the same thing while reviewing
> the infinite interval patch.
>
>> so means infinite values are "very close" to any other value, and thus
>> likely to be merged into a summary range. That's exactly the opposite of
>> what we want to do, possibly resulting in inefficient indexes.
>>
>
> Is this only inefficient? Or can it also lead to wrong query results?
>
I don't think it can produce incorrect results. It only affects which
values we "merge" into an interval when building the summaries.
>> Attached is a patch fixing this
>>
>
> I wonder if it's actually necessary to give infinity any special
> handling at all for dates and timestamps. For those types, "infinity"
> is actually just INT_MIN/MAX, which compares correctly with any finite
> value, and will be much larger/smaller than any common value, so it
> seems like it isn't necessary to give "infinite" values any special
> treatment. That would be consistent with date_cmp() and
> timestamp_cmp().
>
Right, but ....
> Something else that looks wrong about that BRIN code is that the
> integer subtraction might lead to overflow -- it is subtracting two
> integer values, then casting the result to float8. It should cast each
> input before subtracting, more like brin_minmax_multi_distance_int8().
>
... it also needs to fix this, otherwise it overflows. Consider
delta = dt2 - dt1;
and assume dt1 is INT64_MIN, or that dt2 is INT64_MAX.
> IOW, I think brin_minmax_multi_distance_date/timestamp could be made
> basically identical to brin_minmax_multi_distance_int4/8.
>
Right. Attached is a patch doing it this way.
regards
--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Attachment | Content-Type | Size |
---|---|---|
brin-infinity-fix-v2.patch | text/x-patch | 1.4 KB |
From | Date | Subject | |
---|---|---|---|
Next Message | Hayato Kuroda (Fujitsu) | 2023-10-13 11:43:02 | RE: pg_upgrade's interaction with pg_resetwal seems confusing |
Previous Message | Amit Kapila | 2023-10-13 10:44:30 | Re: pg_upgrade's interaction with pg_resetwal seems confusing |