From: | Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
Subject: | Re: Wrong results from in_range() tests with infinite offset |
Date: | 2020-07-18 08:16:39 |
Message-ID: | CAEZATCVQO1vUXo0PjHToxvyZ8piwH-u-mTMqMhrGku2dYMhWbw@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Fri, 17 Jul 2020 at 01:59, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>
> Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com> writes:
> > On Thu, 16 Jul 2020, 22:50 Tom Lane, <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> >> Actually, after staring at those results awhile longer, I decided
> >> they were wrong. The results shown here seem actually sane ---
> >> for instance, -Infinity shouldn't "infinitely precede" itself,
> >> I think. (Maybe if you got solipsistic enough you could argue
> >> that that is valid, but it seems pretty bogus.)
>
> > Hmm, that code looks a bit fishy to me, but I really need to think about it
> > some more. I'll take another look tomorrow, and maybe it'll become clearer.
>
> It's certainly verbose, so I'd like to find a more concise way to
> write the logic. But the v2 results seem right.
>
I'm finding it hard to come up with a principled argument to say
exactly what the right results should be.
As things stand (pre-patch), a window frame defined as "BETWEEN 'inf'
PRECEDING AND 'inf' PRECEDING", produces the following:
id | f_float4 | first_value | last_value
----+-----------+-------------+------------
0 | -Infinity | |
1 | -3 | |
2 | -1 | |
3 | 0 | |
4 | 1.1 | |
5 | 1.12 | |
6 | 2 | |
7 | 100 | |
8 | Infinity | |
9 | NaN | 9 | 9
(10 rows)
which is clearly wrong, because -Inf obviously infinitely precedes all
the other (non-NaN) values.
With the first version of the patch, that result became
id | f_float4 | first_value | last_value
----+-----------+-------------+------------
0 | -Infinity | 0 | 0
1 | -3 | 0 | 0
2 | -1 | 0 | 0
3 | 0 | 0 | 0
4 | 1.1 | 0 | 0
5 | 1.12 | 0 | 0
6 | 2 | 0 | 0
7 | 100 | 0 | 0
8 | Infinity | 0 | 0
9 | NaN | 9 | 9
(10 rows)
which is definitely better, but the one obvious problem is last_value
for id=8, because all the values in earlier rows infinitely precede
+Inf, so they should be included in the window frame for that row.
With the second version of the patch, the result is
id | f_float4 | first_value | last_value
----+-----------+-------------+------------
0 | -Infinity | |
1 | -3 | 0 | 0
2 | -1 | 0 | 0
3 | 0 | 0 | 0
4 | 1.1 | 0 | 0
5 | 1.12 | 0 | 0
6 | 2 | 0 | 0
7 | 100 | 0 | 0
8 | Infinity | 0 | 7
9 | NaN | 9 | 9
(10 rows)
That fixes last_value for id=8, using the fact that all values less
than +Inf infinitely precede it, and also assuming that +Inf does not
infinitely precede itself, which seems reasonable.
The other change is in the first row, because it now assumes that -Inf
doesn't infinitely precede itself, which seems reasonable from a
consistency point of view.
However, that is also a bit odd because it goes against the documented
contract of in_range(), which is supposed to do the tests
val <= base +/- offset1
val >= base +/- offset2
which for "BETWEEN 'inf' PRECEDING AND 'inf' PRECEDING" become
val = base - Inf
which is -Inf, even if base = -Inf. So I'd say that the window
infinitely preceding -Inf contains -Inf, since -Inf - Inf = -Inf.
But if -Inf infinitely precedes -Inf, it probably also makes sense to
say that +Inf infinitely precedes +Inf for consistency, even though
that really isn't well-defined, since Inf - Inf = NaN. Doing that is
certainly a lot easier to code, because it just needs to return true
if base +/- offset would be NaN, i.e.,
/*
* Deal with cases where both base and offset are infinite, and computing
* base +/- offset would produce NaN. This corresponds to a window frame
* whose boundary infinitely precedes +inf or infinitely follows -inf,
* which is not well-defined. For consistency with other cases involving
* infinities, such as the fact that +inf infinitely follows +inf, we
* choose to assume that +inf infinitely precedes +inf and -inf infinitely
* follows -inf, and therefore that all finite and infinite values are in
* such a window frame.
*/
if (isinf(base) && isinf(offset))
{
if ((base > 0 && sub) || (base < 0 && !sub))
PG_RETURN_BOOL(true);
}
and the result is
id | f_float8 | first_value | last_value
----+-----------+-------------+------------
0 | -Infinity | 0 | 0
1 | -3 | 0 | 0
2 | -1 | 0 | 0
3 | 0 | 0 | 0
4 | 1.1 | 0 | 0
5 | 1.12 | 0 | 0
6 | 2 | 0 | 0
7 | 100 | 0 | 0
8 | Infinity | 0 | 8
9 | NaN | 9 | 9
(10 rows)
which looks about equally sensible. To me, the fact that the window
infinitely preceding -Inf includes -Inf makes more sense, but the
meaning of the window infinitely preceding +Inf is much less obvious,
and not really well-defined.
Regards,
Dean
From | Date | Subject | |
---|---|---|---|
Next Message | Amit Kapila | 2020-07-18 09:15:28 | Re: Patch for reorderbuffer.c documentation. |
Previous Message | John Naylor | 2020-07-18 08:00:12 | Re: factorial function/phase out postfix operators? |