Re: Support for RANGE ... PRECEDING windows in OVER

From: ian link <ian(at)ilink(dot)io>
To: Craig Ringer <craig(at)2ndquadrant(dot)com>
Cc: Hitoshi Harada <umi(dot)tanuki(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Support for RANGE ... PRECEDING windows in OVER
Date: 2013-07-01 03:54:51
Message-ID: CAOOwM5KajDqb=M8EXoQaV2nmDAo8AvE8H26HnxC_obtQf2_DNw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

I found some time and I think I am up to speed now. I finally figured out
how to add new operator strategies and made a little test operator for
myself.

It seems pretty clear that assuming '+' and '-' are addition and
subtraction is a bad idea. I don't think it would be too tricky to add
support for new operator strategies. Andrew Gierth suggested calling these
new strategies "offset -" and "offset +", which I think describes it pretty
well. I assigned the operator itself to be "@+" and "@-" but that can
obviously be changed. If this sounds like a good path to you guys, I will
go ahead and implement the operators for the appropriate types. Please let
me know if I am misunderstanding something - I am still figuring stuff out
:)

Aside from the opclass stuff, there were some other important issues
mentioned with the original RANGE support. I think I will address those
after the opclass stuff is done.

Thanks!
Ian

On Sat, Jun 22, 2013 at 4:38 PM, ian link <ian(at)ilink(dot)io> wrote:

> Thanks Craig! That definitely does help. I probably still have some
> questions but I think I will read through the rest of the code before
> asking. Thanks again!
>
> Ian
>
> > Craig Ringer
> > Friday, June 21, 2013 8:41 PM
>
> >
> > On 06/22/2013 03:30 AM, ian link wrote:
> >>
> >> Forgive my ignorance, but I don't entirely understand the problem. What
> >> does '+' and '-' refer to exactly?
> >
> > Consider "RANGE 4.5 PRECEDING'.
> >
> > You need to be able to test whether, for the current row 'b', any given
> > row 'a' is within the range (b - 4.5) < a <= b . Not 100% sure about the
> > < vs <= boundaries, but that's irrelevant for the example.
> >
> > To test that, you have to be able to do two things: you have to be able
> > to test whether one value is greater than another, and you have to be
> > able to add or subtract a constant from one of the values.
> >
> > Right now, the b-tree access method provides information on the ordering
> > operators < <= = > >= <> , which provides half the answer. But these
> > don't give any concept of *distance* - you can test ordinality but not
> > cardinality.
> >
> > To implement the "different by 4.5" part, you have to be able to add 4.5
> > to one value or subtract it from the other.
> >
> > The obvious way to do that is to look up the function that implements
> > the '+' or '-' operator, and do:
> >
> > ((OPERATOR(+))(a, 4.5)) > b AND (a <= b)
> >
> > or
> >
> > ((OPERATOR(-))(b, 4.5)) < a AND (a <= b);
> >
> > The problem outlined by Tom in prior discussion about this is that
> > PostgreSQL tries really hard not to assume that particular operator
> > names mean particular things. Rather than "knowing" that "+" is always
> > "an operator that adds two values together; is transitive, symmetric and
> > reflexive", PostgreSQL requires that you define an *operator class* that
> > names the operator that has those properties.
> >
> > Or at least, it does for less-than, less-than-or-equals, equals,
> > greater-than-or-equals, greater-than, and not-equals as part of the
> > b-tree operator class, which *usually* defines these operators as < <= =
> >>
> >> = > <>, but you could use any operator names you wanted if you really
> >
> > liked.
> >
> > Right now (as far as I know) there's no operator class that lets you
> > identify operators for addition and subtraction in a similar way. So
> > it's necessary to either add such an operator class (in which case
> > support has to be added for it for every type), extend the existing
> > b-tree operator class to provide the info, or blindly assume that "+"
> > and "-" are always addition and subtraction.
> >
> > For an example of why such assumptions are a bad idea, consider matrix
> > multiplication. Normally, "a * b" = "b * a", but this isn't true for
> > multiplication of matrices. Similarly, if someone defined a "+" operator
> > as an alias for string concatenation (||), we'd be totally wrong to
> > assume we could use that for doing range-offset windowing.
> >
> > So. Yeah. Operator classes required, unless we're going to change the
> > rules and make certain operator names "special" in PostgreSQL, so that
> > if you implement them they *must* have certain properties. This seems
> > like a pretty poor reason to add such a big change.
> >
> > I hope this explanation (a) is actually correct and (b) is helpful.
> >
> > ian link
> > Friday, June 21, 2013 12:30 PM
>
> > Forgive my ignorance, but I don't entirely understand the problem. What
> does '+' and '-' refer to exactly?
> > Thanks!
> >
> >
> >
> > Hitoshi Harada
> > Friday, June 21, 2013 4:35 AM
> >
> >
> >
>
> On 06/22/2013 03:30 AM, ian link wrote:
> > Forgive my ignorance, but I don't entirely understand the problem. What
> > does '+' and '-' refer to exactly?
>
> Consider "RANGE 4.5 PRECEDING'.
>
> You need to be able to test whether, for the current row 'b', any given
> row 'a' is within the range (b - 4.5) < a <= b . Not 100% sure about the
> < vs <= boundaries, but that's irrelevant for the example.
>
> To test that, you have to be able to do two things: you have to be able
> to test whether one value is greater than another, and you have to be
> able to add or subtract a constant from one of the values.
>
> Right now, the b-tree access method provides information on the ordering
> operators < <= = > >= <> , which provides half the answer. But these
> don't give any concept of *distance* - you can test ordinality but not
> cardinality.
>
> To implement the "different by 4.5" part, you have to be able to add 4.5
> to one value or subtract it from the other.
>
> The obvious way to do that is to look up the function that implements
> the '+' or '-' operator, and do:
>
> ((OPERATOR(+))(a, 4.5)) > b AND (a <= b)
>
> or
>
> ((OPERATOR(-))(b, 4.5)) < a AND (a <= b);
>
> The problem outlined by Tom in prior discussion about this is that
> PostgreSQL tries really hard not to assume that particular operator
> names mean particular things. Rather than "knowing" that "+" is always
> "an operator that adds two values together; is transitive, symmetric and
> reflexive", PostgreSQL requires that you define an *operator class* that
> names the operator that has those properties.
>
> Or at least, it does for less-than, less-than-or-equals, equals,
> greater-than-or-equals, greater-than, and not-equals as part of the
> b-tree operator class, which *usually* defines these operators as < <= =
> >= > <>, but you could use any operator names you wanted if you really
> liked.
>
> Right now (as far as I know) there's no operator class that lets you
> identify operators for addition and subtraction in a similar way. So
> it's necessary to either add such an operator class (in which case
> support has to be added for it for every type), extend the existing
> b-tree operator class to provide the info, or blindly assume that "+"
> and "-" are always addition and subtraction.
>
> For an example of why such assumptions are a bad idea, consider matrix
> multiplication. Normally, "a * b" = "b * a", but this isn't true for
> multiplication of matrices. Similarly, if someone defined a "+" operator
> as an alias for string concatenation (||), we'd be totally wrong to
> assume we could use that for doing range-offset windowing.
>
> So. Yeah. Operator classes required, unless we're going to change the
> rules and make certain operator names "special" in PostgreSQL, so that
> if you implement them they *must* have certain properties. This seems
> like a pretty poor reason to add such a big change.
>
> I hope this explanation (a) is actually correct and (b) is helpful.
>
> --
> Craig Ringer http://www.2ndQuadrant.com/
> PostgreSQL Development, 24x7 Support, Training & Services
>

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Josh Berkus 2013-07-01 04:20:59 Re: Eliminating PD_ALL_VISIBLE, take 2
Previous Message Josh Berkus 2013-07-01 03:07:29 Re: New regression test time