Re: Using ORDER BY with AGGREGATE/GROUP BY in a SELECT statement

From: "David D(dot) Kilzer" <ddkilzer(at)lubricants-oil(dot)com>
To: pgsql-sql(at)postgresql(dot)org
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Subject: Re: Using ORDER BY with AGGREGATE/GROUP BY in a SELECT statement
Date: 2001-05-14 14:42:15
Message-ID: 20010514094215.C17879@elbonia.lubricants-oil.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-sql

Tom,

Thanks for the reply, and for correcting my broken SQL. :^)

A couple days after I sent my own follow-up message, I realized how
silly my question was. I'll let future questions sit a bit longer
next time.

To make up for my silly question, here is a function + aggregate I
created while working on a different aspect of my original problem.
Perhaps someone will find this useful.

The function, 'booland', returns the logical 'AND' of two input values.

DROP FUNCTION "booland" (bool, bool);

CREATE FUNCTION "booland" (bool, bool)
RETURNS bool
AS 'BEGIN RETURN $1 AND $2; END;'
LANGUAGE 'PLPGSQL';

The aggregate, 'logical_and', returns the logical 'AND' of all values in
a column in an aggregated query.

DROP AGGREGATE logical_and bool;

CREATE AGGREGATE logical_and (
SFUNC1 = booland,
BASETYPE = bool,
STYPE1 = bool,
INITCOND1 = 't'
);

Obviously, this concept could be extended for a logical 'OR' function
and aggregate.

Dave

On Fri, May 11, 2001 at 12:33:52PM -0400, Tom Lane wrote:

> "David D. Kilzer" <ddkilzer(at)lubricants-oil(dot)com> writes:
> > [ wants to write an aggregate function that returns its last input ]
>
> The SQL model of query processing has a very definite view of the stages
> of processing: first group by, then aggregate, and last order by. Tuple
> ordering is irrelevant according to the basic semantics of the language.
> Probably the SQL authors would have left out ORDER BY entirely if they
> could have got away with it, but instead they made it a vestigial
> appendage that is only allowed at the very last instant before query
> outputs are forwarded to a client application.
>
> Thus, it is very bad form to write an aggregate that depends on the
> order it sees its inputs in. This won't be changed, because it's part
> of the nature of the language.
>
> In PG 7.1 it's possible to hack around this by ordering the result of
> a subselect-in-FROM:
>
> SELECT orderedagg(ss.x) FROM (select x from tab order by y) ss;
>
> which is a gross violation of the letter and spirit of the spec, and
> should not be expected to be portable to other DBMSes; but it gets the
> job done if you are intent on writing an ordering-dependent aggregate.
>
> However, I don't see any good way to combine this with grouping, since
> if you apply GROUP BY to the output of the subselect you'll lose the
> ordering again.
>
> > SELECT r.personid AS personid
> > ,SUM(r.laps) AS laps
> > ,COUNT(DISTINCT r.id) AS nightsraced
> > ,(SELECT r.carid
> > FROM race r
> > WHERE r.personid = 14
> > ORDER BY r.date DESC
> > LIMIT 1) AS carid
> > FROM race r
> > WHERE r.personid = 14
> > GROUP BY r.personid
> > ORDER BY r.date;
>
> This is likely to be reasonably efficient, actually, since the subselect
> will be evaluated only once per output group --- in fact, as you've
> written it it'll only be evaluated once, period, since it has no
> dependencies on the outer query. More usually you'd probably do
>
> ,(SELECT r2.carid
> FROM race r2
> WHERE r2.personid = r.personid
> ORDER BY r2.date DESC
> LIMIT 1) AS carid
>
> so that the result tracks the outer query, and in this form it'd be
> redone once per output row.
>
> regards, tom lane

In response to

Browse pgsql-sql by date

  From Date Subject
Next Message Martín Marqués 2001-05-14 15:15:45 Re: Auto incrementing an integer
Previous Message Ligia Pimentel 2001-05-14 14:28:18 fatal ERROR running simple join query...