Re: Error message: standard SQL or PostgreQSL?

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Huub <v(dot)niekerk(at)freeler(dot)nl>
Cc: pgsql-sql(at)postgresql(dot)org
Subject: Re: Error message: standard SQL or PostgreQSL?
Date: 2002-11-15 15:20:53
Message-ID: 27486.1037373653@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-sql

Huub <v(dot)niekerk(at)freeler(dot)nl> writes:
> When I copied a function call into an SQL-statement and executed it, I
> got the message 'Aggregate function calls may not be nested'. Does that
> mean this is the case in standard SQL or is this specifically PostgreSQL?

It's in the standard: SQL92 (which calls aggregates "set functions")
says

4) The <value expression> simply contained in <set function spec-
ification> shall not contain a <set function specification> or
a <subquery>.

I don't see a reason for the <subquery> prohibition (and AFAIK that case
will work in Postgres), but the rule against nested aggregate calls is
simply common sense when you think about it. What would it mean to nest
two aggregate functions? For example, in

SELECT COUNT(a) + 1, SUM(b + c) FROM ...

for each row produced by the FROM-clause, we check the null-ness of "a"
(and increment the count or not), and we form b+c and add it to the
running sum. At the end of the input we output the result row. Notice
that "b+c" is evaluated at each input row but the +1 is only done once,
because one is inside and the other is outside the aggregate calls.

Now, what would

SELECT SUM(a + SUM(b + c)) FROM ...

actually mean? If the inner SUM() ranges over all the rows, what value
of "a" is supposed to be added to it? If it doesn't range over all the
rows, then what's it doing? How would you control what set of rows each
SUM() is ranging over?

So in general it makes no sense to nest aggregates in a single SELECT
list. There are cases where you want to do something that amounts to
nesting aggregates, but you always have to write multiple levels of
SELECT in order to express the behavior you want. For example, suppose
I want to know the average total payroll of all company departments:

SELECT AVG(deptpay) FROM
(SELECT SUM(pay) AS deptpay FROM employees GROUP BY dept) ss;

Here, the inner SELECT produces a row for each department, with the
SUM() ranging over all rows having that department ID; then the outer
SELECT's AVG() ranges over the per-department rows.

regards, tom lane

In response to

Browse pgsql-sql by date

  From Date Subject
Next Message Stephan Szabo 2002-11-15 15:42:24 Re: Error message: standard SQL or PostgreQSL?
Previous Message teknokrat 2002-11-15 11:45:39 Updating a table column with ref integrity