From: | Tomas Vondra <tv(at)fuzzy(dot)cz> |
---|---|
To: | pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: Final Patch for GROUPING SETS - unrecognized node type: 347 |
Date: | 2014-09-07 18:27:07 |
Message-ID: | 540CA37B.2080308@fuzzy.cz |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On 7.9.2014 18:52, Andrew Gierth wrote:
>>>>>> "Tomas" == Tomas Vondra <tv(at)fuzzy(dot)cz> writes:
>
> Tomas> Maybe preventing this completely (i.e. raising an ERROR with
> Tomas> "duplicate columns in CUBE/ROLLUP/... clauses") would be
> Tomas> appropriate. Does the standard says anything about this?
>
> The spec does not say anything explicitly about duplicates, so they
> are allowed (and duplicate grouping _sets_ can't be removed, only
> duplicate columns within a single GROUP BY clause after the grouping
> sets have been eliminated by transformation). I have checked my
> reading of the spec against oracle 11 and MSSQL using sqlfiddle.
>
> The way the spec handles grouping sets is to define a sequence of
> syntactic transforms that result in a query which is a UNION ALL of
> ordinary GROUP BY queries. (We haven't tried to implement the
> additional optional feature of GROUP BY DISTINCT.) Since it's UNION
> ALL, any duplicates must be preserved, so a query with GROUPING SETS
> ((a),(a)) reduces to:
>
> SELECT ... GROUP BY a UNION ALL SELECT ... GROUP BY a;
>
> and therefore has duplicates of all its result rows.
>
> I'm quite prepared to concede that I may have read the spec wrong
> (wouldn't be the first time), but in this case I require any such
> claim to be backed up by an example from some other db showing an
> actual difference in behavior.
I think you read the spec right. Apparently duplicate grouping sets are
allowed, and it's supposed to output that grouping set twice.
The section on ROLLUP/CUBE do not mention duplicates at all, it only
explains how to generate all the possible grouping sets, so if you have
duplicate columns there, you'll get duplicate sets (which is allowed).
If we can get rid of the excessive ChainAggregate, that's certainly
enough for now.
Optimizing it could be simple, though - you don't need to keep the
duplicate groups, you only need to keep a counter "how many times to
output this group". But the more I think about it, the more I think we
can ignore that. There are far more important pieces to implement, and
if you write bad SQL there's no help anyway.
regards
Tomas
From | Date | Subject | |
---|---|---|---|
Next Message | Pavel Stehule | 2014-09-07 18:45:38 | Re: Built-in binning functions |
Previous Message | Tom Lane | 2014-09-07 18:07:04 | Re: Patch for psql History Display on MacOSX |