From: | Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com> |
---|---|
To: | Vik Fearing <vik(at)postgresfriends(dot)org>, Georgios Kokolatos <gkokolatos(at)protonmail(dot)com>, pgsql-hackers(at)lists(dot)postgresql(dot)org |
Cc: | Erik Rijkers <er(at)xs4all(dot)nl> |
Subject: | Re: GROUP BY DISTINCT |
Date: | 2021-03-16 14:52:52 |
Message-ID: | 8c44df08-bb64-3d78-da1a-d24b7c9db39d@enterprisedb.com |
Views: | Whole Thread | Raw Message | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-docs pgsql-hackers |
On 3/16/21 9:21 AM, Vik Fearing wrote:
> On 3/13/21 12:33 AM, Tomas Vondra wrote:
>> Hi Vik,
>>
>> The patch seems quite ready, I have just two comments.
>
> Thanks for taking a look.
>
>> 1) Shouldn't this add another <indexterm> for DISTINCT, somewhere in the
>> documentation? Now the index points just to the SELECT DISTINCT part.
>
> Good idea; I never think about the index.
>
>> 2) The part in gram.y that wraps/unwraps the boolean flag as an integer,
>> in order to stash it in the group lists is rather ugly, IMHO. It forces
>> all the places handling the list to be aware of this (there are not
>> many, but still ...). And there are no other places doing (bool) intVal
>> so it's not like there's a precedent for this.
>
> There is kind of a precedent for it, I was copying off of TriggerEvents
> and func_alias_clause.
>
I see. I was looking for "(bool) intVal" but you're right TriggerEvents
code does something similar.
>> I think the clean solution is to make group_clause produce a struct with
>> two fields, and just use that. Not sure how invasive that will be
>> outside gram.y, though.
>
> I didn't want to create a whole new parse node for it, but Andrew Gierth
> pointed me towards SelectLimit so I did it like that and I agree it is
> much cleaner.
>
I agree, that's much cleaner.
>> Also, the all_or_distinct vs. distinct_or_all seems a bit error-prone. I
>> wonder if we can come up with some clearer names, describing the context
>> of those types.
>
> I turned this into an enum for ALL/DISTINCT/default and the caller can
> choose what it wants to do with default. I think that's a lot cleaner,
> too. Maybe DISTINCT ON should be changed to fit in that? I left it
> alone for now.
>
I think DISTINCT ON is a different kind of animal, because that is a
list of expressions, not just a simple enum state.
> I also snuck in something that all of us overlooked which is outputting
> the DISTINCT in ruleutils.c. I didn't add a test for it but that would
> have been an unfortunate bug.
>
Oh!
> New patch attached, rebased on 15639d5e8f.
>
Thanks. At this point it seems fine to me, no further comments.
regards
--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
From | Date | Subject | |
---|---|---|---|
Next Message | PG Doc comments form | 2021-03-16 17:56:46 | Duplicate "SELECT current_user;" |
Previous Message | Vik Fearing | 2021-03-16 08:21:03 | Re: GROUP BY DISTINCT |
From | Date | Subject | |
---|---|---|---|
Next Message | Tom Lane | 2021-03-16 15:18:57 | Re: [PATCH]: Allow errors in parameter values to be reported during the BIND phase itself.. |
Previous Message | Ranier Vilela | 2021-03-16 14:52:18 | re: crash during cascaded foreign key update |