Re: POC: GROUP BY optimization

From: Alexander Korotkov <aekorotkov(at)gmail(dot)com>
To: Pavel Borisov <pashkin(dot)elfe(at)gmail(dot)com>
Cc: jian he <jian(dot)universality(at)gmail(dot)com>, Andrei Lepikhov <a(dot)lepikhov(at)postgrespro(dot)ru>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Richard Guo <guofenglinux(at)gmail(dot)com>, vignesh C <vignesh21(at)gmail(dot)com>, PostgreSQL Developers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>, Teodor Sigaev <teodor(at)sigaev(dot)ru>, David Rowley <dgrowleyml(at)gmail(dot)com>, "a(dot)rybakina" <a(dot)rybakina(at)postgrespro(dot)ru>
Subject: Re: POC: GROUP BY optimization
Date: 2024-06-04 18:05:25
Message-ID: CAPpHfduXqPWOr1e6MVTFYqr=17m6g38uBsTaWQcWhRKq03fSQw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Jun 4, 2024 at 1:47 AM Alexander Korotkov <aekorotkov(at)gmail(dot)com> wrote:
> On Mon, Jun 3, 2024 at 6:55 PM Pavel Borisov <pashkin(dot)elfe(at)gmail(dot)com> wrote:
> > On Sun, 2 Jun 2024 at 17:18, Alexander Korotkov <aekorotkov(at)gmail(dot)com> wrote:
> >> On Sun, Jun 2, 2024 at 10:55 AM jian he <jian(dot)universality(at)gmail(dot)com> wrote:
> >> >
> >> > On Fri, May 31, 2024 at 8:12 AM Alexander Korotkov <aekorotkov(at)gmail(dot)com> wrote:
> >> > >
> >> > > I've revised some grammar including the sentence you've proposed.
> >> > >
> >> >
> >> > -static List *groupclause_apply_groupingset(PlannerInfo *root, List *force);
> >> > +static List *preprocess_groupclause(PlannerInfo *root, List *force);
> >> >
> >> > changing preprocess_groupclause the second argument
> >> > from "force" to "gset" would be more intuitive, I think.
> >>
> >> Probably, but my intention is to restore preprocess_groupclause() as
> >> it was before 0452b461bc with minimal edits to support incremental
> >> sort. I'd rather avoid refactoring if this area for now.
> >>
> >> > `elog(ERROR, "Order of group-by clauses doesn't correspond incoming
> >> > sort order");`
> >> >
> >> > I think this error message makes people wonder what "incoming sort order" is.
> >> > BTW, "correspond", generally people use "correspond to".
> >>
> >> Thank you. On the second thought, I think it would be better to turn
> >> this into an assertion like the checks before.
> >>
> >> > I did some minor cosmetic changes, mainly changing foreach to foreach_node.
> >> > Please check the attachment.
> >>
> >> I would avoid refactoring of preprocess_groupclause() for the reason
> >> described above. But I picked the grammar fix for PlannerInfo's
> >> comment.
> >
> >
> > Thank you for working on this patchset.
> >
> > 0001: Patch looks right
> >
> > Comments:
> >
> > Covert -> Convert
> > sets the uninitialized value of ec_sortref of the pathkey's EquivalenceClass -> sets the value of the pathkey's EquivalenceClass unless it's already initialized
> > wasn't set yet -> hasn't been set yet
> >
> > 0002: additional assert checking only. Looks right.
> >
> > 0003: pure renaming, looks good.
> >
> > 0004: Restores pre 0452b461bc state to preprocess_groupclause with removed partial_match fallback. Looks right. I haven't checked the comments provided they are restored from pre 0452b461bc state.
> >
> > 0005: Looks right.
>
> Thank you. Revised according to your comments. I think 0001-0004
> comprising a good refactoring addressing concerns from Tom [1].
> 0001 Removes from get_eclass_for_sort_expr() unnatural responsibility
> of setting EquivalenceClass.ec_sortref. Instead this field is set in
> make_pathkeys_for_sortclauses_extended() while processing of group
> clauses.
> 0002 Provides additional asserts. That should helping to defend
> against lurking bugs.
> 0003 Fixes unsuitable name of data structure.
> 0004 Restores pre 0452b461bc state to preprocess_groupclause() and
> lowers the number of paths to consider.
> It would be good to explicitly hear Tom about this. But I'm quite
> sure these patches makes situation better not worse. I'm going to
> push them if no objections.
>
> I'm putting 0005 aside. It's unclear why and how there could be
> duplicate SortGroupClauses at that stage. Also, it's unclear why
> existing code handles them bad. So, this should wait a comment from
> Tom.
>
> Links.
> 1. https://www.postgresql.org/message-id/266850.1712879082%40sss.pgh.pa.us

Ops... I found I've published an old version of patchset. The
relevant version is attached.

------
Regards,
Alexander Korotkov
Supabase

Attachment Content-Type Size
v6-0002-Add-invariants-check-to-get_useful_group_keys_ord.patch application/octet-stream 1.8 KB
v6-0004-Restore-preprocess_groupclause.patch application/octet-stream 13.2 KB
v6-0003-Rename-PathKeyInfo-to-GroupByOrdering.patch application/octet-stream 7.1 KB
v6-0005-Teach-group_keys_reorder_by_pathkeys-about-redund.patch application/octet-stream 3.3 KB
v6-0001-Fix-asymmetry-in-setting-EquivalenceClass.ec_sort.patch application/octet-stream 8.9 KB

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Erik Wienhold 2024-06-04 18:08:05 Re: plpgsql: fix parsing of integer range with underscores
Previous Message Robert Haas 2024-06-04 17:40:23 Re: meson "experimental"?