Re: POC: GROUP BY optimization

From: Alexander Korotkov <aekorotkov(at)gmail(dot)com>
To: Pavel Borisov <pashkin(dot)elfe(at)gmail(dot)com>
Cc: jian he <jian(dot)universality(at)gmail(dot)com>, Andrei Lepikhov <a(dot)lepikhov(at)postgrespro(dot)ru>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Richard Guo <guofenglinux(at)gmail(dot)com>, vignesh C <vignesh21(at)gmail(dot)com>, PostgreSQL Developers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>, Teodor Sigaev <teodor(at)sigaev(dot)ru>, David Rowley <dgrowleyml(at)gmail(dot)com>, "a(dot)rybakina" <a(dot)rybakina(at)postgrespro(dot)ru>
Subject: Re: POC: GROUP BY optimization
Date: 2024-06-03 22:47:43
Message-ID: CAPpHfdvS6Cnzz44_LES+z+erpch2yL8_m1MJ2aYLYhvGq6Vh2A@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, Jun 3, 2024 at 6:55 PM Pavel Borisov <pashkin(dot)elfe(at)gmail(dot)com> wrote:
> On Sun, 2 Jun 2024 at 17:18, Alexander Korotkov <aekorotkov(at)gmail(dot)com> wrote:
>>
>> Hi!
>>
>> On Sun, Jun 2, 2024 at 10:55 AM jian he <jian(dot)universality(at)gmail(dot)com> wrote:
>> >
>> > On Fri, May 31, 2024 at 8:12 AM Alexander Korotkov <aekorotkov(at)gmail(dot)com> wrote:
>> > >
>> > > I've revised some grammar including the sentence you've proposed.
>> > >
>> >
>> > -static List *groupclause_apply_groupingset(PlannerInfo *root, List *force);
>> > +static List *preprocess_groupclause(PlannerInfo *root, List *force);
>> >
>> > changing preprocess_groupclause the second argument
>> > from "force" to "gset" would be more intuitive, I think.
>>
>> Probably, but my intention is to restore preprocess_groupclause() as
>> it was before 0452b461bc with minimal edits to support incremental
>> sort. I'd rather avoid refactoring if this area for now.
>>
>> > `elog(ERROR, "Order of group-by clauses doesn't correspond incoming
>> > sort order");`
>> >
>> > I think this error message makes people wonder what "incoming sort order" is.
>> > BTW, "correspond", generally people use "correspond to".
>>
>> Thank you. On the second thought, I think it would be better to turn
>> this into an assertion like the checks before.
>>
>> > I did some minor cosmetic changes, mainly changing foreach to foreach_node.
>> > Please check the attachment.
>>
>> I would avoid refactoring of preprocess_groupclause() for the reason
>> described above. But I picked the grammar fix for PlannerInfo's
>> comment.
>
>
> Thank you for working on this patchset.
>
> 0001: Patch looks right
>
> Comments:
>
> Covert -> Convert
> sets the uninitialized value of ec_sortref of the pathkey's EquivalenceClass -> sets the value of the pathkey's EquivalenceClass unless it's already initialized
> wasn't set yet -> hasn't been set yet
>
> 0002: additional assert checking only. Looks right.
>
> 0003: pure renaming, looks good.
>
> 0004: Restores pre 0452b461bc state to preprocess_groupclause with removed partial_match fallback. Looks right. I haven't checked the comments provided they are restored from pre 0452b461bc state.
>
> 0005: Looks right.

Thank you. Revised according to your comments. I think 0001-0004
comprising a good refactoring addressing concerns from Tom [1].
0001 Removes from get_eclass_for_sort_expr() unnatural responsibility
of setting EquivalenceClass.ec_sortref. Instead this field is set in
make_pathkeys_for_sortclauses_extended() while processing of group
clauses.
0002 Provides additional asserts. That should helping to defend
against lurking bugs.
0003 Fixes unsuitable name of data structure.
0004 Restores pre 0452b461bc state to preprocess_groupclause() and
lowers the number of paths to consider.
It would be good to explicitly hear Tom about this. But I'm quite
sure these patches makes situation better not worse. I'm going to
push them if no objections.

I'm putting 0005 aside. It's unclear why and how there could be
duplicate SortGroupClauses at that stage. Also, it's unclear why
existing code handles them bad. So, this should wait a comment from
Tom.

Links.
1. https://www.postgresql.org/message-id/266850.1712879082%40sss.pgh.pa.us

------
Regards,
Alexander Korotkov
Supabase

Attachment Content-Type Size
v5-0001-Fix-asymmetry-in-setting-EquivalenceClass.ec_sort.patch application/octet-stream 8.8 KB
v5-0002-Add-invariants-check-to-get_useful_group_keys_ord.patch application/octet-stream 1.7 KB
v5-0004-Restore-preprocess_groupclause.patch application/octet-stream 13.1 KB
v5-0003-Rename-PathKeyInfo-to-GroupByOrdering.patch application/octet-stream 7.0 KB
v5-0005-Teach-group_keys_reorder_by_pathkeys-about-redund.patch application/octet-stream 3.2 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Heikki Linnakangas 2024-06-03 22:49:52 Re: ResourceOwner refactoring
Previous Message Michael Paquier 2024-06-03 22:43:41 Re: Fix an incorrect assertion condition in mdwritev().