Re: POC: GROUP BY optimization

From: Alexander Korotkov <aekorotkov(at)gmail(dot)com>
To: Andrei Lepikhov <a(dot)lepikhov(at)postgrespro(dot)ru>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Richard Guo <guofenglinux(at)gmail(dot)com>, Pavel Borisov <pashkin(dot)elfe(at)gmail(dot)com>, vignesh C <vignesh21(at)gmail(dot)com>, PostgreSQL Developers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>, Teodor Sigaev <teodor(at)sigaev(dot)ru>, David Rowley <dgrowleyml(at)gmail(dot)com>, "a(dot)rybakina" <a(dot)rybakina(at)postgrespro(dot)ru>
Subject: Re: POC: GROUP BY optimization
Date: 2024-05-31 00:12:28
Message-ID: CAPpHfdtAOT8ZN8FjeP2eBRrCnW8BKb0V0PFzfSuChdNVk84QQw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi!

On Thu, May 30, 2024 at 7:22 AM Andrei Lepikhov
<a(dot)lepikhov(at)postgrespro(dot)ru> wrote:
> On 5/29/24 19:53, Alexander Korotkov wrote:
> > Thank you for your feedback.
> >
> > On Wed, May 29, 2024 at 11:08 AM Andrei Lepikhov
> > <a(dot)lepikhov(at)postgrespro(dot)ru> wrote:
> >> On 5/27/24 19:41, Alexander Korotkov wrote:
> >>> Any thoughts?
> >> About 0001:
> >> Having overviewed it, I don't see any issues (but I'm the author),
> >> except grammatical ones - but I'm not a native to judge it.
> >> Also, the sentence 'turning GROUP BY clauses into pathkeys' is unclear
> >> to me. It may be better to write something like: 'building pathkeys by
> >> the list of grouping clauses'.
> >
> > OK, thank you. I'll run once again for the grammar issues.

I've revised some grammar including the sentence you've proposed.

> >> 0002:
> >> The part under USE_ASSERT_CHECKING looks good to me. But the code in
> >> group_keys_reorder_by_pathkeys looks suspicious: of course, we do some
> >> doubtful work without any possible way to reproduce, but if we envision
> >> some duplicated elements in the group_clauses, we should avoid usage of
> >> the list_concat_unique_ptr.
> >
> > As I understand Tom, there is a risk that clauses list may contain
> > multiple instances of equivalent SortGroupClause, not duplicate
> > pointers.
> Maybe. I just implemented the worst-case scenario with the intention of
> maximum safety. So, it's up to you.
> >
> >> What's more, why do you not exit from
> >> foreach_ptr immediately after SortGroupClause has been found? I think
> >> the new_group_clauses should be consistent with the new_group_pathkeys.
> >
> > I wanted this to be consistent with preprocess_groupclause(), where
> > duplicate SortGroupClause'es are grouped together. Otherwise, we
> > could just delete redundant SortGroupClause'es.
> Hm, preprocess_groupclause is called before the standard_qp_callback
> forms the 'canonical form' of group_pathkeys and such behaviour needed.
> But the code that chooses the reordering strategy uses already processed
> grouping clauses, where we don't have duplicates in the first
> num_groupby_pathkeys elements, do we?

Yep, it seems that we work with group clauses which were processed by
standard_qp_callback(). In turn standard_qp_callback() called
make_pathkeys_for_sortclauses_extended() with remove_redundant ==
true. So, there shouldn't be duplicates. And it's unclear what
should we be protected from.

I leaved 0002 with just asserts. And extracted questionable changes into 0005.

------
Regards,
Alexander Korotkov
Supabase

Attachment Content-Type Size
v3-0004-Restore-preprocess_groupclause.patch application/octet-stream 13.1 KB
v3-0005-Teach-group_keys_reorder_by_pathkeys-about-redund.patch application/octet-stream 3.2 KB
v3-0002-Add-invariants-check-to-get_useful_group_keys_ord.patch application/octet-stream 1.8 KB
v3-0003-Rename-PathKeyInfo-to-GroupByOrdering.patch application/octet-stream 7.0 KB
v3-0001-Fix-asymmetry-in-setting-EquivalenceClass.ec_sort.patch application/octet-stream 8.8 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message David Rowley 2024-05-31 00:35:58 Re: Add memory context type to pg_backend_memory_contexts view
Previous Message David Christensen 2024-05-30 20:49:07 Re: Reuse child_relids in try_partitionwise_join was Re: Assert failure on bms_equal(child_joinrel->relids, child_joinrelids)