From: | Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com> |
---|---|
To: | Teodor Sigaev <teodor(at)sigaev(dot)ru>, Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: POC: GROUP BY optimization |
Date: | 2018-06-16 15:59:24 |
Message-ID: | 623f5e50-eeb2-1279-d30b-dc43611fb063@2ndquadrant.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On 06/13/2018 06:56 PM, Teodor Sigaev wrote:
>> I.e. we already do reorder the group clauses to match ORDER BY, to only
>> require a single sort. This happens in preprocess_groupclause(), which
>> also explains the reasoning behind that.
> Huh. I missed that. That means group_keys_reorder_by_pathkeys() call
> inside get_cheapest_group_keys_order() duplicates work of
> preprocess_groupclause().
> And so it's possible to replace it to simple counting number of the
> same pathkeys in beginning of lists. Or remove most part of
> preprocess_groupclause() - but it seems to me we should use first
> option, preprocess_groupclause() is simpler, it doesn't operate with
> pathkeys only with SortGroupClause which is simpler.
>
> BTW, incremental sort path provides pathkeys_common(), exactly what we
> need.
>
>> I wonder if some of the new code reordering group pathkeys could/should
>> be moved here (not sure, maybe it's too early for those decisions). In
>> any case, it might be appropriate to update some of the comments before
>> preprocess_groupclause() which claim we don't do certain things added by
>> the proposed patches.
>
> preprocess_groupclause() is too early step to use something like
> group_keys_reorder_by_pathkeys() because pathkeys is not known here yet.
>
>
>> This probably also somewhat refutes my claim that the order of
>> grouping keys is currently fully determined by users (and so they
>> may pick the most efficient order), while the reorder-by-ndistinct
>> patch would make that impossible. Apparently when there's ORDER BY,
>> we already mess with the order of group clauses - there are ways to
>> get around it (subquery with OFFSET 0) but it's much less clear.
>
> I like a moment when objections go away :)
>
I'm afraid that's a misunderstanding - my objections did not really go
away. I'm just saying my claim that we're not messing with order of
grouping keys is not entirely accurate, because we do that in one
particular case.
I still think we need to be careful when introducing new optimizations
in this area - reordering the grouping keys by ndistinct, ORDER BY or
whatever. In particular I don't think we should commit these patches
that may quite easily cause regressions, and then hope some hypothetical
future patch fixes the costing. Those objections still stand.
regards
--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
From | Date | Subject | |
---|---|---|---|
Next Message | Tom Lane | 2018-06-16 17:29:55 | Re: GCC 8 warnings |
Previous Message | Tom Lane | 2018-06-16 14:42:26 | Re: row_to_json(), NULL values, and AS |