Re: Order of columns in GROUP BY is significant to the planner.

From: David Rowley <david(dot)rowley(at)2ndquadrant(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>, Pg Bugs <pgsql-bugs(at)postgresql(dot)org>
Subject: Re: Order of columns in GROUP BY is significant to the planner.
Date: 2017-12-28 04:46:44
Message-ID: CAKJS1f-kJEt=9JmMqGrTJow_si-R4QMytScF1q5W5inF9FL=8w@mail.gmail.com
Lists: pgsql-bugs

On 22 December 2017 at 03:37, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> David Rowley <david(dot)rowley(at)2ndquadrant(dot)com> writes:
>> just the number of combinations to try could end up growing
>> very large
>
> Yeah, I'm pretty doubtful that the potential improvement would be
> worth the extra planner cycles in most cases. Maybe if there are
> just two or three GROUP BY columns, it'd be OK to consider all the
> combinations, but it could get out of hand very quickly.

Thinking a bit more about this, it would be pretty silly to try
random combinations of columns, or all combinations up to a certain
level. It would be much smarter to look for a btree index that has all
of the GROUP BY columns as its leading keys and use that column order
instead. Perhaps the GROUP BY could just be reordered that way
unconditionally, unless there's also an ORDER BY in the query. Nothing
would need to be touched if there was only one GROUP BY expr, and we
probably couldn't do anything if the GROUP BY contains Vars/Exprs from
multiple relations.

Of course, it needs more thought than just the above, but it seems
like an idea that might be workable.
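
To make the heuristic concrete, here is a rough sketch in Python. It
is only an illustration of the rule described above, not planner code:
the function name, the (relation, column) pairs and the plain lists of
index key columns are all made-up representations, and it only handles
simple column references.

```python
from typing import List, Optional, Sequence, Tuple


def reorder_group_by(
    group_by: Sequence[Tuple[str, str]],
    btree_indexes: Sequence[Sequence[str]],
    has_order_by: bool = False,
) -> Optional[List[str]]:
    """Return the GROUP BY columns in index order, or None to leave them alone.

    group_by:      hypothetical (relation, column) pairs from the GROUP BY.
    btree_indexes: hypothetical key-column lists, one per btree index
                   on the relation being grouped.
    """
    # An ORDER BY may already dictate a useful ordering, so do nothing.
    if has_order_by:
        return None

    # Columns from more than one relation cannot match a single index.
    if len({rel for rel, _ in group_by}) != 1:
        return None

    wanted = {col for _, col in group_by}

    # With only one distinct GROUP BY column there is nothing to reorder.
    if len(wanted) < 2:
        return None

    n = len(wanted)
    for keys in btree_indexes:
        leading = list(keys[:n])
        # The index qualifies only if its first n keys are exactly the
        # GROUP BY columns, in any order.
        if len(leading) == n and set(leading) == wanted:
            return leading

    return None
```

For GROUP BY b, a on a table with a btree index on (a, b, c), this
returns the order a, b. An index on (a, c, b) would not qualify, since
b is not among its two leading keys.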

--
David Rowley http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
