From: | Jeff Janes <jeff(dot)janes(at)gmail(dot)com> |
---|---|
To: | Pg Bugs <pgsql-bugs(at)postgresql(dot)org> |
Subject: | Order of columns in GROUP BY is significant to the planner. |
Date: | 2017-12-21 06:16:02 |
Message-ID: | CAMkU=1w9scCPMicmhzrAkUUFVnmJ5JKok5K2KD_qNvXYyEQZ+A@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-bugs |
I was recently surprised to discover that the planner finds that the order
of columns in the GROUP BY to be significant. I've found it in 9.6.6. and
verified it is still like that in v11.
To set up the example:
create table foo as select floor(random()*100)::int as col1,
floor(random()*100)::int as col2, floor(random()*10)::int as col3, random()
as col4, random() as col5 from generate_series(1,100000000);
vacuum analyze foo;
create index on foo (col3, col1, col2, col5, col4);
set enable_hashagg TO off;
set max_parallel_workers_per_gather TO 0;
EXPLAIN (BUFFERS, ANALYZE, timing off) SELECT SUM(col5), col2, col1 FROM foo
WHERE col3 = '4' AND col4 <= '0.9' GROUP BY col2,col1;
EXPLAIN (BUFFERS, ANALYZE, timing off) SELECT SUM(col5), col2, col1 FROM foo
WHERE col3 = '4' AND col4 <= '0.9' GROUP BY col1,col2;
The first one inserts a sort node on col1,col2 before doing the Group
Aggregate. The second one uses the ordering of the tuples derived from the
index scan to do the Group Aggregate directly. Isn't it surprising that
the order of the columns in the GROUP BY has to be same as the order in the
index definition in order to make maximal use of the index? Is that a bug?
Cheers,
Jeff
From | Date | Subject | |
---|---|---|---|
Next Message | David Rowley | 2017-12-21 10:13:24 | Re: Order of columns in GROUP BY is significant to the planner. |
Previous Message | Robert Haas | 2017-12-20 22:45:37 | Re: vacuum vs heap_update_tuple() and multixactids |