From: | Richard Guo <guofenglinux(at)gmail(dot)com> |
---|---|
To: | PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Side effect of remove_useless_groupby_columns |
Date: | 2021-02-28 07:52:24 |
Message-ID: | CAMbWs49Mzis1vkya0W=BYkaoS7yYT-+RRm+R7-eaKHSJtDgzoA@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
Hi,
When looking at [1], I realized we may have a side effect when removing
redundant columns in the GROUP BY clause. Suppose we have a query with
ORDER BY 'b', and meanwhile column 'b' is also a group key. If we decide
that 'b' is redundant due to being functionally dependent on other GROUP
BY columns, we would remove it from group keys. This will make us lose
the opportunity to leverage the index on 'b'.
Here is an example for illustration.
# create table t (a int primary key, b int);
# insert into t select i, i%1000 from generate_series(1,1000000)i;
# create index on t(b);
By default, we remove 'b' from group keys and generate a plan as below:
# explain (costs off) select b from t group by a, b order by b limit 10;
QUERY PLAN
------------------------------------------------
Limit
-> Sort
Sort Key: b
-> Group
Group Key: a
-> Index Scan using t_pkey on t
(6 rows)
The index on 'b' is not being used and we'll have to retrieve all the
data underneath to perform the sort work.
On the other hand, if we keep 'b' as a group column, we can get such a
plan as:
# explain (costs off) select b from t group by a, b order by b limit 10;
QUERY PLAN
-------------------------------------------------
Limit
-> Group
Group Key: b, a
-> Incremental Sort
Sort Key: b, a
Presorted Key: b
-> Index Scan using t_b_idx on t
(7 rows)
With the help of 't_b_idx', we can avoid the full scan on 't' and it
would run much faster.
Any thoughts?
[1]
https://www.postgresql.org/message-id/flat/16869-26346b77d6ccaeec%40postgresql.org
Thanks
Richard
From | Date | Subject | |
---|---|---|---|
Next Message | David Rowley | 2021-02-28 09:14:59 | Re: Side effect of remove_useless_groupby_columns |
Previous Message | japin | 2021-02-28 06:36:01 | Re: NOT VALID for Unique Indexes |