Re: Remove useless GROUP BY columns considering unique index

From: David Rowley <dgrowleyml(at)gmail(dot)com>
To: jian he <jian(dot)universality(at)gmail(dot)com>
Cc: Zhang Mingli <zmlpostgres(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Remove useless GROUP BY columns considering unique index
Date: 2024-11-28 06:11:18
Message-ID: CAApHDvp6xkkq7YCGk94jGguxSdq7ewD7VDn3-i-AyJdyhK-Etw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, 28 Nov 2024 at 17:39, jian he <jian(dot)universality(at)gmail(dot)com> wrote:
> v4 logic is to choose one with the least number of columns.
> if there is more than one with the least number of columns, simply
> choose the first one
> in the matched list.

The new code inside remove_useless_groupby_columns() is still more
complex than what's needed. There's no need to build a 2nd list with
the matching indexes and loop over it. Just skip the non-matching
indexes and record the best match.

I imagined it would look a bit more like the attached patch. I also
removed the pk_pos column as I didn't see any point in it. You were
not using it. If some future code needs that, it can be added then. I
also fixed up a few comments that needed attention. I didn't look at
the regression tests being added. I think given that there's now no
special case for primary key indexes that we don't need a completely
new set of tests. I think it'll make more sense to adjust some of the
existing tests to use a unique constraint instead of a PK and then
adjust a column's NOT NULL property to check that part of the code is
working correctly.

Another issue with this is that the list of indexes being used is not
sorted by Oid. If you look at RelationGetIndexList() you'll see that
we perform a sort. That gives us more consistency in the planner. I
think this patch should be doing that too, otherwise, you could end up
with a plan change after some operation that changes the order that
the indexes are stored in the pg_index table. It's probably fairly
unlikely, but it is the sort of issue that someone will eventually
discover and report.

David

Attachment Content-Type Size
v5-0001-remove-useless-group-by-columns-via-unique-not-nu.patch application/octet-stream 14.2 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Kirill Reshke 2024-11-28 06:13:08 Re: Use streaming read API in pgstattuple.
Previous Message Tom Lane 2024-11-28 05:52:54 Re: CREATE SCHEMA ... CREATE DOMAIN support