From: | jian he <jian(dot)universality(at)gmail(dot)com> |
---|---|
To: | David Rowley <dgrowleyml(at)gmail(dot)com> |
Cc: | Zhang Mingli <zmlpostgres(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
Subject: | Re: Remove useless GROUP BY columns considering unique index |
Date: | 2024-11-28 04:39:26 |
Message-ID: | CACJufxE-MzvN0qba1KriQzj5OXJBB32mYdfL4YK4KiJVwi+ofA@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Wed, Nov 27, 2024 at 5:30 PM David Rowley <dgrowleyml(at)gmail(dot)com> wrote:
>
> On Wed, 27 Nov 2024 at 19:51, jian he <jian(dot)universality(at)gmail(dot)com> wrote:
> > v3 attached. in v3:
>
> 1. I find the following quite strange:
>
> postgres=# create table abcd (a int primary key, b int not null, c int
> not null, d int not null, unique(b));
> CREATE TABLE
> postgres=# explain select b,c from abcd group by b,c;
> QUERY PLAN
> --------------------------------------------------------------
> HashAggregate (cost=37.75..56.25 rows=1850 width=8)
> Group Key: b, c
> -> Seq Scan on abcd (cost=0.00..28.50 rows=1850 width=8)
> (3 rows)
>
>
> postgres=# alter table abcd drop constraint abcd_pkey;
> ALTER TABLE
> postgres=# explain select b,c from abcd group by b,c;
> QUERY PLAN
> --------------------------------------------------------------
> HashAggregate (cost=33.12..51.62 rows=1850 width=8)
> Group Key: b
> -> Seq Scan on abcd (cost=0.00..28.50 rows=1850 width=8)
> (3 rows)
>
> Why does the patch only try to remove GROUP BY columns using unique
> indexes when there's no primary key?
>
fixed. aslo have tests on it.
> I don't really see why primary key should be treated specially. Why
> does the patch not just unify the logic and collect up all unique
> non-expression index, non-partial indexes where all columns are NOT
> NULL. You could maybe just add a special case to skip the NOT NULL
> checking for indexes with indisprimary set.
>
now no need to scan pg_constrtraint, just one scan pg_index, collect info.
i aslo removed get_primary_key_attnos.
> 2. In get_unique_not_null_attnos(), not_null_attnos could be a
> Bitmapset with attnums offset by FirstLowInvalidHeapAttributeNumber.
> That'll allow you to do a bms_is_member() call rather than a (possibly
> expensive) list_member_int() call.
fixed
> 3. The logic in remove_useless_groupby_columns() looks much more
> complex than it needs to be. It would be much more simple to find the
> matching unique index with the least number of columns and use that
> one. Just have a counter to record the bms_num_members() of the
> columns of the best match so far and replace it when you find an index
> with fewer columns. Please see adjust_group_pathkeys_for_groupagg()
> for inspiration.
>
now get_unique_not_null_attnos is:
List * get_unique_not_null_attnos(Oid relid, int *pk_pos)
returned pk_pos is pk attnums index (zero based) within the returned list.
current pk_pos no use, but i guess it should be useful.
so we can quickly retrieve pk attnums.
> We also need to think about if the unique index with the least columns
> is always the best one to use. Perhaps the index with the least
> columns is a varlena column and there's some index with, say, two
> INT4s. It would be much cheaper to hash a couple of INT4s than some
> long varlena column. Maybe it's ok just to leave an XXX comment
> explaining that we might want to think about doing that one day.
> Alternatively, summing up the pg_statistic.stawidth values could work.
> Likely it's better to make the patch work correctly before thinking
> about that part.
>
this part has not been addressed yet.
v4 logic is to choose one with the least number of columns.
if there is more than one with the least number of columns, simply
choose the first one
in the matched list.
i think i addressed most of your concern, now the code seems pretty neat.
Attachment | Content-Type | Size |
---|---|---|
v4-0001-remove-useless-group-by-columns-via-unique-not-nu.patch | text/x-patch | 14.5 KB |
From | Date | Subject | |
---|---|---|---|
Next Message | Thomas Munro | 2024-11-28 05:09:46 | Re: CI CompilerWarnings test fails on 15 in mingw_cross_warning |
Previous Message | Srirama Kucherlapati | 2024-11-28 03:56:49 | RE: AIX support |