From: | Hannu Krosing <hannuk(at)google(dot)com> |
---|---|
To: | David Rowley <dgrowleyml(at)gmail(dot)com> |
Cc: | Michael Lewis <mlewis(at)entrata(dot)com>, "Liu, Xinyu" <liuxy(at)gatech(dot)edu>, "pgsql-performance(at)postgresql(dot)org" <pgsql-performance(at)postgresql(dot)org> |
Subject: | Re: Potential performance issues related to group by and covering index |
Date: | 2021-03-05 01:14:40 |
Message-ID: | CAMT0RQQH0YjgO3fEOn6GMx5hvezYS97+kv59bGb=xCCs5DMtvw@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-performance |
In the original example it looks like using the index (and not running
a parallel query) is what made the query slow
The fast version was brute-force sequscan(s) + sort with 3 parallel
backends (leader + 2 workers) sharing the work.
On Tue, Mar 2, 2021 at 10:42 PM David Rowley <dgrowleyml(at)gmail(dot)com> wrote:
>
> On Wed, 3 Mar 2021 at 10:04, Michael Lewis <mlewis(at)entrata(dot)com> wrote:
> > Are there guidelines or principles you could share about writing the group by clause such that it is more efficient?
>
> If you have the option of writing them in the same order as an
> existing btree index that covers the entire GROUP BY clause (in
> version < PG13) or at least prefix of the GROUP BY clause (version >=
> PG13), then the planner has a chance to make use of that index to
> provide pre-sorted input to do group aggregate.
>
> Since PG13 has Incremental Sort, having an index that covers only a
> prefix of the GROUP BY clause may still help.
>
> If no indexes exist then you might get better performance by putting
> the most distinct column first. That's because sorts don't need to
> compare the remaining columns once it receives two different values
> for one column. That gets more complex when the most distinct column
> is wider than the others. e.g a text compare is more expensive than
> comparing two ints. For Hash Aggregate, I don't think the order will
> matter much.
>
> David
>
>
From | Date | Subject | |
---|---|---|---|
Next Message | val.janeiko | 2021-03-05 17:55:37 | Slow query performance inside a transaction on a clean database |
Previous Message | Justin Pryzby | 2021-03-03 04:20:23 | Re: tables meta data collection |