| From: | Richard Guo <guofenglinux(at)gmail(dot)com> |
|---|---|
| To: | David Rowley <dgrowleyml(at)gmail(dot)com> |
| Cc: | PostgreSQL Developers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
| Subject: | Re: Remove WindowClause PARTITION BY items belonging to redundant pathkeys |
| Date: | 2023-06-08 09:11:17 |
| Message-ID: | CAMbWs49jxZkfSP-9rG0Go4W7=pXCgS7jcjYJEERGJF3SoOFAqg@mail.gmail.com |
| Views: | Whole Thread | Raw Message | Download mbox | Resend email |
| Thread: | |
| Lists: | pgsql-hackers |
On Thu, Jun 8, 2023 at 7:37 AM David Rowley <dgrowleyml(at)gmail(dot)com> wrote:
> What the attached patch does is process each WindowClause and removes
> any items from the PARTITION BY clause that are columns or expressions
> relating to redundant PathKeys.
>
> Effectively, this allows the nodeWindowAgg.c code which stops
> processing WindowAgg rows when the run condition is met to work as the
> PARTITION BY clause is completely removed in the case of the above
> query. Removing the redundant PARTITION BY items also has the added
> benefit of not having to needlessly check if the next row belongs to
> the same partition as the last row. For the above, that check is a
> waste of time as all rows have relkind = 'r'
This is a nice optimization. I reviewed it and here are my findings.
In create_windowagg_plan there is such comment that says
* ... Note: in principle, it's possible
* to drop some of the sort columns, if they were proved redundant by
* pathkey logic. However, it doesn't seem worth going out of our way to
* optimize such cases.
Since this patch removes any clauses from the wc->partitionClause for
redundant pathkeys, this comment seems outdated, at least for the sort
columns in partitionClause.
Also I'm wondering if we can do the same optimization to
wc->orderClause. I tested it with the query below and saw performance
gains.
create table t (a int, b int);
insert into t select 1,2 from generate_series(1,100000)i;
analyze t;
explain analyze
select * from
(select a, b, rank() over (PARTITION BY a order by b) rank
from t where b = 2)
where a = 1 and rank <= 10;
With and without this optimization to wc->orderClause the execution time
is 67.279 ms VS. 119.120 ms (both best of 3).
I notice you comment in the patch that doing this is unsafe because it
would change the semantics of peer rows during execution. Would you
please elaborate on that?
Thanks
Richard
| From | Date | Subject | |
|---|---|---|---|
| Next Message | Joel Jacobson | 2023-06-08 09:41:35 | Re: Do we want a hashset type? |
| Previous Message | Daniel Verite | 2023-06-08 09:03:25 | Re: Order changes in PG16 since ICU introduction |