From: | Richard Guo <guofenglinux(at)gmail(dot)com> |
---|---|
To: | David Rowley <dgrowleyml(at)gmail(dot)com> |
Cc: | PostgreSQL Developers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
Subject: | Re: Remove WindowClause PARTITION BY items belonging to redundant pathkeys |
Date: | 2023-06-08 09:11:17 |
Message-ID: | CAMbWs49jxZkfSP-9rG0Go4W7=pXCgS7jcjYJEERGJF3SoOFAqg@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Thu, Jun 8, 2023 at 7:37 AM David Rowley <dgrowleyml(at)gmail(dot)com> wrote:
> What the attached patch does is process each WindowClause and removes
> any items from the PARTITION BY clause that are columns or expressions
> relating to redundant PathKeys.
>
> Effectively, this allows the nodeWindowAgg.c code which stops
> processing WindowAgg rows when the run condition is met to work as the
> PARTITION BY clause is completely removed in the case of the above
> query. Removing the redundant PARTITION BY items also has the added
> benefit of not having to needlessly check if the next row belongs to
> the same partition as the last row. For the above, that check is a
> waste of time as all rows have relkind = 'r'
This is a nice optimization. I reviewed it and here are my findings.
In create_windowagg_plan there is such comment that says
* ... Note: in principle, it's possible
* to drop some of the sort columns, if they were proved redundant by
* pathkey logic. However, it doesn't seem worth going out of our way to
* optimize such cases.
Since this patch removes any clauses from the wc->partitionClause for
redundant pathkeys, this comment seems outdated, at least for the sort
columns in partitionClause.
Also I'm wondering if we can do the same optimization to
wc->orderClause. I tested it with the query below and saw performance
gains.
create table t (a int, b int);
insert into t select 1,2 from generate_series(1,100000)i;
analyze t;
explain analyze
select * from
(select a, b, rank() over (PARTITION BY a order by b) rank
from t where b = 2)
where a = 1 and rank <= 10;
With and without this optimization to wc->orderClause the execution time
is 67.279 ms VS. 119.120 ms (both best of 3).
I notice you comment in the patch that doing this is unsafe because it
would change the semantics of peer rows during execution. Would you
please elaborate on that?
Thanks
Richard
From | Date | Subject | |
---|---|---|---|
Next Message | Joel Jacobson | 2023-06-08 09:41:35 | Re: Do we want a hashset type? |
Previous Message | Daniel Verite | 2023-06-08 09:03:25 | Re: Order changes in PG16 since ICU introduction |