From: | David Rowley <dgrowleyml(at)gmail(dot)com> |
---|---|
To: | Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com> |
Cc: | David Rowley <dgrowley(at)gmail(dot)com>, pgsql-hackers(at)lists(dot)postgresql(dot)org |
Subject: | Re: Optimize WindowAgg's use of tuplestores |
Date: | 2024-09-04 14:49:51 |
Message-ID: | CAApHDvqPgFtwme2Zyf75BpMLwYr2mnUstDyPiP=EpudYuQTPPQ@mail.gmail.com |
Lists: | pgsql-hackers |
On Mon, 19 Aug 2024 at 22:01, David Rowley <dgrowleyml(at)gmail(dot)com> wrote:
> To try and move this forward again, I adjusted the patch to use a
> static function with pg_noinline rather than unlikely. I don't think
> this will make much difference code generation wise, but I did think
> it was an improvement in code cleanliness. Patches attached.
>
> I did a round of benchmarking on an AMD Zen4 7945hx and on an Apple
> M2. I also graphed the results you sent so they're easier to compare
> with mine.
>
> 0001 is effectively the unlikely() patch for calculating the frame offsets.
> 0002 is the tuplestore_reset() patch
I was experimenting with this again. The 0002 patch added a
next_partition field to the WindowAggState struct and caused the
struct to become slightly bigger. I've now included a 0003 patch
which shifts some fields around in that struct so as to keep it the
same size as it is on master. Benchmarking with 0003 applied removes the
very tiny performance regression. Please see the attached CSV file for the
results. The percentage row compares master to all patches. I also
tested this on an AMD 3990x machine along with fresh results from the
AMD 7945hx laptop. Both of those machines come out faster on all tests
when comparing master to all 3 patches. With the Apple M2, there does
not seem to be much change in performance with the tests containing
fewer rows per partition: some are faster, some are slower, all within
typical noise fluctuations.
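
To illustrate why shifting fields around can keep a struct from growing, here is a minimal sketch in C. The struct and field names (PaddedState, PackedState, flag_a, pos_a, and so on) are hypothetical and are not the real WindowAggState layout. The sketch only shows the general effect of alignment padding: a small field placed between wider fields costs padding bytes, while grouping the small fields together lets them share what would otherwise be padding.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical layout (NOT the real WindowAggState): each bool is followed
 * by an int64_t, so on ABIs where int64_t is more strictly aligned than
 * bool, the compiler inserts padding after every flag.
 */
typedef struct PaddedState
{
	bool		flag_a;
	int64_t		pos_a;
	bool		flag_b;
	int64_t		pos_b;
	bool		flag_c;			/* stands in for a newly added field */
} PaddedState;

/*
 * Same fields, reordered: the wide fields come first and the flags are
 * grouped together, so only the tail of the struct needs padding.
 */
typedef struct PackedState
{
	int64_t		pos_a;
	int64_t		pos_b;
	bool		flag_a;
	bool		flag_b;
	bool		flag_c;
} PackedState;

/* Returns how many bytes the reordering saves on the current platform. */
size_t
bytes_saved_by_reordering(void)
{
	/* Reordering never makes this particular struct larger. */
	assert(sizeof(PackedState) <= sizeof(PaddedState));
	return sizeof(PaddedState) - sizeof(PackedState);
}
```

The exact sizes depend on the platform ABI, so the sketch reports the difference rather than hard-coding numbers. Comparing sizeof() before and after a change is a quick way to check whether a new field has grown a struct.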
Given the performance now seems improved in all cases, I plan on
pushing this change as a single commit.
David
Attachment | Content-Type | Size |
---|---|---|
v4-0001-Speedup-WindowAgg-code-by-moving-uncommon-code-ou.patch | application/octet-stream | 5.4 KB |
v4-0002-Optimize-WindowAgg-s-use-of-tuplestores.patch | application/octet-stream | 8.4 KB |
v4-0003-Experiment-with-WindowAggState-fields.patch | application/octet-stream | 2.3 KB |
performance_results.csv | text/csv | 1.1 KB |