Re: Optimize WindowAgg's use of tuplestores

From: David Rowley <dgrowleyml(at)gmail(dot)com>
To: Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>
Cc: David Rowley <dgrowley(at)gmail(dot)com>, pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: Optimize WindowAgg's use of tuplestores
Date: 2024-09-04 14:49:51
Message-ID: CAApHDvqPgFtwme2Zyf75BpMLwYr2mnUstDyPiP=EpudYuQTPPQ@mail.gmail.com
Lists: pgsql-hackers

On Mon, 19 Aug 2024 at 22:01, David Rowley <dgrowleyml(at)gmail(dot)com> wrote:
> To try and move this forward again, I adjusted the patch to use a
> static function with pg_noinline rather than unlikely. I don't think
> this will make much difference code generation wise, but I did think
> it was an improvement in code cleanliness. Patches attached.
>
> I did a round of benchmarking on an AMD Zen4 7945hx and on an Apple
> M2. I also graphed the results you sent so they're easier to compare
> with mine.
>
> 0001 is effectively the unlikely() patch for calculating the frame offsets.
> 0002 is the tuplestore_reset() patch
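The out-of-line approach described in the quoted text can be sketched roughly as below. This is only an illustration, not the code from the patch: my_noinline and my_unlikely are stand-ins written with GCC/Clang builtins for PostgreSQL's pg_noinline and unlikely(), and FrameState, calculate_frame_offsets() and frame_width() are hypothetical names. The idea is that the rarely needed setup work sits in its own non-inlined static function, so the hot path stays small.

```c
#include <stdbool.h>
#include <stdint.h>

/* Stand-ins for pg_noinline / unlikely(); assumed GCC or Clang. */
#if defined(__GNUC__) || defined(__clang__)
#define my_noinline __attribute__((noinline))
#define my_unlikely(x) __builtin_expect((x) != 0, 0)
#else
#define my_noinline
#define my_unlikely(x) (x)
#endif

/* Hypothetical state; not the real WindowAggState. */
typedef struct FrameState
{
    bool    offsets_ready;  /* have the offsets been computed yet? */
    int64_t start_offset;
    int64_t end_offset;
    int     calc_calls;     /* counts slow-path calls, for illustration */
} FrameState;

/*
 * Uncommon path: runs once, then the flag keeps callers out of here.
 * Marked noinline so the compiler does not merge it into the hot path.
 */
static my_noinline void
calculate_frame_offsets(FrameState *state)
{
    state->start_offset = 2;    /* placeholder values */
    state->end_offset = 7;
    state->offsets_ready = true;
    state->calc_calls++;
}

/* Hot path: a cheap flag test and, rarely, a call to the cold function. */
static inline int64_t
frame_width(FrameState *state)
{
    if (my_unlikely(!state->offsets_ready))
        calculate_frame_offsets(state);
    return state->end_offset - state->start_offset;
}
```

Calling frame_width() repeatedly on the same FrameState runs calculate_frame_offsets() only on the first call. The sketch shows the branch hint and the out-of-line function together; either can be used without the other.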

I was experimenting with this again. The 0002 patch added a
next_partition field to the WindowAggState struct, which made the
struct slightly bigger. I've now included a 0003 patch which shifts
some fields around in that struct to keep it the same size as it is on
master. Benchmarking with that removes the very tiny performance
regression. Please see the attached CSV file for the results. The
percentage row compares master to all patches applied. I also tested
this on an AMD 3990x machine along with fresh results from the AMD
7945hx laptop. Both of those machines come out faster on all tests
when comparing master to all 3 patches. With the Apple M2, there does
not seem to be much change in performance in the tests containing
fewer rows per partition: some are faster, some are slower, all within
typical noise fluctuations.
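
As a rough illustration of why moving fields around can keep a struct the same size after adding a field, here is a minimal sketch. The structs and field names below are made up and do not reflect the actual WindowAggState layout or the 0003 patch. The point is only that a small new field can grow a struct when it lands after the last member, yet cost nothing when it is placed in an existing alignment padding hole.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical "master" layout: padding follows flag_a so pos is aligned. */
typedef struct Master
{
    bool    flag_a;
    int64_t pos;
    int32_t status;
    int32_t count;
} Master;

/* New bool appended at the end: tail padding makes the struct grow. */
typedef struct PatchedNaive
{
    bool    flag_a;
    int64_t pos;
    int32_t status;
    int32_t count;
    bool    new_flag;
} PatchedNaive;

/* Same fields, with new_flag moved into the hole after flag_a. */
typedef struct PatchedReordered
{
    bool    flag_a;
    bool    new_flag;
    int64_t pos;
    int32_t status;
    int32_t count;
} PatchedReordered;

/* True when the reordered layout is no bigger than the original. */
static inline bool
reorder_keeps_size(void)
{
    return sizeof(PatchedReordered) == sizeof(Master);
}
```

On a typical 64-bit ABI, Master and PatchedReordered are both 24 bytes while PatchedNaive is 32. Exact sizes depend on the platform's alignment rules, so sizeof (or a tool such as pahole) is the way to check a real struct.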

Given the performance now seems improved in all cases, I plan on
pushing this change as a single commit.

David

Attachment                                                        Content-Type              Size
v4-0001-Speedup-WindowAgg-code-by-moving-uncommon-code-ou.patch   application/octet-stream  5.4 KB
v4-0002-Optimize-WindowAgg-s-use-of-tuplestores.patch             application/octet-stream  8.4 KB
v4-0003-Experiment-with-WindowAggState-fields.patch               application/octet-stream  2.3 KB
performance_results.csv                                           text/csv                  1.1 KB
