From: | Tatsuo Ishii <ishii(at)postgresql(dot)org> |
---|---|
To: | dgrowleyml(at)gmail(dot)com |
Cc: | ashutosh(dot)bapat(dot)oss(at)gmail(dot)com, pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: Add memory/disk usage for WindowAgg nodes in EXPLAIN |
Date: | 2024-09-06 04:21:31 |
Message-ID: | 20240906.132131.1196176059464159407.ishii@postgresql.org |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
Hi David,
>> I pushed the changes to WindowAgg so as not to call tuplestore_end()
>> on every partition. Can you rebase this patch over that change?
>>
>> It would be good to do this in a way that does not add any new state
>> to WindowAggState, you can see that I had to shuffle fields around in
>> that struct because the next_parition field would have caused the
>> struct to become larger. I've not looked closely, but I expect this
>> can be done by adding more code to tuplestore_updatemax() to also
>> track the disk space used if the current storage has gone to disk. I
>> expect the maxSpace field can be used for both, but we'd need another
>> bool field to track if the max used was by disk or memory.
I have created a patch in the direction you suggested. See attached
patch (v1-0001-Enhance-tuplestore.txt). To not confuse CFbot, the
extension is "txt", not "patch".
>> I think the performance of this would also need to be tested as it
>> means doing an lseek() on every tuplestore_clear() when we've gone to
>> disk. Probably that will be dominated by all the other overheads of a
>> tuplestore going to disk (i.e. dumptuples() etc), but it would be good
>> to check this. I suggest setting work_mem = 64 and making a test case
>> that only just spills to disk. Maybe do a few thousand partitions
>> worth of that and see if you can measure any slowdown.
I copied your shell script and slightly modified it then ran pgbench
with (1 10 100 1000 5000 10000) window partitions (see attached shell
script). In the script I set work_mem to 64kB. It seems for 10000,
1000 and 100 partitions, the performance difference seems
noises. However, for 10, 2, 1 partitions. I see large performance
degradation with the patched version: patched is slower than stock
master in 1.5% (10 partitions), 41% (2 partitions) and 55.7% (1
partition). See the attached graph.
Attachment | Content-Type | Size |
---|---|---|
unknown_filename | text/plain | 3.2 KB |
unknown_filename | text/plain | 720 bytes |
image/png | 20.0 KB |
From | Date | Subject | |
---|---|---|---|
Next Message | Thomas Munro | 2024-09-06 04:28:52 | Re: Trying out read streams in pgvector (an extension) |
Previous Message | Amit Langote | 2024-09-06 03:07:49 | Re: pgsql: Add more SQL/JSON constructor functions |