From: | Andrei Lepikhov <lepihov(at)gmail(dot)com> |
---|---|
To: | David Rowley <dgrowleyml(at)gmail(dot)com> |
Cc: | PostgreSQL Developers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
Subject: | Re: Add ExprState hashing for GROUP BY and hashed SubPlans |
Date: | 2024-10-31 02:30:06 |
Message-ID: | 6201c452-3506-4389-918c-a65568557e3f@gmail.com |
Lists: | pgsql-hackers |
On 10/31/24 08:16, David Rowley wrote:
> On Tue, 29 Oct 2024 at 22:47, David Rowley <dgrowleyml(at)gmail(dot)com> wrote:
>> I've attached an updated patch with a few other fixes. While checking
>> this tonight, I noticed that master does not use
>> SubPlanState.tab_eq_funcs for anything. I resisted removing that in
>> this patch. Perhaps a follow-on patch can remove that. I suspect it's
>> not been used for a long time now, but I didn't do the archaeology
>> work to find out.
>
> 3974bc319 removed the SubPlanState.tab_eq_funcs field, so here's a
> rebased patch.
Thanks for sharing this.
I still need to dive deeply into the code. But I have one annoying use
case where a user complained that SQL Server runs the query 4x faster
than Postgres, and I guess it is a good benchmark for your code.
This query is remarkable because of its high grouping computation load. Of
course, I can't provide the user's data, but I have prepared a synthetic
test to reproduce the case (see attachment).
Comparing master with and without your patch, the first thing I see is
much heavier memory usage (see the complete explains in the attachment):
Current master:
---------------
Partial HashAggregate (cost=54492.60..55588.03 rows=19917 width=889)
(actual time=20621.028..20642.664 rows=10176 loops=9)
Group Key: t1.x1, t1.x2, t1.x3, t1.x4, t1.x5
Batches: 1 Memory Usage: 74513kB
Patched:
--------
Partial HashAggregate (cost=54699.91..55799.69 rows=19996 width=889)
(actual time=57213.280..186216.604 rows=10302738 loops=9)
Group Key: t1.x1, t1.x2, t1.x3, t1.x4, t1.x5
Batches: 261 Memory Usage: 527905kB Disk Usage: 4832656kB
I wonder what causes the extra memory consumption; for now, it is hard to
tell whether the patch has a positive outcome.
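To put the two plan fragments above side by side, here is a minimal sketch in Python that only does arithmetic on the figures quoted in this message. The dictionary keys and the `ratio` helper are illustrative names, not anything from the patch or from PostgreSQL.

```python
# Figures copied from the two Partial HashAggregate nodes quoted above.
# Times are the second "actual time" value (ms); sizes are in kB.
master = {"time_ms": 20642.664, "rows": 10176, "memory_kb": 74513, "batches": 1}
patched = {"time_ms": 186216.604, "rows": 10302738, "memory_kb": 527905,
           "batches": 261, "disk_kb": 4832656}


def ratio(key):
    """Return patched / master for one metric (hypothetical helper)."""
    return patched[key] / master[key]


time_ratio = ratio("time_ms")
memory_ratio = ratio("memory_kb")
rows_ratio = ratio("rows")
# Disk usage exists only in the patched plan; convert kB to GiB.
disk_gib = patched["disk_kb"] / (1024 * 1024)

print(f"time:   {time_ratio:.2f}x")
print(f"memory: {memory_ratio:.2f}x")
print(f"rows:   {rows_ratio:.0f}x")
print(f"disk:   {disk_gib:.2f} GiB spilled over {patched['batches']} batches")
```

On these inputs the patched node takes roughly 9x longer and uses roughly 7x more memory, and it reports about 1000x more rows per loop than master, far above the planner's estimate of 19996.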
--
regards, Andrei Lepikhov
Attachment | Content-Type | Size |
---|---|---|
bench_results.txt | text/plain | 4.5 KB |
synth.sql | application/sql | 1.8 KB |