From: | Jeff Davis <pgsql(at)j-davis(dot)com> |
---|---|
To: | Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com> |
Cc: | pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: Memory-Bounded Hash Aggregation |
Date: | 2019-07-12 01:06:33 |
Message-ID: | 9be86fc1adc315f69b8af4b379087ab451008d8a.camel@j-davis.com |
Lists: | pgsql-hackers |
On Thu, 2019-07-11 at 17:55 +0200, Tomas Vondra wrote:
> Makes sense. I haven't thought about how the hybrid approach would be
> implemented very much, so I can't quite judge how complicated would it
> be to extend "approach 1" later. But if you think it's a sensible first
> step, I trust you. And I certainly agree we need something to compare
> the other approaches against.
Is this a duplicate of your previous email?
I'm slightly confused, but I will use the opportunity to put out another
WIP patch. The patch could use a few rounds of cleanup and quality
work, but the functionality is there and the performance seems
reasonable.
I rebased on master and fixed a few bugs, and most importantly, added
tests.
It seems to be working fine with grouping sets. It will take a little
longer to get good performance numbers, but even for group size of one,
I'm seeing HashAgg get close to Sort+Group in some cases.
You are right that the missed lookups appear to be costly, at least
when the data all fits in system memory. I think it's the cache misses,
because sometimes reducing work_mem improves performance. I'll try
tuning the number of buckets for the hash table and see if that helps.
If not, then the performance still seems pretty good to me.
Of course, HashAgg can beat sort for larger group sizes, but I'll try
to gather some more data on the cross-over point.
Regards,
Jeff Davis
Attachment | Content-Type | Size |
---|---|---|
hashagg-20190711.patch | text/x-patch | 75.5 KB |