Re: Memory-Bounded Hash Aggregation

From: Jeff Davis <pgsql(at)j-davis(dot)com>
To: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Memory-Bounded Hash Aggregation
Date: 2019-07-04 02:03:06
Message-ID: e13b52d4d8f9c10f7b58f39270213b868c99b685.camel@j-davis.com
Lists: pgsql-hackers

On Wed, 2019-07-03 at 02:17 +0200, Tomas Vondra wrote:
> What does "partitioned hash strategy" do? It's probably explained in
> one of the historical discussions, but I'm not sure which one. I
> assume it simply hashes the group keys and uses that to partition
> the data, and then passes it to hash aggregate.

Yes. When spilling, it is cheap to partition on the hash value at the
same time, which dramatically reduces the need to spill multiple times.
Previous discussions:

> Unfortunately the second link does not work :-(

It's supposed to be:

https://www.postgresql.org/message-id/CAGTBQpa__-NP7%3DkKwze_enkqw18vodRxKkOmNhxAPzqkruc-8g%40mail.gmail.com
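
To make the partitioned-spill idea above concrete, here is a minimal
standalone sketch (names and layout are hypothetical, not the actual
executor code): once the hash table exceeds its memory budget, each
overflow tuple is routed to one of N spill partitions by bits of its
group hash, so each partition can later be re-aggregated with a hash
table roughly 1/N the size.

    #include <stdint.h>
    #include <stdio.h>

    #define NPARTITIONS     32      /* power of two */
    #define PARTITION_BITS  5       /* log2(NPARTITIONS) */

    typedef struct SpillTuple
    {
        uint32_t    hash;           /* hash of the grouping key */
        /* ... tuple payload would follow ... */
    } SpillTuple;

    /* one spill file per partition, opened during spill setup */
    static FILE *partition_file[NPARTITIONS];

    static void
    spill_tuple(const SpillTuple *tup)
    {
        /*
         * Take the high bits for the partition number, leaving the
         * low bits for the per-partition hash table later.
         */
        int     partno = (tup->hash >> (32 - PARTITION_BITS))
                         & (NPARTITIONS - 1);

        fwrite(tup, sizeof(SpillTuple), 1, partition_file[partno]);
    }

Tuples for the same group always land in the same partition, so one
pass of partitioning is normally enough: each spill file can then be
aggregated independently within the memory budget.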

> I'm not going to block Approach 1, although I'd really like to see
> something that helps with array_agg.

I have a WIP patch that I just posted. It doesn't yet work with
ARRAY_AGG, but I think it can be made to work by evicting the entire
hash table, serializing the transition states, and then later combining
them.
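
Very roughly, and only as a hedged sketch of that eviction idea (the
names below are made up; the real hooks are the aggregate's
serial/deserial/combine functions), with a count-like state standing
in for a real transition state:

    #include <stdint.h>
    #include <stdio.h>

    typedef struct GroupEntry
    {
        int64_t     key;        /* grouping key */
        int64_t     state;      /* transition state (a count here) */
    } GroupEntry;

    /* Evict: serialize every entry; the caller then empties the table. */
    static void
    evict_all(const GroupEntry *entries, int nentries, FILE *spill)
    {
        for (int i = 0; i < nentries; i++)
            fwrite(&entries[i], sizeof(GroupEntry), 1, spill);
    }

    /*
     * Later pass: read spilled states back and merge them by key.
     * For a count the combine step is just addition; a real
     * aggregate would call its combine function on the two
     * deserialized states.
     */
    static void
    combine_spilled(FILE *spill, GroupEntry *table,
                    int (*lookup_or_insert) (GroupEntry *, int64_t))
    {
        GroupEntry  ent;

        while (fread(&ent, sizeof(GroupEntry), 1, spill) == 1)
        {
            int     i = lookup_or_insert(table, ent.key);

            table[i].state += ent.state;    /* "combinefn" */
        }
    }

The hard part for ARRAY_AGG is making its transition state
serializable at all; the mechanics above stay the same.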

> Aren't all three approaches a way to "fix" hash aggregate? In any
> case, it's certainly reasonable to make incremental changes. The
> question is whether "approach 1" is a sensible step towards some
> form of "approach 3".

Disk-based hashing certainly seems like a reasonable algorithm on
paper, and it has some potential advantages over sorting. It seems
sensible to me to explore the disk-based hashing strategy first; then
we would at least know what (if anything) we'd be missing by going
with the hybrid approach later.

There's also a fair amount of design space to explore in the hybrid
strategy. That could take a while to converge, especially if we don't
have anything in place to compare against.

> > * It means we have a hash table and sort running concurrently, each
> > using memory. Andres said this might not be a problem[3], but I'm
> > not convinced that the problem is zero. If you use small work_mem
> > for the write phase of sorting, you'll end up with a lot of runs to
> > merge later, and that has some kind of cost.
> >
>
> Why would we need to do both concurrently? I thought we'd empty the
> hash table before doing the sort, no?

So you are saying we spill the tuples into a tuplestore, then feed the
tuplestore through a tuplesort? Seems inefficient, but I guess we can.
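
For clarity, a hedged standalone sketch of that flow, with stdio and
qsort() standing in for the tuplestore and tuplesort: overflow tuples
get written out unsorted, and only afterwards is the whole store
re-read and sorted so that groups become contiguous. The extra write
and read of every spilled tuple is the inefficiency I mean.

    #include <stdint.h>
    #include <stdio.h>
    #include <stdlib.h>

    typedef struct SpilledTuple
    {
        uint32_t    groupkey;
        /* ... payload ... */
    } SpilledTuple;

    static int
    cmp_groupkey(const void *a, const void *b)
    {
        uint32_t    ka = ((const SpilledTuple *) a)->groupkey;
        uint32_t    kb = ((const SpilledTuple *) b)->groupkey;

        return (ka > kb) - (ka < kb);
    }

    /* Phase 1: the hash table's overflow goes to an unsorted store. */
    static void
    spill(FILE *store, const SpilledTuple *tup)
    {
        fwrite(tup, sizeof(SpilledTuple), 1, store);
    }

    /*
     * Phase 2: after the hash table is emptied, re-read the store
     * and sort it (standing in for feeding a tuplesort).
     */
    static SpilledTuple *
    sort_spilled(FILE *store, long *ntuples)
    {
        long        n;
        SpilledTuple *tuples;

        fseek(store, 0, SEEK_END);
        n = ftell(store) / sizeof(SpilledTuple);
        rewind(store);

        tuples = malloc(n * sizeof(SpilledTuple));
        n = (long) fread(tuples, sizeof(SpilledTuple), n, store);
        qsort(tuples, n, sizeof(SpilledTuple), cmp_groupkey);

        *ntuples = n;
        return tuples;
    }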

> Do we actually need to handle that case? How many such aggregates are
> there? I think it's OK to just ignore that case (and keep doing what
> we do now), and require serial/deserial functions for anything
> better.

Punting on a few cases is fine with me, if the user has a way to fix
it.
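
Concretely, the punt could be as simple as this hedged sketch (the
struct is invented, but it mirrors the aggserialfn/aggdeserialfn
catalog columns): an aggregate without serial/deserial support just
opts out of spilling and keeps today's behavior.

    #include <stdbool.h>
    #include <stdint.h>

    typedef uint32_t Oid;
    #define InvalidOid ((Oid) 0)

    typedef struct AggSpillInfo
    {
        Oid     serialfn;       /* cf. pg_aggregate.aggserialfn */
        Oid     deserialfn;     /* cf. pg_aggregate.aggdeserialfn */
    } AggSpillInfo;

    static bool
    agg_can_spill(const AggSpillInfo *agg)
    {
        /*
         * Without both functions we can't evict transition states,
         * so keep the current behavior for this aggregate.
         */
        return agg->serialfn != InvalidOid &&
               agg->deserialfn != InvalidOid;
    }

The user's fix is then to supply SERIALFUNC and DESERIALFUNC (and a
COMBINEFUNC) in CREATE AGGREGATE.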

Regards,
Jeff Davis
