From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Jeff Davis <pgsql(at)j-davis(dot)com>
Cc: Tomas Vondra <tv(at)fuzzy(dot)cz>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: 9.5: Memory-bounded HashAgg
Date: 2014-08-14 14:06:54
Message-ID: 2219.1408025214@sss.pgh.pa.us
Lists: pgsql-hackers
Jeff Davis <pgsql(at)j-davis(dot)com> writes:
> I think the hash-join like approach is reasonable, but I also think
> you're going to run into a lot of challenges that make it more complex
> for HashAgg. For instance, let's say you have the query:
> SELECT x, array_agg(y) FROM foo GROUP BY x;
> Say the transition state is an array (for the sake of simplicity), so
> the hash table has something like:
> 1000 => {7, 8, 9}
> 1001 => {12, 13, 14}
> You run out of memory and need to split the hash table, so you scan the
> hash table and find that group 1001 needs to be written to disk. So you
> serialize the key and array and write them out.
> Then the next tuple you get is (1001, 19). What do you do? Create a new
> group 1001 => {19} (how do you combine it later with the first one)? Or
> try to fetch the existing group 1001 from disk and advance it (horrible
> random I/O)?
If you're following the HashJoin model, then what you do is the same thing
it does: you write the input tuple back out to the pending batch file for
the hash partition that now contains key 1001, whence it will be processed
when you get to that partition. I don't see that there's any special case
here.
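
To make that concrete, here is a rough C sketch of the batching decision. The helper names (partition_for, advance_in_memory_group, process_input_tuple) and the fixed partition count are illustrative assumptions, not PostgreSQL's actual executor API; the point is only that a tuple whose key falls in an already-spilled partition is appended to that partition's batch file instead of being aggregated in memory or matched against the spilled state.

/*
 * Minimal sketch of the HashJoin-style batching idea for HashAgg.
 * Hypothetical helper names; not actual PostgreSQL internals.
 */
#include <stdio.h>
#include <stdint.h>
#include <stdbool.h>

#define NUM_PARTITIONS 4

typedef struct Tuple { int64_t key; int64_t value; } Tuple;

typedef struct Partition
{
    bool  in_memory;    /* false once this partition has been spilled */
    FILE *batch_file;   /* pending tuples to aggregate in a later pass */
} Partition;

static Partition partitions[NUM_PARTITIONS];

static int
partition_for(int64_t key)
{
    return (int) (key % NUM_PARTITIONS);    /* stand-in for a real hash */
}

/* Stand-in for advancing the transition state of an in-memory group. */
static void
advance_in_memory_group(const Tuple *tup)
{
    printf("advance group %lld with value %lld in memory\n",
           (long long) tup->key, (long long) tup->value);
}

static void
process_input_tuple(const Tuple *tup)
{
    Partition *part = &partitions[partition_for(tup->key)];

    if (part->in_memory)
    {
        /* The group (or room for it) is still in the in-memory table. */
        advance_in_memory_group(tup);
    }
    else
    {
        /*
         * The partition holding this key has been spilled: write the raw
         * input tuple to its batch file.  It gets aggregated when that
         * partition's batch is processed, so there is no duplicate group
         * and no random I/O to fetch the spilled state back.
         */
        fwrite(tup, sizeof(Tuple), 1, part->batch_file);
    }
}

int
main(void)
{
    for (int i = 0; i < NUM_PARTITIONS; i++)
    {
        partitions[i].in_memory = true;
        partitions[i].batch_file = tmpfile();
    }

    /* Pretend the partition that holds key 1001 was just spilled. */
    partitions[partition_for(1001)].in_memory = false;

    Tuple t = { 1001, 19 };
    process_input_tuple(&t);    /* goes to the batch file */

    Tuple u = { 1000, 7 };
    process_input_tuple(&u);    /* still aggregated in memory */

    return 0;
}

This mirrors what hash join does when a tuple belongs to a batch other than the current one: save it to that batch's temp file and deal with it when that batch is processed.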
The fly in the ointment is how to serialize a partially-computed aggregate
state value to disk, if it's not of a defined SQL type.
regards, tom lane