Quick Links

Re: Memory-Bounded Hash Aggregation

From:	Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
To:	Jeff Davis <pgsql(at)j-davis(dot)com>, Peter Geoghegan <pg(at)bowt(dot)ie>
Cc:	Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, Taylor Vesely <tvesely(at)pivotal(dot)io>, Adam Lee <ali(at)pivotal(dot)io>, Melanie Plageman <mplageman(at)pivotal(dot)io>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: Memory-Bounded Hash Aggregation
Date:	2020-02-04 16:10:15
Message-ID:	b2c6d4c1-9298-4dfd-2763-cd080b23a826@iki.fi
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-hackers

On 03/02/2020 20:29, Jeff Davis wrote:
> 1. Use a minheap for the freelist. The original design used an array
> that had to be sorted between a read (which frees a block) and a write
> (which needs to sort the array to consume the lowest block number). The
> comments said:
>
> * sorted. This is an efficient way to handle it because we expect
> cycles
> * of releasing many blocks followed by re-using many blocks, due to
> * the larger read buffer.
>
> But I didn't find a case where that actually wins over a simple
> minheap. With that in mind, a minheap seems closer to what one might
> expect for that purpose, and more robust when the assumptions don't
> hold up as well. If someone knows of a case where the re-sorting
> behavior is important, please let me know.

A minheap certainly seems more natural for that. I guess re-sorting the
array would be faster in the extreme case that you free almost all of
the blocks, and then consume almost all of the blocks, but I don't think
the usage pattern is ever that extreme. Because if all the data fit in
memory, we wouldn't be spilling in the first place.

I wonder if a more advanced heap like the pairing heap or fibonacci heap
would perform better? Probably doesn't matter in practice, so better
keep it simple...

> Changing to a minheap effectively solves the problem for HashAgg,
> though in theory the memory consumption of the freelist itself could
> become significant (though it's only 0.1% of the free space being
> tracked).

We could fairly easily spill parts of the freelist to disk, too, if
necessary. But it's probably not worth the trouble.

> 2. Lazily-allocate the read buffer. The write buffer was always lazily-
> allocated, so this patch creates better symmetry. More importantly, it
> means freshly-rewound tapes don't have any buffer allocated, so it
> greatly expands the number of tapes that can be managed efficiently as
> long as only a limited number are active at once.

Makes sense.

> 3. Allow expanding the number of tapes for an existing tape set. This
> is useful for HashAgg, which doesn't know how many tapes will be needed
> in advance.

I'd love to change the LogicalTape API so that you could allocate and
free tapes more freely. I wrote a patch to do that, as part of replacing
tuplesort.c's polyphase algorithm with a simpler one (see [1]), but I
never got around to committing it. Maybe the time is ripe to do that now?

[1]
https://www.postgresql.org/message-id/420a0ec7-602c-d406-1e75-1ef7ddc58d83@iki.fi

- Heikki

In response to

Re: Memory-Bounded Hash Aggregation at 2020-02-03 18:29:55 from Jeff Davis

Responses

Re: Memory-Bounded Hash Aggregation at 2020-02-05 19:56:11 from Jeff Davis

Browse pgsql-hackers by date

	From	Date	Subject
Next Message	Konstantin Knizhnik	2020-02-04 16:47:47	Re: [Proposal] Global temporary tables
Previous Message	曾文旌 (义从)	2020-02-04 15:01:37	Re: [Proposal] Global temporary tables