From: | Andres Freund <andres(at)anarazel(dot)de> |
---|---|
To: | Heikki Linnakangas <hlinnaka(at)iki(dot)fi> |
Cc: | pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Jeff Davis <pgsql(at)j-davis(dot)com> |
Subject: | Re: Spilling hashed SetOps and aggregates to disk |
Date: | 2018-06-04 18:52:05 |
Message-ID: | 20180604185205.epue25jzpavokupf@alap3.anarazel.de |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
Hi,
On 2018-06-04 10:32:47 +0200, Heikki Linnakangas wrote:
> Hash Aggs and SetOps are currently not spilled to disk. If the planner's
> estimate on the number of entries is badly off, you might run out of memory
> at execution time, if all the entries don't fit in memory.
>
> For HashAggs, this was discussed in depth a couple of years ago at [1].
> SetOps have the same issue, but fixing that is simpler, as you don't need to
> handle arbitrary aggregate transition values and functions.
That part has gotten a bit easier since, because we have serialize /
deserialize operations for aggregates these days.
I wonder whether, at least for aggregates, the better fix wouldn't be to
switch to feeding the tuples into tuplesort upon memory exhaustion and
doing a sort based aggregate. We have most of the infrastructure to do
that due to grouping sets. It's just the pre-existing in-memory tuples
that'd be problematic, in that the current transition values would need
to serialized as well. But with a stable sort that'd not be
particularly problematic, and that could easily be achieved.
Greetings,
Andres Freund
From | Date | Subject | |
---|---|---|---|
Next Message | Andres Freund | 2018-06-04 18:55:28 | Re: plans for PostgreSQL 12 |
Previous Message | Andres Freund | 2018-06-04 18:42:23 | Re: [HACKERS] Moving relation extension locks out of heavyweight lock manager |