Re: Default setting for enable_hashagg_disk

From: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
To: Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>
Cc: Jeff Davis <pgsql(at)j-davis(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Default setting for enable_hashagg_disk
Date: 2020-07-13 14:11:59
Message-ID: 20200713141159.yc5tbzkrhwzj3ru5@development
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-docs pgsql-hackers

On Mon, Jul 13, 2020 at 01:51:42PM +0200, Peter Eisentraut wrote:
>On 2020-04-07 20:20, Jeff Davis wrote:
>>Now that we have Disk-based Hash Aggregation, there are a lot more
>>situations where the planner can choose HashAgg. The
>>enable_hashagg_disk GUC, if set to true, chooses HashAgg based on
>>costing. If false, it only generates a HashAgg path if it thinks it
>>will fit in work_mem, similar to the old behavior (though it wlil now
>>spill to disk if the planner was wrong about it fitting in work_mem).
>>The current default is true.
>
>I have an anecdote that might be related to this discussion.
>
>I was running an unrelated benchmark suite. With PostgreSQL 12, one
>query ran out of memory. With PostgreSQL 13, the same query instead
>ran out of disk space. I bisected this to the introduction of
>disk-based hash aggregation. Of course, the very point of that
>feature is to eliminate the out of memory and make use of disk space
>instead. But running out of disk space is likely to be a worse
>experience than running out of memory. Also, while it's relatively
>easy to limit memory use both in PostgreSQL and in the kernel, it is
>difficult or impossible to limit disk space use in a similar way.
>

Why is running out of disk space worse experience than running out of
memory?

Sure, it'll take longer and ultimately the query fails (and if it fills
the device used by the WAL then it may also cause shutdown of the main
instance due to inability to write WAL). But that can be prevented by
moving the temp tablespace and/or setting the temp file limit, as
already mentioned.

With OOM, if the kernel OOM killer decides to act, it may easily bring
down the instance too, and there are much less options to prevent that.

regards

--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

In response to

Responses

Browse pgsql-docs by date

  From Date Subject
Next Message David Rowley 2020-07-13 14:25:31 Re: Default setting for enable_hashagg_disk
Previous Message Pavel Stehule 2020-07-13 13:23:38 Re: docs: psql and variable interpolation

Browse pgsql-hackers by date

  From Date Subject
Next Message David Rowley 2020-07-13 14:25:31 Re: Default setting for enable_hashagg_disk
Previous Message Daniel Gustafsson 2020-07-13 14:07:32 Commitfest 2020-07 almost halfway