From: Stephen Frost <sfrost(at)snowman(dot)net>
To: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
Cc: Andres Freund <andres(at)anarazel(dot)de>, Bruce Momjian <bruce(at)momjian(dot)us>, Jeff Davis <pgsql(at)j-davis(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, David Rowley <dgrowleyml(at)gmail(dot)com>, Justin Pryzby <pryzby(at)telsasoft(dot)com>, Melanie Plageman <melanieplageman(at)gmail(dot)com>, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Default setting for enable_hashagg_disk
Date: 2020-07-08 14:00:37
Message-ID: 20200708140037.GI3125@tamriel.snowman.net
Lists: pgsql-docs pgsql-hackers
Greetings,
* Alvaro Herrera (alvherre(at)2ndquadrant(dot)com) wrote:
> On 2020-Jun-25, Andres Freund wrote:
>
> > >My point here is that maybe we don't need to offer a GUC to explicitly
> > >turn spilling off; it seems sufficient to let users change work_mem so
> > >that spilling will naturally not occur. Why do we need more?
> >
> > That's not really a useful escape hatch, because it'll often lead to
> > other nodes using more memory.
>
> Ah -- other nodes in the same query -- you're right, that's not good.
It's exactly how the system has operated for, basically, forever, for
every node type. Yes, it'd be good to have a way to manage the overall
amount of memory that a query is allowed to use, but that's a huge
change, and inventing some new 'hash_mem' or similar GUC doesn't strike
me as a move in the right direction. Are we going to have sort_mem
next? What if allowing one aggregate a large hash table would be good,
but letting another aggregate in the same query use that much memory
would run the system out of memory? Yes, we need to do better, but
inventing new node_mem GUCs isn't the direction to go in.
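To illustrate the per-node issue (a sketch; the tables and the plan
shape are made up, not from any case in this thread): work_mem is a
per-node limit, so a single query can legitimately use several
multiples of it, one per sort or hash node, and raising it for one
node raises it for all of them:

    -- Hypothetical tables; each sort/hash node below may use up to
    -- work_mem, so the whole query may use roughly 3x that amount.
    SET work_mem = '256MB';
    EXPLAIN (ANALYZE, COSTS OFF)
    SELECT a.k, count(*)
      FROM big_a a
      JOIN big_b b USING (k)    -- Hash Join: one work_mem-sized hash table
     GROUP BY a.k               -- HashAggregate: another work_mem
     ORDER BY count(*) DESC;    -- Sort: yet another work_mem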
That HashAgg previously didn't care that it was going way over work_mem
was, if anything, a bug. Inventing new GUCs under duress this late in
the cycle seems like a *really* bad idea. Yes, people will have to
adjust work_mem if they want these queries to keep using a ton of
memory when the planner didn't think they'd actually need that much.
But then, in lots of the cases I suspect you're worrying about, the
stats aren't actually that far off, and people already increased
work_mem to get the HashAgg plan in the first place.
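For anyone who does hit this, the adjustment is straightforward (a
sketch; the table is hypothetical and the exact EXPLAIN wording in v13
may differ):

    -- In v13, EXPLAIN ANALYZE on a HashAggregate reports spilling,
    -- e.g. "Disk Usage" / "HashAgg Batches" lines.
    EXPLAIN (ANALYZE, COSTS OFF)
    SELECT customer_id, sum(amount)
      FROM orders
     GROUP BY customer_id;

    -- If it spills and the old in-memory behavior is wanted, bump
    -- work_mem for just this session (or SET LOCAL in a transaction)
    -- rather than globally:
    SET work_mem = '512MB';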
I'm also in favor of having enable_hashagg_disk default to true, just
like all of the other enable_* GUCs.
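(For reference, a sketch of checking that in a session; the GUC is the
one from the 13 beta cycle under discussion here:

    SHOW enable_hashagg;        -- on by default, like other enable_* GUCs
    SHOW enable_hashagg_disk;   -- expect: on, under this proposal

)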
Thanks,
Stephen