From: | Jeff Davis <pgsql(at)j-davis(dot)com> |
---|---|
To: | Robert Haas <robertmhaas(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de> |
Cc: | David Rowley <dgrowleyml(at)gmail(dot)com>, Justin Pryzby <pryzby(at)telsasoft(dot)com>, Melanie Plageman <melanieplageman(at)gmail(dot)com>, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Default setting for enable_hashagg_disk |
Date: | 2020-06-25 16:14:56 |
Message-ID: | 1e75737369e1b6182ebaef7d6bbe098ad620bbb6.camel@j-davis.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-docs pgsql-hackers |
On Wed, 2020-06-24 at 15:28 -0400, Robert Haas wrote:
> On Wed, Jun 24, 2020 at 3:14 PM Andres Freund <andres(at)anarazel(dot)de>
> wrote:
> > FWIW, my gut feeling is that we'll end up have to separate the
> > "execution time" spilling from using plain work mem, because it'll
> > trigger spilling too often. E.g. if the plan isn't expected to
> > spill,
> > only spill at 10 x work_mem or something like that. Or we'll need
> > better management of temp file data when there's plenty memory
> > available.
...
> I think that's actually pretty appealing. Separating the memory we
> plan to use from the memory we're willing to use before spilling
> seems
> like a good idea in general, and I think we should probably also do
> it
> in other places - like sorts.
I'm trying to make sense of this. Let's say there are two GUCs:
planner_work_mem=16MB and executor_work_mem=32MB.
And let's say a query comes along and generates a HashAgg path, and the
planner (correctly) thinks if you put all the groups in memory at once,
it would be 24MB. Then the planner, using planner_work_mem, would think
spilling was necessary, and generate a cost that involves spilling.
Then it's going to generate a Sort+Group path, as well. And perhaps it
estimates that sorting all of the tuples in memory would also take
24MB, so it generates a cost that involves spilling to disk.
But it has to choose one of them. We've penalized plans at risk of
spilling to disk, but what's the point? The planner needs to choose one
of them, and both are at risk of spilling to disk.
Regards,
Jeff Davis
From | Date | Subject | |
---|---|---|---|
Next Message | Jeff Davis | 2020-06-25 16:24:52 | Re: Default setting for enable_hashagg_disk |
Previous Message | Robert Haas | 2020-06-25 15:46:54 | Re: Default setting for enable_hashagg_disk |
From | Date | Subject | |
---|---|---|---|
Next Message | Jeff Davis | 2020-06-25 16:24:52 | Re: Default setting for enable_hashagg_disk |
Previous Message | Tom Lane | 2020-06-25 16:07:05 | Re: Why forbid "INSERT INTO t () VALUES ();" |