From: | Justin Pryzby <pryzby(at)telsasoft(dot)com> |
---|---|
To: | Bruce Momjian <bruce(at)momjian(dot)us> |
Cc: | David Rowley <dgrowleyml(at)gmail(dot)com>, Jeff Davis <pgsql(at)j-davis(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Melanie Plageman <melanieplageman(at)gmail(dot)com>, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: Default setting for enable_hashagg_disk |
Date: | 2020-06-24 12:38:43 |
Message-ID: | 20200624123843.GX4107@telsasoft.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-docs pgsql-hackers |
On Wed, Jun 24, 2020 at 05:06:28AM -0400, Bruce Momjian wrote:
> On Wed, Jun 24, 2020 at 02:11:57PM +1200, David Rowley wrote:
> > On Tue, 23 Jun 2020 at 08:24, Jeff Davis <pgsql(at)j-davis(dot)com> wrote:
> > > Another way of looking at it is that the weird behavior is already
> > > there in v12, so there are already users relying on this weird behavior
> > > as a crutch for some other planner mistake. The question is whether we
> > > want to:
> > >
> > > (a) take the weird behavior away now as a consequence of implementing
> > > disk-based HashAgg; or
> > > (b) support the weird behavior forever; or
> > > (c) introduce a GUC now to help transition away from the weird behavior
> > >
> > > The danger with (c) is that it gives users more time to become more
> > > reliant on the weird behavior; and worse, a GUC could be seen as an
> > > endorsement of the weird behavior rather than a path to eliminating it.
> > > So we could intend to do (c) and end up with (b). We can mitigate this
> > > with documentation warnings, perhaps.
> >
> > So, I have a few thoughts on this subject. I understand both problem
> > cases have been mentioned before on this thread, but just to reiterate
> > the two problem cases that we really would rather people didn't hit.
>
> I appreciated this summary since I wasn't fully following the issues.
>
> > As for GUCs to try to help the group of users who, *I'm certain*, will
> > have problems with PG13's plan choice. I think the overloaded
> > enable_hashagg option is a really nice compromise. We don't really
> > have any other executor node type that has multiple GUCs controlling
> > its behaviour, so I believe it would be nice to keep it that way.
...
> It would seem merge join has almost the same complexities as the new
> hash join code, since it can spill to disk doing sorts for merge joins,
> and adjusting work_mem is the only way to control that spill to disk. I
> don't remember anyone complaining about spills to disk during merge
> join, so I am unclear why we would need a such control for hash join.
It loooks like merge join was new in 8.3. I don't think that's a good analogy,
since the old behavior was still available with enable_mergejoin=off.
I think a better analogy would be if we now changed sort nodes beneath merge
join to use at most 0.5*work_mem, with no way of going back to using
1.0*work_mem.
--
Justin
From | Date | Subject | |
---|---|---|---|
Next Message | Bruce Momjian | 2020-06-24 17:08:34 | Re: Default setting for enable_hashagg_disk |
Previous Message | David Rowley | 2020-06-24 12:24:29 | Re: Default setting for enable_hashagg_disk |
From | Date | Subject | |
---|---|---|---|
Next Message | ROS Didier | 2020-06-24 12:39:38 | RE: PostgreSQL and big data - FDW |
Previous Message | David Rowley | 2020-06-24 12:24:29 | Re: Default setting for enable_hashagg_disk |