From: | Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com> |
---|---|
To: | Jeff Davis <pgsql(at)j-davis(dot)com> |
Cc: | Peter Geoghegan <pg(at)bowt(dot)ie>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
Subject: | Re: Disk-based hash aggregate's cost model |
Date: | 2020-08-30 00:26:20 |
Message-ID: | 20200830002620.pabcrarjbv3j3dlj@development |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Fri, Aug 28, 2020 at 06:32:38PM -0700, Jeff Davis wrote:
>On Thu, 2020-08-27 at 17:28 -0700, Peter Geoghegan wrote:
>> We have a Postgres 13 open item for Disk-based hash aggregate, which
>> is the only non-trivial open item. There is a general concern about
>> how often we get disk-based hash aggregation when work_mem is
>> particularly low, and recursion seems unavoidable. This is generally
>> thought to be a problem in the costing.
>
>We discussed two approaches to tweaking the cost model:
>
>1. Penalize HashAgg disk costs by a constant amount. It seems to be
>chosen a little too often, and we can reduce the number of plan
>changes.
>
>2. Treat recursive HashAgg spilling skeptically, and heavily penalize
>recursive spilling.
>
>The problem with approach #2 is that we have a default hash mem of 4MB,
>and real systems have a lot more than that. In this scenario, recursive
>spilling can beat Sort by a lot.
>
I think the main issue is that we're mostly speculating what's wrong.
I've shared some measurements and symptoms, and we've discussed what
might be causing it, but I'm not really sure we know for sure.
I really dislike (1) because it seems more like "We don't know what's
wrong so we'll penalize hashagg," kind of solution. A much more
principled solution would be to tweak the costing accordingly, not just
by adding some constant. For (2) it really depends if recursive spilling
is really the problem here. In the examples I shared, the number of
partitions/batches was very different, but the query duration was
mostly independent (almost constant).
FWIW I still haven't seen any explanation why the current code spills
more data than the CP_SMALL_TLIST patch (which was reverted).
regards
--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
From | Date | Subject | |
---|---|---|---|
Next Message | Tom Lane | 2020-08-30 00:39:08 | Re: Background writer and checkpointer in crash recovery |
Previous Message | Thomas Munro | 2020-08-30 00:12:15 | Background writer and checkpointer in crash recovery |