Re: Treating work_mem as a shared resource (Was: Parallel Hash take II)

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Serge Rielau <serge(at)rielau(dot)com>
Cc: David Rowley <david(dot)rowley(at)2ndquadrant(dot)com>, Peter Geoghegan <pg(at)bowt(dot)ie>, Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Rushabh Lathia <rushabh(dot)lathia(at)gmail(dot)com>, Prabhat Sahu <prabhat(dot)sahu(at)enterprisedb(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, Rafia Sabih <rafia(dot)sabih(at)enterprisedb(dot)com>, Ashutosh Bapat <ashutosh(dot)bapat(at)enterprisedb(dot)com>, Haribabu Kommi <kommi(dot)haribabu(at)gmail(dot)com>, Oleg Golovanov <rentech(at)mail(dot)ru>
Subject: Re: Treating work_mem as a shared resource (Was: Parallel Hash take II)
Date: 2017-11-17 15:31:19
Message-ID: CA+TgmoaXBEEsn7pqFGGNqXh0RQV1GDZuXL00cqfxnJBoBAMYUQ@mail.gmail.com
Lists: pgsql-hackers

On Thu, Nov 16, 2017 at 11:50 AM, Serge Rielau <serge(at)rielau(dot)com> wrote:

> Just as you have, we have also considered holistic provisioning of
> work_mem across all consumers, but we find that to be too complex.
> Having an “emergency fund” in shared memory is also an option, but I find
> it too limiting.
>

I agree.

I think this is basically a planning problem. For example, say we wanted
to have work_mem_per_query instead of work_mem_per_node. There is an
obvious design: consider memory use as an independent dimension of merit
during path generation and comparison (less is better). Discard candidate
paths whose memory use exceeds the work_mem_per_query budget unless there
are no other alternatives. At the end of planning, pick the cheapest path
that survived the memory-budget filter. Now, this has the problem that it
would make planning more expensive (because we'd hang on to more paths for
longer) but it solves a lot of other problems. If there's no memory
pressure, we can use memory like mad even when it doesn't save much, but
when we have to pick between using more memory for one part of the plan and
using more memory for another part of the plan, the choice that does the
best job reducing overall execution time will win. Awesome.
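
To make that concrete, here's a toy standalone-C sketch of the filter I
have in mind (the struct and function names are made up for illustration,
not the planner's real data structures, and the numbers are invented):

#include <stdio.h>

/* Toy stand-ins for planner paths: just a cost and a memory estimate. */
typedef struct Path
{
    const char *desc;
    double      cost;       /* estimated execution cost */
    double      mem_bytes;  /* estimated peak memory use */
} Path;

/*
 * Pick the cheapest path whose memory estimate fits under the per-query
 * budget; if nothing fits, fall back to the cheapest path overall so we
 * never end up with no plan at all.
 */
static const Path *
choose_path(const Path *paths, int npaths, double budget_bytes)
{
    const Path *best_within = NULL;
    const Path *best_overall = NULL;
    int         i;

    for (i = 0; i < npaths; i++)
    {
        const Path *p = &paths[i];

        if (best_overall == NULL || p->cost < best_overall->cost)
            best_overall = p;
        if (p->mem_bytes <= budget_bytes &&
            (best_within == NULL || p->cost < best_within->cost))
            best_within = p;
    }

    return best_within ? best_within : best_overall;
}

int
main(void)
{
    Path        candidates[] = {
        {"nested loop", 1000000.0, 64.0 * 1024},
        {"hash join", 999999.0, 1024.0 * 1024 * 1024},
        {"merge join", 1050000.0, 8.0 * 1024 * 1024},
    };
    double      budget = 128.0 * 1024 * 1024;   /* pretend work_mem_per_query */
    const Path *winner = choose_path(candidates, 3, budget);

    printf("chose: %s\n", winner->desc);
    return 0;
}

Run with the 128MB budget above it picks the nested loop; raise the budget
past a gigabyte and the hash join wins again.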

We could also do more localized variants of this that don't provide hard
guarantees but do tend to avoid squandering resources. I don't think that
we can directly incorporate memory use into cost because that will distort
the costs of higher-level nodes in the plan tree; cost needs to mean
execution time. However, what we could do is refuse to replace a more
expensive path in a relation's path list with a cheaper one when the
savings are small and the cheaper path uses a lot more memory. That way,
you wouldn't replace a nested loop that costs a million units with a hash
join that costs 999,999 units but uses a GB of RAM; you'd save the hash
join for cases where we think it will help significantly.
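
Something like this acceptance test, say (again just an illustrative
sketch, with arbitrary thresholds rather than anything I've benchmarked):

#include <stdbool.h>
#include <stdio.h>

/*
 * Toy version of the pathlist acceptance test.  A plain cost comparison
 * would always take the cheaper path; here we additionally refuse the
 * swap when the cost saving is tiny but the memory bill grows by a
 * large factor.  Both thresholds are invented for illustration.
 */
static bool
should_replace(double old_cost, double old_mem,
               double new_cost, double new_mem)
{
    const double min_saving_fraction = 0.01;    /* at least 1% cheaper... */
    const double max_mem_growth = 4.0;          /* ...or stay within 4x memory */

    if (new_cost >= old_cost)
        return false;           /* not cheaper at all */

    if ((old_cost - new_cost) / old_cost < min_saving_fraction &&
        new_mem > old_mem * max_mem_growth)
        return false;           /* cheaper, but not by enough to buy that memory */

    return true;
}

int
main(void)
{
    /* nested loop at 1,000,000 units vs. hash join at 999,999 units and 1GB */
    printf("%s\n",
           should_replace(1000000.0, 64.0 * 1024,
                          999999.0, 1024.0 * 1024 * 1024) ? "replace" : "keep");
    return 0;
}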

Yet another thing we could do is to try to get nodes to voluntarily use
less than work_mem when possible. This is particularly an issue for
sorts. A 2-batch hash join is so much more expensive than a single-batch
hash join that it's almost never going to make sense unless we have no
realistic alternative, although I suppose a 64-batch hash join might not be
that different from a 32-batch hash join. But for sorts, given all Peter's
work in this area, I bet there are a lot of sorts that could budget a
quarter or less of work_mem and really not be hurt very much. It depends
somewhat on how fast and how contended your I/O is, though, which we don't
have an especially good way to model. I'm starting to wonder if that
sort_mem GUC might be a good idea... use that for sorts, and keep work_mem
for everything else.
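
Roughly, the budget lookup might end up looking like this (a hypothetical
sort_mem GUC and a toy node classification, purely for illustration):

#include <stdio.h>

/* Pretend GUCs, in kilobytes, the way work_mem is expressed. */
static int  work_mem_kb = 4 * 1024;     /* existing knob, 4MB default */
static int  sort_mem_kb = 1 * 1024;     /* hypothetical separate sort budget */

typedef enum NodeKind
{
    NODE_SORT,
    NODE_HASHJOIN,
    NODE_HASHAGG
} NodeKind;

/*
 * Return the memory budget a node should plan around.  Sorts degrade
 * fairly gracefully when they spill, so they get the smaller sort_mem
 * budget; hash-based nodes keep the full work_mem because going from
 * one batch to two is a cliff, not a slope.
 */
static int
node_mem_budget_kb(NodeKind kind)
{
    switch (kind)
    {
        case NODE_SORT:
            return sort_mem_kb;
        case NODE_HASHJOIN:
        case NODE_HASHAGG:
        default:
            return work_mem_kb;
    }
}

int
main(void)
{
    printf("sort budget: %dkB, hash join budget: %dkB\n",
           node_mem_budget_kb(NODE_SORT),
           node_mem_budget_kb(NODE_HASHJOIN));
    return 0;
}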

If we really want to be able to dynamically react to changing memory
conditions, what we need is a forest of plans for a given query rather than
just one. Pick plan A if memory is limited, otherwise pick B. Or use
admission control.
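
As a sketch of the plan-forest idea, picking among pre-generated variants
based on whatever headroom happens to be available at execution time (all
toy code, not a proposal for actual data structures):

#include <stdio.h>

typedef struct PlannedVariant
{
    const char *desc;
    double      cost;
    double      mem_bytes;
} PlannedVariant;

/*
 * Given a small "forest" of plans for the same query, pick the cheapest
 * one whose memory need fits into the system's current headroom; the
 * last entry is assumed to be a low-memory fallback that always fits.
 */
static const PlannedVariant *
pick_variant(const PlannedVariant *plans, int nplans, double free_bytes)
{
    const PlannedVariant *best = &plans[nplans - 1];
    int         i;

    for (i = 0; i < nplans; i++)
        if (plans[i].mem_bytes <= free_bytes && plans[i].cost < best->cost)
            best = &plans[i];
    return best;
}

int
main(void)
{
    PlannedVariant forest[] = {
        {"plan B: hash join, 1GB", 999999.0, 1024.0 * 1024 * 1024},
        {"plan A: nested loop, 64kB", 1000000.0, 64.0 * 1024},
    };

    printf("busy system: %s\n",
           pick_variant(forest, 2, 128.0 * 1024 * 1024)->desc);
    printf("idle system: %s\n",
           pick_variant(forest, 2, 4.0 * 1024 * 1024 * 1024)->desc);
    return 0;
}

The same shape works for admission control if free_bytes is replaced by a
reservation against a global cap that queries take out before starting.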

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
