From: Justin Pryzby <pryzby(at)telsasoft(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Gunther <raj(at)gusw(dot)net>
Cc: pgsql-performance(at)lists(dot)postgresql(dot)org, Jeff Janes <jeff(dot)janes(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
Subject: Re: Out of Memory errors are frustrating as heck!
Date: 2019-04-20 19:30:09
Message-ID: 20190420193009.GC4323@telsasoft.com
Lists: pgsql-performance
On Sun, Apr 14, 2019 at 11:24:59PM -0400, Tom Lane wrote:
> Gunther <raj(at)gusw(dot)net> writes:
> > ExecutorState: 2234123384 total in 266261 blocks; 3782328 free (17244 chunks); 2230341056 used
>
> Oooh, that looks like a memory leak right enough. The ExecutorState
> should not get that big for any reasonable query.
On Tue, Apr 16, 2019 at 11:30:19AM -0400, Tom Lane wrote:
> Hmm ... this matches up with a vague thought I had that for some reason
> the hash join might be spawning a huge number of separate batches.
> Each batch would have a couple of files with associated in-memory
> state including an 8K I/O buffer, so you could account for the
On Tue, Apr 16, 2019 at 10:24:53PM -0400, Gunther wrote:
> -> Hash (cost=2861845.87..2861845.87 rows=34619 width=74) (actual time=199792.446..199792.446 rows=113478127 loops=1)
> Buckets: 65536 (originally 65536) Batches: 131072 (originally 2) Memory Usage: 189207kB
Is it significant that there are ~2x as many ExecutorState blocks as there are
batches? 266261/131072 => 2.03...
If 1 block leaked when batch=2, 2 blocks when batch=4, and 4 blocks when
batch=8, then by batch=131072 there'd be 64k blocks leaked at the last
doubling, and ~131072 leaked blocks in total, which together with one live
block per batch would account for the ~2x ratio.
I'm guessing Tom probably already thought of this, but:
2230341056/266261 => ~8376
which is pretty close to the 8kB I/O buffer you were talking about (assuming
the number of same-sized buffers is much greater than the number of other
allocations).
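To make that arithmetic concrete, here's a quick back-of-the-envelope check
(a minimal sketch using the numbers quoted above, plus my assumption of one
live block per batch; it's not derived from the PostgreSQL source):

#include <stdio.h>

int main(void)
{
    long long nbatch;
    long long leaked = 0;

    /* If each doubling of nbatch leaks nbatch/2 blocks, the cumulative
     * leak at nbatch=131072 is 131071 blocks; adding one live block per
     * batch lands near the 266261 ExecutorState blocks observed. */
    for (nbatch = 2; nbatch <= 131072; nbatch *= 2)
        leaked += nbatch / 2;

    printf("cumulative leaked blocks: %lld\n", leaked);           /* 131071 */
    printf("plus one live per batch:  %lld\n", leaked + 131072);  /* 262143 */

    /* Average block size, close to an 8kB I/O buffer: */
    printf("avg bytes per block: %lld\n", 2230341056LL / 266261LL); /* 8376 */
    return 0;
}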
If Tom thinks (as I understand it) that the issue is *not* a memory leak, but
out-of-control growth in the number of batches, plus a failure to account for
their size...then this patch might help.
The number of batches is increased to avoid exceeding work_mem. With very low
work_mem (or a very large number of tuples hashed), it will try to use a large
number of batches. At some point the memory used by the BatchFile structures
themselves (increasing by powers of two) exceeds work_mem.
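As a rough model of that crossover (the 8kB buffer and the two files per batch
come from Tom's description above; the rest is illustrative, not PostgreSQL
source):

#include <stdio.h>

#define BUF_SIZE        8192    /* per-file I/O buffer, per Tom's note */
#define FILES_PER_BATCH 2       /* "a couple of files" per batch */

int main(void)
{
    long long work_mem = 4LL * 1024 * 1024;     /* 4MB, the default */
    long long nbatch;

    for (nbatch = 2; nbatch <= 131072; nbatch *= 2)
    {
        long long overhead = nbatch * FILES_PER_BATCH * BUF_SIZE;

        printf("nbatch=%6lld  batch-file overhead=%10lld bytes%s\n",
               nbatch, overhead,
               overhead > work_mem ? "  <- exceeds work_mem" : "");
    }
    return 0;
}

With the default 4MB work_mem, the batch files alone pass work_mem at
nbatch=512, and at 131072 batches they account for ~2GB by themselves.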
With larger work_mem, there's less need for more batches. So the number of
batches used for small work_mem needs to be constrained, either scaled to
work_mem or capped outright.
With my patch, the number of batches is nonlinear WRT work_mem, and reaches a
maximum for moderately small work_mem. The goal is to choose the optimal
number of batches to minimize the degree to which work_mem is exceeded.
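I haven't inlined the patch here, but as a hypothetical sketch of the idea
(the function and the capping rule below are illustrative, not the code in
the attachment):

#include <stdio.h>

#define BUF_SIZE        8192
#define FILES_PER_BATCH 2

/* Hypothetical rule: stop doubling once the batch files themselves
 * would exceed work_mem, bounding the overshoot instead of letting
 * it grow in proportion to nbatch. */
static long long
cap_nbatch(long long nbatch, long long work_mem)
{
    while (nbatch > 2 && nbatch * FILES_PER_BATCH * BUF_SIZE > work_mem)
        nbatch /= 2;
    return nbatch;
}

int main(void)
{
    /* With 4MB work_mem the cap binds at 256 batches; with large
     * work_mem it never binds, matching the nonlinear behavior above. */
    printf("capped nbatch: %lld\n", cap_nbatch(131072, 4LL * 1024 * 1024));
    return 0;
}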
Justin
Attachment: limit-hash-nbatches.patch (text/x-diff, 1.4 KB)