Re: Out of Memory errors are frustrating as heck!

From: Gunther <raj(at)gusw(dot)net>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
Cc: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, pgsql-performance(at)lists(dot)postgresql(dot)org, Justin Pryzby <pryzby(at)telsasoft(dot)com>
Subject: Re: Out of Memory errors are frustrating as heck!
Date: 2019-04-16 01:49:50
Message-ID: ddd4cc94-83a4-3210-bda3-70afd535cd53@gusw.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Wow, we are getting somewhere.

Tom (BTW, your mail server rejects my direct mail, but I'm glad you got
it through the list), you say:

> I'm pretty sure that's not the droid we're looking for.
> ExecHashJoinGetSavedTuple does palloc a new tuple, but it immediately
> sticks it into a TupleTableSlot that will be responsible for freeing
> it (when the next tuple is stuck into the same slot). I'd suggest
> continuing a few times and looking for other code paths leading
> to AllocSetAlloc in this context.

I did continue a "few times", but few as in a dozen, it's always the same

(gdb) bt 6
#0 AllocSetAlloc (context=0x1168230, size=375) at aset.c:715
#1 0x000000000084e6cd in palloc (size=375) at mcxt.c:938
#2 0x000000000061019c in ExecHashJoinGetSavedTuple (file=file(at)entry=0x21df688, hashvalue=hashvalue(at)entry=0x7fff2e4ca76c,
tupleSlot=0x10856b8, hjstate=0x11688e0) at nodeHashjoin.c:1277
#3 0x0000000000610c83 in ExecHashJoinNewBatch (hjstate=0x11688e0) at nodeHashjoin.c:1042
#4 ExecHashJoinImpl (parallel=false, pstate=0x11688e0) at nodeHashjoin.c:539
#5 ExecHashJoin (pstate=0x11688e0) at nodeHashjoin.c:565
(More stack frames follow...)

So I decided to just let it go until it exits the ExecHashJoin function:

#6 0x00000000005fde68 in ExecProcNodeInstr (node=0x11688e0) at execProcnode.c:461
461 result = node->ExecProcNodeReal(node);
(gdb) list
456 {
457 TupleTableSlot *result;
458
459 InstrStartNode(node->instrument);
460
461 result = node->ExecProcNodeReal(node);
462
463 InstrStopNode(node->instrument, TupIsNull(result) ? 0.0 : 1.0);
464
465 return result;
(gdb) break 463
Breakpoint 3 at 0x5fde68: file execProcnode.c, line 463.
(gdb) disable 2
(gdb) cont
Continuing.

Breakpoint 3, ExecProcNodeInstr (node=0x11688e0) at execProcnode.c:463
463 InstrStopNode(node->instrument, TupIsNull(result) ? 0.0 : 1.0);

oops, that was fast, so up further ...

(gdb) cont
Continuing.

Breakpoint 4, ExecSort (pstate=0x11687d0) at nodeSort.c:109
109 if (TupIsNull(slot))
(gdb) cont
Continuing.

Breakpoint 3, ExecProcNodeInstr (node=0x11688e0) at execProcnode.c:463
463 InstrStopNode(node->instrument, TupIsNull(result) ? 0.0 : 1.0);
(gdb) cont
Continuing.

Breakpoint 4, ExecSort (pstate=0x11687d0) at nodeSort.c:109
109 if (TupIsNull(slot))
(gdb) up
#1 0x00000000005fde68 in ExecProcNodeInstr (node=0x11687d0) at execProcnode.c:461
461 result = node->ExecProcNodeReal(node);
(gdb) up
#2 0x000000000061d2c4 in ExecProcNode (node=0x11687d0) at ../../../src/include/executor/executor.h:247
247 return node->ExecProcNode(node);
(gdb) up
#3 ExecUnique (pstate=0x11685e0) at nodeUnique.c:73
73 slot = ExecProcNode(outerPlan);
(gdb) list
68 for (;;)
69 {
70 /*
71 * fetch a tuple from the outer subplan
72 */
73 slot = ExecProcNode(outerPlan);
74 if (TupIsNull(slot))
75 {
76 /* end of subplan, so we're done */
77 ExecClearTuple(resultTupleSlot);

... but whatever I do, ultimately I get to that allocation routine
through the same path.

Since that is the bulk of the activity, and memory was still growing
while we come through this path, I assume that this is it.

> My first thought on noticing the SELECT DISTINCT was that you might be
> hitting the grouping-function-related leak that Andres fixed in 9cf37a527;
> but that fix did make it into 11.2 (by just a couple of days...). Still,
> maybe there's another issue in the same area.

I don't know about that one, I only know that I am running 11.2 freshly
compiled.

The change suggested by Alvaro Herrera wasn't applicable.

Jeff Janes had more

> Breakpoint 2, AllocSetAlloc (context=0x1168230, size=8272) at aset.c:715
> 715 {
> (gdb) p context->name
> $8 = 0x96ce5b "ExecutorState"
>
>
> I think that the above one might have been the one you wanted.
Not sure how you could tell that? It's the same place as everything
else. If we can find out what you're looking for, may be we can set a
break point earlier up the call chain?
>
> I guess I should run this for a little longer. So I disable my
> breakpoints
>
> it went up pretty quick from 1.2 GB to 1.5 GB, but then it stopped
> growing fast, so now back to gdb and break:
>
> Unfortunately, I think this means you missed your opportunity and are
> now getting backtraces of the innocent bystanders.
But why? If I see the memory still go up insanely fast, isn't that a
sign for the leak?
> Particularly since you report that the version using nested loops
> rather than hash joins also leaked, so it is probably not the
> hash-join specific code that is doing it.
How about it's in the DISTINCT? I noticed while peeking up the call
chain, that it was already in the UNIQUE sort thing also.  I guess it's
streaming the results from the hash join right into the unique sort step.
> What I've done before is compile with the comments removed from
> src/backend/utils/mmgr/aset.c:/* #define HAVE_ALLOCINFO */
I have just done that and it creates an insane amount of output from all
the processes, I'm afraid there will be no way to keep that stuff
separated. If there was a way of turning that one and off for one
process only, then we could probably get more info...

Everything is also extremely slow that way. Like in a half hour the
memory didn't even reach 100 MB.

> and then look for allocations sizes which are getting allocated but
> not freed, and then you can go back to gdb to look for allocations of
> those specific sizes.
I guess I should look for both, address and size to match it better.
> This generates a massive amount of output, and it bypasses the logging
> configuration and goes directly to stderr--so it might not end up
> where you expect.
Yes, massive, like I said. Impossible to use. File system fills up
rapidly. I made it so that it can be turned on and off, with the debugger.

int _alloc_info = 0;
#ifdef HAVE_ALLOCINFO
#define AllocFreeInfo(_cxt, _chunk) \
if(_alloc_info) \
fprintf(stderr, "AllocFree: %s: %p, %zu\n", \
(_cxt)->header.name, (_chunk), (_chunk)->size)
#define AllocAllocInfo(_cxt, _chunk) \
if(_alloc_info) \
fprintf(stderr, "AllocAlloc: %s: %p, %zu\n", \
(_cxt)->header.name, (_chunk), (_chunk)->size)
#else
#define AllocFreeInfo(_cxt, _chunk)
#define AllocAllocInfo(_cxt, _chunk)
#endif

so with this I do

(gdb) b AllocSetAlloc
(gdb) cont
(gdb) set _alloc_info=1
(gdb) disable
(gdb) cont

then I wait, ... until it crashes again ... no, it's too much. It fills
up my filesystem in no time with the logs.  It produced 3 GB in just a
minute of run time.

And also, I doubt we can find anything specifically by allocation size.
It's just going to be 512 or whatever.

Isn't there some other way?

I'm going to try without that DISTINCT step, or perhaps by dismantling
this query until it works without this excessive memory growth.

-Gunther

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Mark Kirkwood 2019-04-16 02:08:45 Re: PostgreSQL upgrade.
Previous Message Jeff Janes 2019-04-15 18:14:07 Re: Out of Memory errors are frustrating as heck!