From: | Gunther <raj(at)gusw(dot)net> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com> |
Cc: | Jeff Janes <jeff(dot)janes(at)gmail(dot)com>, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, pgsql-performance(at)lists(dot)postgresql(dot)org, Justin Pryzby <pryzby(at)telsasoft(dot)com> |
Subject: | Re: Out of Memory errors are frustrating as heck! |
Date: | 2019-04-16 06:33:19 |
Message-ID: | 867396ca-9df7-c873-b095-5df85eaafb71@gusw.net |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-performance |
It is confirmed, these two call paths are the only ones. At least
probably the only ones to occur with enough of a frequency.
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
28576 postgres 20 0 2695304 1.0g 200764 R 11.3 13.8 4:20.13 postgres: postgres integrator [local] EXPLAIN
28580 postgres 20 0 646616 432784 36968 S 98.7 5.5 8:53.28 gdb -p 28576
there is a problem with gdb, it also has a memoy leak and is very
expensive with the checking of my conditional breakpoint. So I can't run
it all the way through.
Also here captured with
(gdb) call MemoryContextStats(TopPortalContext)
TopPortalContext: 8192 total in 1 blocks; 7664 free (0 chunks); 528 used
PortalHoldContext: 24632 total in 2 blocks; 7392 free (0 chunks); 17240 used
PortalContext: 1482752 total in 184 blocks; 11216 free (8 chunks); 1471536 used:
ExecutorState: 1369337168 total in 163397 blocks; 248840 free (36 chunks); 1369088328 used
HashTableContext: 32768 total in 3 blocks; 17304 free (10 chunks); 15464 used
HashBatchContext: 8192 total in 1 blocks; 7936 free (0 chunks); 256 used
HashTableContext: 8192 total in 1 blocks; 7320 free (0 chunks); 872 used
HashBatchContext: 8192 total in 1 blocks; 7936 free (0 chunks); 256 used
HashTableContext: 8192 total in 1 blocks; 7320 free (0 chunks); 872 used
HashBatchContext: 8192 total in 1 blocks; 7936 free (0 chunks); 256 used
HashTableContext: 8192 total in 1 blocks; 7752 free (0 chunks); 440 used
HashBatchContext: 90288 total in 4 blocks; 16072 free (6 chunks); 74216 used
HashTableContext: 8192 total in 1 blocks; 7624 free (0 chunks); 568 used
HashBatchContext: 90288 total in 4 blocks; 16072 free (6 chunks); 74216 used
TupleSort main: 32824 total in 2 blocks; 144 free (0 chunks); 32680 used
Caller tuples: 8192 total in 1 blocks; 7936 free (0 chunks); 256 used
ExprContext: 8192 total in 1 blocks; 7936 free (0 chunks); 256 used
HashTableContext: 8454256 total in 6 blocks; 64848 free (32 chunks); 8389408 used
HashBatchContext: 106640 total in 3 blocks; 7936 free (0 chunks); 98704 used
TupleSort main: 452880 total in 8 blocks; 126248 free (27 chunks); 326632 used
Caller tuples: 4194304 total in 10 blocks; 1496136 free (20 chunks); 2698168 used ExprContext: 8192 total in 1 blocks; 7936 free (0 chunks); 256 used
ExprContext: 8192 total in 1 blocks; 7936 free (0 chunks); 256 used
ExprContext: 8192 total in 1 blocks; 7936 free (0 chunks); 256 used
ExprContext: 8192 total in 1 blocks; 7936 free (0 chunks); 256 used
ExprContext: 8192 total in 1 blocks; 7936 free (0 chunks); 256 used
ExprContext: 8192 total in 1 blocks; 7936 free (0 chunks); 256 used
ExprContext: 8192 total in 1 blocks; 7936 free (0 chunks); 256 used
ExprContext: 8192 total in 1 blocks; 7936 free (0 chunks); 256 used
...
ExprContext: 8192 total in 1 blocks; 7936 free (0 chunks); 256 used
ExprContext: 8192 total in 1 blocks; 7936 free (0 chunks); 256 used
ExprContext: 8192 total in 1 blocks; 7936 free (0 chunks); 256 used
ExprContext: 8192 total in 1 blocks; 7936 free (0 chunks); 256 used
ExprContext: 8192 total in 1 blocks; 7936 free (0 chunks); 256 used
Grand total: 1384601904 bytes in 163660 blocks; 2303840 free (145 chunks); 1382298064 used
there is the developing memory leak. Now let's see if we can trace
individual increments ...
(gdb) info break
Num Type Disp Enb Address What
1 breakpoint keep y 0x0000000000849030 in AllocSetAlloc at aset.c:718
stop only if (int)strcmp(context->name, "ExecutorState") == 0 && *(int *)$rsp != 0x84e7dd && 0x84e8ad != *(int *)$rsp
breakpoint already hit 4 times
(gdb) delete 1
(gdb) break AllocSetAlloc if (int)strcmp(context->name, "ExecutorState") == 0 && *(int *)$rsp != 0x84e7dd
Breakpoint 2 at 0x849030: file aset.c, line 718.
(gdb) cont
Continuing.
^CError in testing breakpoint condition:
Quit
Breakpoint 2, AllocSetAlloc (context=0x2a1d190, size=381) at aset.c:718
718 {
(gdb) bt 4
#0 AllocSetAlloc (context=0x2a1d190, size=381) at aset.c:718
#1 0x000000000084e7dd in palloc (size=381) at mcxt.c:938
#2 0x00000000006101bc in ExecHashJoinGetSavedTuple (file=file(at)entry=0x4b4a198, hashvalue=hashvalue(at)entry=0x7ffcbf92fe5c,
tupleSlot=0x2ae0ab8, hjstate=0x2a1d920) at nodeHashjoin.c:1277
#3 0x0000000000610ca3 in ExecHashJoinNewBatch (hjstate=0x2a1d920) at nodeHashjoin.c:1042
(More stack frames follow...)
(gdb) call MemoryContextStats(TopPortalContext)
doesn't show an increase of ExecutorState total:
TopPortalContext: 8192 total in 1 blocks; 7664 free (0 chunks); 528 used
PortalHoldContext: 24632 total in 2 blocks; 7392 free (0 chunks); 17240 used
PortalContext: 1482752 total in 184 blocks; 11216 free (8 chunks); 1471536 used:
ExecutorState: 1369337168 total in 163397 blocks; 248840 free (36 chunks); 1369088328 used
exact same as before:
ExecutorState: 1369337168 total in 163397 blocks; 248840 free (36 chunks); 1369088328 used
but now we get an increase to:
ExecutorState: 1369345496 total in 163398 blocks; 248840 free (36 chunks); 1369096656 used
after I did this:
(gdb) cont
Continuing.
Breakpoint 2, AllocSetAlloc (context=0x2a1d190, size=8) at aset.c:718
718 {
(gdb) bt 4
#0 AllocSetAlloc (context=0x2a1d190, size=8) at aset.c:718
#1 0x000000000084e8ad in palloc0 (size=size(at)entry=8) at mcxt.c:969
#2 0x0000000000702b63 in makeBufFileCommon (nfiles=nfiles(at)entry=1) at buffile.c:119
#3 0x0000000000702e4c in makeBufFile (firstfile=163423) at buffile.c:138
(More stack frames follow...)
(gdb) call MemoryContextStats(TopPortalContext)
So now we have it confirmed don't we? No! No we have not! We stop at the
/entrance /of the allocate method. So when I interrupted, there was no
call yet. Then at the next stop the increase was from the previous.
Continuing ... this now is from a stop at the makeBufFileCommon
ExecutorState: 1369345496 total in 163398 blocks; 248816 free (36 chunks); 1369096680 used
And again, now stopped before
ExecutorState: 1369345496 total in 163398 blocks; 248792 free (36 chunks); 1369096704 used
ExecutorState: 1369345496 total in 163398 blocks; 248792 free (36 chunks); 1369096704 used
I don't see a growth between individual invocations. Anyway, these are
the two ways to get there:
(gdb) bt 4
#0 AllocSetAlloc (context=0x2a1d190, size=4) at aset.c:718
#1 0x000000000084e7dd in palloc (size=size(at)entry=4) at mcxt.c:938
#2 0x0000000000702e59 in makeBufFile (firstfile=163423) at buffile.c:140
#3 BufFileCreateTemp (interXact=interXact(at)entry=false) at buffile.c:201
(More stack frames follow...)
(gdb) cont
Continuing.
Breakpoint 3, AllocSetAlloc (context=0x2a1d190, size=394) at aset.c:718
718 {
(gdb) bt 3
#0 AllocSetAlloc (context=0x2a1d190, size=394) at aset.c:718
#1 0x000000000084e7dd in palloc (size=394) at mcxt.c:938
#2 0x00000000006101bc in ExecHashJoinGetSavedTuple (file=file(at)entry=0x4b4a198, hashvalue=hashvalue(at)entry=0x7ffcbf92fe5c,
tupleSlot=0x2ae0ab8, hjstate=0x2a1d920) at nodeHashjoin.c:1277
(More stack frames follow...)
But now it increased
ExecutorState: 1369353824 total in 163399 blocks; 248792 free (36 chunks); 1369105032 used
It increases every 3 times I stop at the breakpoint.
-Gunther
From | Date | Subject | |
---|---|---|---|
Next Message | Tom Lane | 2019-04-16 15:30:19 | Re: Out of Memory errors are frustrating as heck! |
Previous Message | Gunther | 2019-04-16 05:23:45 | Re: Out of Memory errors are frustrating as heck! |