From: | Gunther <raj(at)gusw(dot)net> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Gunther <raj(at)gusw(dot)net> |
Cc: | pgsql-performance(at)lists(dot)postgresql(dot)org, Jeff Janes <jeff(dot)janes(at)gmail(dot)com>, Justin Pryzby <pryzby(at)telsasoft(dot)com> |
Subject: | Re: Out of Memory errors are frustrating as heck! |
Date: | 2019-04-15 03:59:45 |
Message-ID: | 2256ca91-9dac-1fe1-6b23-02c899f8395f@gusw.net |
Lists: | pgsql-performance |
On 4/14/2019 23:24, Tom Lane wrote:
>> ExecutorState: 2234123384 total in 266261 blocks; 3782328 free (17244 chunks); 2230341056 used
> Oooh, that looks like a memory leak right enough. The ExecutorState
> should not get that big for any reasonable query.
2.2 GB is massive, yes.
> Your error and stack trace show a failure in HashBatchContext,
> which is probably the last of these four:
>
>> HashBatchContext: 57432 total in 3 blocks; 16072 free (6 chunks); 41360 used
>> HashBatchContext: 90288 total in 4 blocks; 16072 free (6 chunks); 74216 used
>> HashBatchContext: 90288 total in 4 blocks; 16072 free (6 chunks); 74216 used
>> HashBatchContext: 100711712 total in 3065 blocks; 7936 free (0 chunks); 100703776 used
> Perhaps that's more than it should be, but it's silly to obsess over 100M
> when there's a 2.2G problem elsewhere.
Yes.
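To compare contexts like the ones quoted above without eyeballing the numbers, the dump lines can be parsed and ranked by bytes used. This is only a minimal sketch, assuming the lines keep exactly the "N total in N blocks; N free (N chunks); N used" layout shown in this thread; the names `parse_contexts` and `largest` are hypothetical helpers, not part of PostgreSQL.

```python
import re

# Assumed line layout, taken from the dump quoted in this thread:
#   "Name: <total> total in <blocks> blocks; <free> free (<chunks> chunks); <used> used"
LINE_RE = re.compile(
    r"^(?P<name>[^:]+):\s+(?P<total>\d+) total in (?P<blocks>\d+) blocks; "
    r"(?P<free>\d+) free \((?P<chunks>\d+) chunks\); (?P<used>\d+) used"
)

SAMPLE = """\
>> ExecutorState: 2234123384 total in 266261 blocks; 3782328 free (17244 chunks); 2230341056 used
>> HashBatchContext: 57432 total in 3 blocks; 16072 free (6 chunks); 41360 used
>> HashBatchContext: 90288 total in 4 blocks; 16072 free (6 chunks); 74216 used
>> HashBatchContext: 90288 total in 4 blocks; 16072 free (6 chunks); 74216 used
>> HashBatchContext: 100711712 total in 3065 blocks; 7936 free (0 chunks); 100703776 used
"""


def parse_contexts(dump_text):
    """Return one dict per memory context line; unrecognized lines are skipped."""
    rows = []
    for line in dump_text.splitlines():
        # Drop mail quoting markers and indentation before matching.
        m = LINE_RE.match(line.lstrip("> \t"))
        if m:
            row = {"name": m["name"].strip()}
            for key in ("total", "blocks", "free", "chunks", "used"):
                row[key] = int(m[key])
            rows.append(row)
    return rows


def largest(rows, n=3):
    """Return the n contexts with the most bytes in use, biggest first."""
    return sorted(rows, key=lambda r: r["used"], reverse=True)[:n]


if __name__ == "__main__":
    for row in largest(parse_contexts(SAMPLE)):
        print(f'{row["name"]}: {row["used"] / 1024**2:.1f} MiB used')
```

Run against a full dump, this puts the context that dominates memory use at the top, which is the comparison Tom is making between ExecutorState and the last HashBatchContext.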
> I think it's likely that it was
> just coincidence that the failure happened right there. Unfortunately,
> that leaves us with no info about where the actual leak is coming from.
Strange, though, that the vmstat tracking never showed the cache
allocated memory going much below 6 GB. Even if this 2.2 GB memory leak
is there, and even if I had 2 GB of shared_buffers, the OS would still
have enough memory to give me.
Is there any suspicion that this might be a problem with Linux? Because if
you want, I can whip out a FreeBSD machine, compile pgsql, attach
the same disk, and try it there. I am longing for a reason to move
back to FreeBSD anyway. But I have tons of stuff to do, so if you have
no reason to suspect Linux of doing something wrong here, I would prefer
to skip that futile attempt.
> The memory map shows that there were three sorts and four hashes going
> on, so I'm not sure I believe that this corresponds to the query plan
> you showed us before.
Like I said, the first explain was not using the same constraints (no
NL). What I sent last should all be consistent: memory dump, explain
plan, and gdb backtrace.
> Any chance of extracting a self-contained test case that reproduces this?
With 18 million rows involved in the base tables, hardly.
But I am ready to try some other things with the debugger that you want
me to try. If we have a memory leak issue, we might just as well try to
plug it!
I could even give one of you access to the system that runs this.
thanks,
-Gunther