Re: BUG #13530: sort receives "unexpected out-of-memory situation during sort"

From: brent_despain(at)selinc(dot)com
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-bugs(at)postgresql(dot)org
Subject: Re: BUG #13530: sort receives "unexpected out-of-memory situation during sort"
Date: 2015-08-03 19:06:45
Message-ID: OFBAE493E7.9C369902-ON88257E96.0068E5C3-87257E96.0068FD73@selinc.com
Lists: pgsql-bugs

Thanks Tom for the help. We will see if we can capture some additional
information to help with the investigation.

Brent DeSpain
Automation & Integration Engineering
Schweitzer Engineering Laboratories, Inc.
Boise, ID
Wk: 509-334-8007
mailto:brent_despain(at)selinc(dot)com

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: brent_despain(at)selinc(dot)com
Cc: pgsql-bugs(at)postgresql(dot)org
Date: 08/01/2015 11:56 AM
Subject: Re: [BUGS] BUG #13530: sort receives "unexpected out-of-memory situation during sort"

I wrote:
> brent_despain(at)selinc(dot)com writes:
>> We are occasionally receiving "unexpected out-of-memory situation during
>> sort".

> Hmm. Looking at the code here, it suddenly strikes me that it's assuming
> that LACKMEM() wasn't true to begin with, and that this is not guaranteed,
> because we adjust the memory consumption counter *before* we call
> puttuple_common.

In the light of day that theory doesn't hold up, because if LACKMEM
were true on entry (ie, availMem < 0) then we'd compute a grow_ratio
less than one, so that the "Must enlarge array by at least one element"
check would trigger, and we'd never get to the code that's failing.
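
As a rough illustration of that argument, the growth decision can be modelled in a few lines of standalone C. This is a simplified sketch, not the tuplesort.c source: the function name proposed_memtupsize is hypothetical, allowedMem is assumed to be positive, and the identity memNowUsed = allowedMem - availMem is an assumption about how the counters relate. It shows that a negative availMem gives a grow_ratio below one, so the "at least one element" check refuses to grow the array.

```c
#include <limits.h>
#include <stdint.h>

/*
 * Simplified model of the array-growth decision (hypothetical helper, not
 * the PostgreSQL source). Returns the proposed new array size, or the old
 * memtupsize unchanged if growth would be refused.
 * Assumes allowedMem > 0 and memNowUsed = allowedMem - availMem.
 */
int
proposed_memtupsize(int memtupsize, int64_t allowedMem, int64_t availMem)
{
    int64_t memNowUsed = allowedMem - availMem;
    int     newmemtupsize;

    if (memNowUsed <= availMem)
    {
        /* Less than half the budget is in use: doubling is safe. */
        newmemtupsize = (memtupsize < INT_MAX / 2) ? memtupsize * 2 : INT_MAX;
    }
    else
    {
        /* Scale the array by the ratio of allowed to currently used memory. */
        double grow_ratio = (double) allowedMem / (double) memNowUsed;
        double scaled = memtupsize * grow_ratio;

        newmemtupsize = (scaled < INT_MAX) ? (int) scaled : INT_MAX;
    }

    /* Must enlarge array by at least one element, else refuse to grow. */
    if (newmemtupsize <= memtupsize)
        return memtupsize;

    return newmemtupsize;
}
```

With availMem = -1 the ratio is just under one, the scaled size truncates to less than memtupsize, and the function returns the old size, which matches the reasoning that the failing code would never be reached in that case.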

Another idea that occurred to me is that the "memory chunk overhead won't
increase" assumption could fail if sizeof(SortTuple) weren't a multiple of
MAXALIGN (because repalloc internally rounds the request up to a MAXALIGN
boundary) but that doesn't seem plausible either. I'd expect that struct
to be 16 bytes on 32-bit or 24 bytes on 64-bit, so it should be maxaligned
on any hardware I know about.
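
The alignment point can be checked with a small sketch. The helpers below are hypothetical stand-ins for the rounding that an allocator applies, with the alignment passed in as a parameter (8 bytes is assumed here as a typical maximum alignment; the real value is platform-dependent). A request is padded only when its size is not already a multiple of the alignment, so element sizes of 16 or 24 bytes would not be affected.

```c
#include <stddef.h>

/*
 * Round len up to the next multiple of align.
 * align must be a power of two (hypothetical helper for illustration).
 */
size_t
align_up(size_t len, size_t align)
{
    return (len + (align - 1)) & ~(align - 1);
}

/* Nonzero if a request of len bytes would be padded when rounded to align. */
int
request_gets_padded(size_t len, size_t align)
{
    return align_up(len, align) != len;
}
```

For example, request_gets_padded(24, 8) is 0 while request_gets_padded(20, 8) is 1; any array of n elements of 16 or 24 bytes stays a multiple of 8, so the rounding adds nothing.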

So I'm about out of ideas. Could you modify your copy of the code to
log interesting details when you get this error, like the old and new
memtupsize and chunk space measurements? That might give us a clue
what's the problem.
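
One possible shape for such a diagnostic is sketched below as standalone C. The helper name format_grow_diagnostics, its parameters, and the message wording are all hypothetical; in a patched backend the same format string and values could instead be passed to elog(LOG, ...) just before the error is raised.

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical helper: formats the old and new array sizes and chunk-space
 * measurements into buf. Returns the snprintf result, i.e. the number of
 * characters that the full message requires (excluding the terminator).
 */
int
format_grow_diagnostics(char *buf, size_t buflen,
                        int old_memtupsize, int new_memtupsize,
                        size_t old_chunk_space, size_t new_chunk_space)
{
    return snprintf(buf, buflen,
                    "memtupsize %d -> %d, chunk space %zu -> %zu",
                    old_memtupsize, new_memtupsize,
                    old_chunk_space, new_chunk_space);
}
```

Capturing the old values requires saving them before the array is reallocated, since the chunk-space measurement of the old array is no longer available afterwards.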

regards, tom lane
