From:       Stephen Frost <sfrost(at)snowman(dot)net>
To:         Noah Misch <noah(at)leadboat(dot)com>
Cc:         pgsql-hackers(at)postgresql(dot)org
Subject:    Re: MemoryContextAllocHuge(): selectively bypassing MaxAllocSize
Date:       2013-06-22 07:46:49
Message-ID: 20130622074649.GE7093@tamriel.snowman.net
Lists:      pgsql-hackers
Noah,
* Noah Misch (noah(at)leadboat(dot)com) wrote:
> This patch introduces MemoryContextAllocHuge() and repalloc_huge() that check
> a higher MaxAllocHugeSize limit of SIZE_MAX/2.
Nice! I've complained about this limit a few different times and just
never got around to addressing it.
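As I read the patch description, the call pattern is the same as
palloc/repalloc, just without the ~1GB MaxAllocSize cap.  Something
along these lines (the entry-point names are from your mail; the
helper itself is purely illustrative):

static void *
grow_big_buffer(void *buf, Size newsize)
{
    /* first allocation vs. later growth; newsize may exceed 1GB */
    if (buf == NULL)
        return MemoryContextAllocHuge(CurrentMemoryContext, newsize);

    return repalloc_huge(buf, newsize);
}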
> This was made easier by tuplesort growth algorithm improvements in commit
> 8ae35e91807508872cabd3b0e8db35fc78e194ac. The problem has come up before
> (TODO item "Allow sorts to use more available memory"), and Tom floated the
> idea[1] behind the approach I've used. The next limit faced by sorts is
> INT_MAX concurrent tuples in memory, which limits helpful work_mem to about
> 150 GiB when sorting int4.
That's frustratingly small. :(
[...]
> --- 1024,1041 ----
> * new array elements even if no other memory were currently used.
> *
> * We do the arithmetic in float8, because otherwise the product of
> ! * memtupsize and allowedMem could overflow. Any inaccuracy in the
> ! * result should be insignificant; but even if we computed a
> ! * completely insane result, the checks below will prevent anything
> ! * really bad from happening.
> */
> double grow_ratio;
>
> grow_ratio = (double) state->allowedMem / (double) memNowUsed;
> ! if (memtupsize * grow_ratio < INT_MAX)
> ! newmemtupsize = (int) (memtupsize * grow_ratio);
> ! else
> ! newmemtupsize = INT_MAX;
>
> /* We won't make any further enlargement attempts */
> state->growmemtuples = false;
I'm not a huge fan of moving directly to INT_MAX.  Are we confident that
everything can handle that cleanly..?  I feel like it might be a bit
safer to stay a bit shy of INT_MAX (say, by 1K).  Perhaps that's overly
paranoid, but there are an awful lot of callers, and a loop that steps
by 2 and then overflows would suck, e.g.:
int x = INT_MAX - 1;            /* array allocated with exactly INT_MAX slots */
for (; x < INT_MAX; x += 2)     /* x += 2 wraps past INT_MAX: undefined behavior */
    myarray[x] = 5;
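If we're worried about that, clamping a little below INT_MAX in
grow_memtuples() would keep some headroom.  A sketch of what I mean
(the 1K margin is arbitrary):

if (memtupsize * grow_ratio < INT_MAX - 1024)
    newmemtupsize = (int) (memtupsize * grow_ratio);
else
    newmemtupsize = INT_MAX - 1024;     /* leave a little slop below INT_MAX */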
Also, could this be used to support hashing larger sets..?  If we change
NTUP_PER_BUCKET to one, we could end up wanting a bucket array larger
than MaxAllocSize allows, since with 8-byte pointers 1GB only covers
around 134M tuples.
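To put rough numbers on that (my own arithmetic, using the current ~1GB
MaxAllocSize and the proposed SIZE_MAX/2 ceiling, nothing from the patch
itself):

#include <stdio.h>
#include <stdint.h>

int main(void)
{
    size_t max_alloc      = 0x3fffffff;     /* current MaxAllocSize, ~1GB */
    size_t max_alloc_huge = SIZE_MAX / 2;   /* proposed MaxAllocHugeSize */

    /* bucket array entries are 8-byte pointers */
    printf("buckets today: %zu\n", max_alloc / sizeof(void *));       /* ~134M */
    printf("buckets huge:  %zu\n", max_alloc_huge / sizeof(void *));  /* ~1.15e18 on 64-bit */
    return 0;
}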
Haven't had a chance to review the rest, but +1 on the overall idea. :)
Thanks!
Stephen