From: Jeff Davis <pgsql(at)j-davis(dot)com>
To: Simon Riggs <simon(at)2ndQuadrant(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Robert Haas <robertmhaas(at)gmail(dot)com>, Tomas Vondra <tv(at)fuzzy(dot)cz>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: 9.5: Better memory accounting, towards memory-bounded HashAgg
Date: 2014-11-17 07:31:51
Message-ID: 1416209511.2998.144.camel@jeff-desktop
Lists: pgsql-hackers
On Sat, 2014-11-15 at 21:36 +0000, Simon Riggs wrote:
> Do I understand correctly that we are trying to account for exact
> memory usage at palloc/pfree time? Why??
Not for palloc chunks; the tracking happens only at the level of
allocated blocks (the ones we allocate with malloc).
It was a surprise to me that accounting at that level would have any
measurable impact, but Robert found a reasonable case on a POWER machine
that degraded by a couple of percent. I wasn't able to reproduce it
consistently on x86.
> Or alternatively, can't we just sample the allocations to reduce the overhead?
Not sure quite what you mean by "sample", but it sounds like something
along those lines would work.
Attached is a patch that does something very simple: it only tracks
blocks held in the current context, with no inheritance or anything like
it. This reduces the overhead to a couple of arithmetic instructions
added to the alloc/dealloc path, so hopefully that removes the general
performance concern raised by Robert[1].
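To illustrate what I mean by a couple of arithmetic instructions, here
is a rough sketch (not the patch text itself; mem_allocated stands in
for whatever per-context counter name the patch actually uses):

    /* in aset.c, where a new block is obtained from malloc */
    block = (AllocBlock) malloc(blksize);
    if (block == NULL)
        return NULL;
    set->header.mem_allocated += blksize;       /* one addition */

    /* ...and where a block is handed back to malloc */
    set->header.mem_allocated -= block->endptr - ((char *) block);
    free(block);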
To calculate the total memory used, I included a function
MemoryContextMemAllocated() that walks the memory context and its
children recursively.
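Roughly like this, again with mem_allocated standing in for the patch's
per-context counter:

    Size
    MemoryContextMemAllocated(MemoryContext context)
    {
        Size        total = context->mem_allocated;
        MemoryContext child;

        for (child = context->firstchild; child != NULL; child = child->nextchild)
            total += MemoryContextMemAllocated(child);

        return total;
    }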
Of course, I was originally trying to avoid that, because it moves the
problem to HashAgg. For each group, it will need to call
MemoryContextMemAllocated() to see if work_mem has been exceeded. It
will have to visit a couple of contexts, or perhaps many (in the case of
array_agg, which creates one per group), which could be a measurable
regression for HashAgg.
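The per-group check itself would be trivial, something like the
following (hypothetical; which context HashAgg actually measures depends
on how nodeAgg.c is wired up):

    /* after advancing the aggregates for the current input tuple */
    if (MemoryContextMemAllocated(aggstate->aggcontext) > work_mem * 1024L)
    {
        /* work_mem exceeded: switch to the spill strategy */
    }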
But if that does turn out to be a problem, I think it's solvable. First,
I could micro-optimize the function by making it iterative rather than
recursive, to save on function call overhead. Second, I could offer a
way to prevent the HTAB from creating its own context, which would be
one less context to visit. And if those don't work, perhaps I could
resort to a sampling method of some kind, as you allude to above.
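For reference, the iterative version is easy because each context
already links to its parent, first child, and next sibling; a sketch
(again assuming a mem_allocated counter):

    Size
    MemoryContextMemAllocatedIter(MemoryContext context)
    {
        Size        total = 0;
        MemoryContext cur = context;

        for (;;)
        {
            total += cur->mem_allocated;

            if (cur->firstchild != NULL)
                cur = cur->firstchild;      /* descend */
            else
            {
                /* climb until we find an unvisited sibling, or reach the top */
                while (cur != context && cur->nextchild == NULL)
                    cur = cur->parent;
                if (cur == context)
                    break;
                cur = cur->nextchild;
            }
        }

        return total;
    }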
Regards,
Jeff Davis
[1] I'm fairly sure I tested something very similar on Robert's POWER
machine a while ago, and it was fine. But I think it's offline or moved
now, so I can't verify the results. If this patch still somehow turns
out to be a 1%+ regression on any reasonable test, I don't know what to
say.
Attachment: memory-accounting-v7.patch (text/x-patch, 9.0 KB)