Small overhead run time memory trace (Was Re: shall we have a TRACE_MEMORY mode)

From: "Qingqing Zhou" <zhouqq(at)cs(dot)toronto(dot)edu>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Small overhead run time memory trace (Was Re: shall we have a TRACE_MEMORY mode)
Date: 2006-06-23 02:29:20
Message-ID: e7fjme$1585$1@news.hub.org
Lists: pgsql-hackers


"Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote
>
> One idea that comes to mind is to have a compile time option to record
> the palloc __FILE__ and __LINE__ in every AllocChunk header. Then it
> would not be so hard to identify the culprit while trawling through
> memory. The overhead costs would be so high that you'd never turn it on
> by default though :-(
>
> Another thing to consider is that the proximate location of the palloc
> is frequently *not* very useful. For instance, if your memory is
> getting eaten by lists, all the palloc traces will point at
> new_tail_cell(). Not much help. I don't know what to do about that
> ... any ideas?
>

So basically there are two problems with tracing memory usage:

1. Memory/CPU overhead;
2. Hidden memory allocation calls.

To address problem 1, I think we can even come up with a run-time solution
(instead of a compile-time option). We can have a userset GUC variable

int trace_percent \in [0, 100]

When it is 0, the memory trace code is a no-op. This is the setting for
normal running mode, and it adds only two more instructions of overhead to
each palloc(). When it is 100, every allocation is traced. When it is a value
in between, that percentage of allocations is traced. This is good for a
*massive* memory leak, since a random probe could catch the suspect. I think
a very small number will do.
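
A minimal sketch of the sampling idea, not actual PostgreSQL code. The
names traced_alloc, TRACED_ALLOC, should_trace and the two counters are
hypothetical, malloc() stands in for the real allocator, and rand() stands
in for whatever cheap random source would really be used:

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-in for the proposed GUC; valid range is [0, 100]. */
static int trace_percent = 0;

/* Counters kept only so the sketch can be inspected. */
static unsigned long n_allocs = 0;
static unsigned long n_traced = 0;

/* Decide whether the current allocation should be traced. */
static int
should_trace(void)
{
    assert(trace_percent >= 0 && trace_percent <= 100);

    if (trace_percent == 0)     /* normal mode: a load and a branch */
        return 0;
    if (trace_percent == 100)   /* trace everything */
        return 1;

    /* Random probe; the slight modulo bias of rand() is ignored here. */
    return (rand() % 100) < trace_percent;
}

/* Hypothetical allocation wrapper; malloc() is only a placeholder. */
static void *
traced_alloc(size_t size, const char *file, int line)
{
    n_allocs++;
    if (should_trace())
    {
        n_traced++;
        fprintf(stderr, "alloc %zu bytes at %s:%d\n", size, file, line);
    }
    return malloc(size);
}

/* Capture the call site automatically. */
#define TRACED_ALLOC(sz) traced_alloc((sz), __FILE__, __LINE__)
```

With trace_percent = 0 the wrapper never reaches the logging code, and with
a small value such as 1 only about one allocation in a hundred pays for it.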

To reduce the memory overhead, we basically have two options. One is to plug
two uint16 fields into the AllocChunk header: one for an index into a
separately maintained __FILE__ list, and one for __LINE__. The other is to
maintain all these traces in a totally separate memory context.
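
The first option could look roughly like this. It is a sketch under
assumptions, not the real AllocChunk layout: TraceTag, trace_files,
trace_file_index, make_trace_tag and MAX_TRACE_FILES are all hypothetical
names, and the file list uses a linear search only to keep it short:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define MAX_TRACE_FILES 1024    /* hypothetical capacity of the file list */

/* Separately maintained list of source file names. */
static const char *trace_files[MAX_TRACE_FILES];
static uint16_t    n_trace_files = 0;

/* Hypothetical 4-byte tag that would be added to each chunk header. */
typedef struct TraceTag
{
    uint16_t    file_idx;   /* index into trace_files[] */
    uint16_t    line;       /* __LINE__; must fit in 16 bits */
} TraceTag;

/* Return the index of 'file', registering it on first use. */
static uint16_t
trace_file_index(const char *file)
{
    uint16_t    i;

    for (i = 0; i < n_trace_files; i++)
        if (strcmp(trace_files[i], file) == 0)
            return i;

    assert(n_trace_files < MAX_TRACE_FILES);
    /* Only the pointer is stored; __FILE__ literals have static storage. */
    trace_files[n_trace_files] = file;
    return n_trace_files++;
}

/* Build the tag for one allocation site. */
static TraceTag
make_trace_tag(const char *file, int line)
{
    TraceTag    tag;

    assert(line >= 0 && line <= UINT16_MAX);
    tag.file_idx = trace_file_index(file);
    tag.line = (uint16_t) line;
    return tag;
}
```

The per-chunk cost is then four bytes, at the price of limiting the scheme
to 65536 files and to source lines below 65536.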

For problem 2, the only solution AFAICS that works across 20 platforms is to
redefine the functions involved.
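
One possible reading of "redefine", sketched with plain C macros: the
hidden helper takes the caller's location as arguments, and its public name
becomes a macro that fills them in. This is only an illustration. Cell,
new_cell, new_cell_at, alloc_at, last_file and last_line are hypothetical
names, and malloc() again stands in for the real allocator:

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Last recorded allocation site; a real tracer would store it per chunk. */
static const char *last_file = NULL;
static int         last_line = 0;

/* Hypothetical allocator entry point that records where it was called. */
static void *
alloc_at(size_t size, const char *file, int line)
{
    assert(file != NULL);
    last_file = file;
    last_line = line;
    return malloc(size);
}

typedef struct Cell
{
    struct Cell *next;
    int          value;
} Cell;

/* The helper receives its caller's location instead of using its own. */
static Cell *
new_cell_at(int value, const char *file, int line)
{
    Cell       *c = alloc_at(sizeof(Cell), file, line);

    if (c != NULL)
    {
        c->next = NULL;
        c->value = value;
    }
    return c;
}

/* Redefine the public name so every call site passes its own location. */
#define new_cell(value) new_cell_at((value), __FILE__, __LINE__)
```

A trace taken this way points at the code that called new_cell() rather
than at the helper itself. It relies only on __FILE__ and __LINE__, which
every C compiler provides.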

Regards,
Qingqing
