Re: Problem reloading regression database

From: Brent Verner <brent(at)rcfile(dot)org>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Problem reloading regression database
Date: 2002-01-14 05:29:32
Message-ID: 20020114052932.GA3810@rcfile.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

[2002-01-14 00:03] Tom Lane said:
| Brent Verner <brent(at)rcfile(dot)org> writes:
| > yup. The problem sneaks up when I get a default value for a "text"
| > column via ExecEvalExprSwithContext. Commenting out the pfree above
| > heap_formtuple makes the error go away, but I know that's not the
| > right answer.
|
| Oh, the pfree for the attribute values? Ah so. I knew that would
| bite us someday. See, the way this code presently works is that all of
| copy.c runs in the per-query memory context. It calls all of the
| datatype conversion routines in that same context. It assumes that
| the routines that return pass-by-ref datatypes will return palloc'd
| values (and not, say, pointers to constant values) --- which is not
| a good assumption IMHO, even though I think it's true at the moment.
| This assumption is what's needed to justify the pfree's at the bottom of
| the loop. What's even worse is that it assumes that the conversion
| routines leak no other memory; if any conversion routine palloc's
| something it doesn't pfree, then over the course of a long enough copy
| we run out of memory.

check. I just loaded 3mil records (with my hacked copy.c), and had
ended up around 36M... <phew!!> I'm gonna load a similar file
with a clean copy.c, just to see if that leak is present without
my changes -- I suspect it's not, but I'd like to see the empirical
effect of my change(s)....

Thanks for the commentary. It really helps glue together the
thoughts I had from reading over the memory context code.

| In the case of ExecEvalExpr, if the expression is just a T_Const node
| then what you get back (for a pass-by-ref datatype) is a pointer to
| the value sitting in the Const node. pfreeing this is bad juju.

yup, that seems like it'd explain my symptom.

| > Should I avoid freeing the !attbyval items when they've
| > come from ExecEvalExpr -- I don't see any other examples of freeing
| > returns from this fn.
|
| I believe the correct solution is to get rid of the retail pfree's
| altogether. The clean way to run this code would be to switch to
| the per-tuple context at the head of the per-tuple loop (say, right
| after ResetPerTupleExprContext), run all the datatype conversion
| routines *and* ExecEvalExpr in this context, and then switch back
| to per-query context just before heap_formtuple. Then at the
| loop bottom the only explicit free you need is the heap_freetuple.
| The individual attribute values are inside the per-tuple context
| and they'll be freed by the ResetPerTupleExprContext at the start
| of the next loop. Fewer cycles, works right whether the values are
| palloc'd or not, and positively prevents any problems with leaks
| inside the datatype conversion routines --- since any leaked pallocs
| will also be inside the per-tuple context.

Gotcha. This certainly sounds like it will alleviate my pfree
problem. I'll get back to this tomorrow evening.

| An even more radical approach would be to try to run the whole loop in
| per-tuple context, but I think that will probably break things; the
| index insertion code, at least, expects to be called in per-query
| context because it sometimes makes allocations that must live across
| calls. (Cleaning that up is on my long-term to-do list; I'd prefer
| to see almost all of the executor run in per-tuple contexts, so as
| to avoid potential memory leaks very similar to the situation here.)
|
| You'll need to make sure that the code isn't expecting to palloc
| anything first-time-through and re-use it on later loops, but I
| think that will be okay. (The attribute_buf is the most obvious
| risk, but that's all right, see stringinfo.c.)

So I /can't/ palloc some things /before/ switching context to
per-tuple-context? I ask because I'm palloc'ing a couple of
arrays, that would have to be MaxHeapAttributeNumber long to
make sure we've enough space. Though, thinking about it, an
additional 13k of static storage in the binary is not all that
much.

thanks.
brent

--
"Develop your talent, man, and leave the world something. Records are
really gifts from people. To think that an artist would love you enough
to share his music with anyone is a beautiful thing." -- Duane Allman

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2002-01-14 05:41:11 Re: Problem reloading regression database
Previous Message Marc G. Fournier 2002-01-14 05:19:57 Re: Release time