Re: PATCH: decreasing memory needlessly consumed by array_agg

From: Ali Akbar <the(dot)apaan(at)gmail(dot)com>
To: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
Cc: "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: PATCH: decreasing memory needlessly consumed by array_agg
Date: 2015-01-12 00:28:42
Message-ID: CACQjQLrkEGaXPw2iWzKvE4Lfh4CyMWH8z05wibe2=_RH9cZ+ww@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

> > Or else we implement what you suggest below (more comments below):
> >
> > Thinking about the 'release' flag a bit more - maybe we could do
> > this
> > instead:
> >
> > if (release && astate->private_cxt)
> > MemoryContextDelete(astate->mcontext);
> > else if (release)
> > {
> > pfree(astate->dvalues);
> > pfree(astate->dnulls);
> > pfree(astate);
> > }
> >
> > i.e. either destroy the whole context if possible, and just free
> the
> > memory when using a shared memory context. But I'm afraid this
> would
> > penalize the shared memory context, because that's intended for
> > cases where all the build states coexist in parallel and then at
> some
> > point are all converted into a result and thrown away. Adding
> pfree()
> > calls is no improvement here, and just wastes cycles.
> >
> >
> > As per Tom's comment, i'm using "parent memory context" instead of
> > "shared memory context" below.
> >
> > In the future, if some code writer decided to use subcontext=false,
> > to save memory in cases where there are many array accumulation, and
> > the parent memory context is long-living, current code can cause
> > memory leak. So i think we should implement your suggestion
> > (pfreeing astate), and warn the implication in the API comment. The
> > API user must choose between release=true, wasting cycles but
> > preventing memory leak, or managing memory from the parent memory
> > context.
>
> I'm wondering whether this is necessary after fixing makeArrayResult to
> use the privat_cxt flag. It's still possible to call makeMdArrayResult
> directly (with the wrong 'release' value).
>
> Another option might be to get rid of the 'release' flag altogether, and
> just use the 'private_cxt' - I'm not aware of a code using release=false
> with private_cxt=true (e.g. to build the same array twice from the same
> astate). But maybe there's such code, and another downside is that it'd
> break the existing API.
>
> > In one possible use case, for efficiency maybe the caller will
> > create a special parent memory context for all array accumulation.
> > Then uses makeArrayResult* with release=false, and in the end
> > releasing the parent memory context once for all.
>
> Yeah, although I'd much rather not break the existing code at all. That
> is - my goal is not to make it slower unless absolutely necessary (and
> in that case the code may be fixed per your suggestion). But I'm not
> convinced it's worth it.
>

OK. Do you think we need to note this in the comments? Something like this:
If using subcontext=false, the caller must be careful about memory usage,
because makeArrayResult* will not free the memory used.

> But I think it makes sense to move the error handling into
> initArrayResultArr(), including the get_element_type() call, and remove
> the element_type from the signature. This means initArrayResultAny()
> will call the get_element_type() twice, but I guess that's negligible.
> And everyone who calls initArrayResultArr() will get the error handling
> for free.
>
> Patch v7 attached, implementing those two changes, i.e.
>
> * makeMdArrayResult(..., astate->private_cxt)
> * move error handling into initArrayResultArr()
> * remove element_type from initArrayResultArr() signature
>

Reviewing the v7 patch:
- applies cleanly to current master. patch format, whitespace, etc is good
- make check runs without error
- performance & memory usage still consistent

If you think we don't have to add the comment (see above), i'll mark this
as ready for committer

Regards,
--
Ali Akbar

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Andres Freund 2015-01-12 00:37:28 Re: Using 128-bit integers for sum, avg and statistics aggregates
Previous Message Andres Freund 2015-01-11 23:53:55 Re: Escaping from blocked send() reprised.