From: | Julien Rouhaud <julien(dot)rouhaud(at)dalibo(dot)com> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Marc Cousin <marc(dot)cousin(at)dalibo(dot)com> |
Subject: | Re: Memory leak in GIN index build |
Date: | 2016-04-16 21:56:25 |
Message-ID: | 5712B509.7040403@dalibo.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On 16/04/2016 20:45, Tom Lane wrote:
> Julien Rouhaud <julien(dot)rouhaud(at)dalibo(dot)com> writes:
>
>> Also, in dataPlaceToPageLeaf() and ginVacuumPostingTreeLeaf(), shouldn't
>> the START_CRIT_SECTION() calls be placed before the xlog code?
>
> Yeah, they should. Evidently somebody kluged it to avoid doing a palloc
> inside a critical section, while ignoring every other rule about where to
> place critical sections. (Maybe we should put an assert about being
> within a critical section into XLogBeginInsert.)
>
After a quick test, it appears that at least log_smgrcreate() calls
XLogBeginInsert() without being in a critical section, from the various
DDL commands.
> This code is pretty fundamentally broken, isn't it. elog() calls
> inside a critical section? Really? Even worse, a MemoryContextDelete,
> which hardly seems like something that should be assumed error-proof.
>
> Not to mention the unbelievable ugliness of a function that sometimes
> returns with an open critical section (and WAL insertion started) and
> sometimes doesn't.
>
> It kinda looks like this was hacked up without bothering to keep the
> comment block in ginPlaceToPage in sync with reality, either.
>
> I think this needs to be redesigned so that the critical section and WAL
> insertion calls all happen within a single straight-line piece of code.
>
> We could try making that place be ginPlaceToPage(). So far as
> registerLeafRecompressWALData is concerned, that does not look that hard;
> it could palloc and fill the WAL-data buffer the same as it's doing now,
> then pass it back up to be registered (and eventually pfreed) in
> ginPlaceToPage. However, that would also mean postponing the call of
> dataPlaceToPageLeafRecompress(), which seems much messier. I assume
> the data it's working from is mostly in the tmpCtx that
> dataPlaceToPageLeaf wants to free before returning. Maybe we could
> move creation/destruction of the tmpCtx out to ginPlaceToPage, as well?
>
> The other line of thought is to move the WAL work that ginPlaceToPage
> does down into the subroutines. That would probably result in some
> duplication of code, but it might end up being a cleaner solution.
>
> Anyway, I think memory leakage is just the start of what's broken here,
> and fixing it won't be a very small patch.
>
I'm not really confident, but I'll give a try. Probably with your
second solution which looks easier.
Any pointer is welcome, unless someone more familiar with this code
wants to take it.
--
Julien Rouhaud
http://dalibo.com - http://dalibo.org
From | Date | Subject | |
---|---|---|---|
Next Message | Andres Freund | 2016-04-16 22:04:39 | Re: Re: [COMMITTERS] pgsql: Avoid extra locks in GetSnapshotData if old_snapshot_threshold < |
Previous Message | Tom Lane | 2016-04-16 21:52:44 | Re: Re: [COMMITTERS] pgsql: Avoid extra locks in GetSnapshotData if old_snapshot_threshold < |