Re: pg_logical_slot_peek_changes crashes postgres when called from inside pl/pgsql

From: Andres Freund <andres(at)anarazel(dot)de>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, Ben Chobot <bench(at)silentmedia(dot)com>, PostgreSQL mailing lists <pgsql-bugs(at)postgresql(dot)org>
Subject: Re: pg_logical_slot_peek_changes crashes postgres when called from inside pl/pgsql
Date: 2017-10-05 23:01:05
Message-ID: 20171005230105.qv2yrtadbkbc5y5r@alap3.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs

Hi,

On 2017-10-05 17:43:39 -0400, Tom Lane wrote:
> Andres Freund <andres(at)anarazel(dot)de> writes:
> > On 2017-10-05 11:38:40 -0400, Tom Lane wrote:
> >> I wonder why ReorderBufferCommit does this:
> >> ...
>
> > I'm not sure what the problem is that you're seeing?
>
> Nah, I take that back. The AbortCurrentTransaction call looks funny
> (and is sadly underdocumented) but it's not invalid to call it and
> then call RollbackAndReleaseCurrentSubTransaction.

It's not perfectly documented, but there's some:
/*
* Decoding needs access to syscaches et al., which in turn use
* heavyweight locks and such. Thus we need to have enough state around to
* keep track of those. The easiest way is to simply use a transaction
* internally. That also allows us to easily enforce that nothing writes
* to the database by checking for xid assignments.
*
* When we're called via the SQL SRF there's already a transaction
* started, so start an explicit subtransaction there.
*/
...
/*
* Aborting the current (sub-)transaction as a whole has the right
* semantics. We want all locks acquired in here to be released, not
* reassigned to the parent and we do not want any database access
* have persistent effects.
*/

> 2. The "MemoryContextResetAndDeleteChildren(_SPI_current->execCxt)"
> call, further up, is deleting a still-live executor state tree,
> as well as the logical-decoding context that is a child of that
> executor query context. So if you get past the Assert you'll still
> crash later on.

Right. That's why just adding found && to the entire if "works", as it
avoids that part as well.

> I'm not sure about a good way to fix #2 without re-introducing the memory
> leaks that call was added to fix (cf 7ec1c5a86). It's slightly annoying
> at this point that we got rid of the SPI_push/SPI_pop mechanism, because
> we could perhaps have used a check of the _SPI_curid value to distinguish
> a SPI context we should clean up from one we shouldn't. But that ship has
> sailed, and even if we wanted to undo that change there'd still be the
> matter of how to fix the bugs that prompted removing SPI_push/SPI_pop.

I think I don't fully understand what 7ec1c5a86 is trying to
achieve. Unfortunately reading the commit message and comment hasn't
cleared it up much so far. Why do we want to clean up memory in a
subtransaction that's above the one aborted? I can't yet meaningfully
comment on your proposals before fully understanding, sorry.

Greetings,

Andres Freund

In response to

Responses

Browse pgsql-bugs by date

  From Date Subject
Next Message ropeladder 2017-10-05 23:03:21 BUG #14843: CREATE TABLE churns through all memory, crashes db
Previous Message Tom Lane 2017-10-05 21:43:39 Re: pg_logical_slot_peek_changes crashes postgres when called from inside pl/pgsql