From: | Heikki Linnakangas <hlinnakangas(at)vmware(dot)com> |
---|---|
To: | Andres Freund <andres(at)2ndquadrant(dot)com> |
Cc: | Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, PostgreSQL mailing lists <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Assertion failure when streaming logical changes |
Date: | 2015-02-11 00:51:52 |
Message-ID: | 54DAA7A8.5000705@vmware.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On 02/10/2015 02:46 PM, Andres Freund wrote:
> On 2015-02-10 21:20:37 +0900, Michael Paquier wrote:
>> Using test_decoding on HEAD (cc761b1) I am seeing the following assertion
>> failure:
>> TRAP: FailedAssertion("!(!((&RegisteredSnapshots)->ph_root ==
>> ((void*)0)))", File: "snapmgr.c", Line: 677)
>
> Ick. I guess a revert of 94028691609f8e148bd4ce72c46163f018832a5b fixes
> it?
> ...
>
> Yea, it really looks like the above commit is to "blame". The new xmin
> tracking infrastructure doesn't know about the historical snapshot...
The new xmin tracking code assumes that if a snapshots's regd_count > 0,
it has already been pushed to the RegisteredSnapshots heap. That
assumption doesn't hold because the logical decoding stuff creates its
own snapshots and sets regd_count=1 to prevent snapmgr.c from freeing
them, even though they haven't been registered with RegisterSnapshot.
The second paragraph in this comment in snapmgr.c needs fixing:
> * Likewise, any snapshots that have been exported by pg_export_snapshot
> * have regd_count = 1 and are counted in RegisteredSnapshots, but are not
> * tracked by any resource owner.
> *
> * The same is true for historic snapshots used during logical decoding,
> * their lifetime is managed separately (as they life longer as one xact.c
> * transaction).
Besides the spelling, that's incorrect: historic snapshots are *not*
counted in RegisteredSnapshots. That was harmless before commit
94028691, but it was always wrong.
I think setting regd_count=1 outside snapmgr.c is a pretty ugly hack.
snapbuild.c also abuses active_count as a reference counter, which is
similarly ugly. I think regd_count and active_count should both be
treated as private to snapmgr.c, and initialized to zero elsewhere.
As a minimal fix, we could change the logical decoding code to not use
regd_count to prevent snapmgr.c from freeing its snapshots, but use
active_count for that too. Setting regd_count to 1 in
SnapBuildBuildSnapshot() seems unnecessary in the first place, as the
callers also call SnapBuildSnapIncRefcount(). Patch attached.
But could we get rid of those active_count manipulations too? Could you
replace the SnapBuildSnap[Inc|Dec]Refcount calls with
[Un]RegisterSnapshot()?
- Heikki
Attachment | Content-Type | Size |
---|---|---|
minimal-fix-for-historic-snapshots-1.patch | application/x-patch | 2.0 KB |
From | Date | Subject | |
---|---|---|---|
Next Message | Arthur Silva | 2015-02-11 01:14:56 | Re: reducing our reliance on MD5 |
Previous Message | Peter Geoghegan | 2015-02-11 00:32:20 | Re: reducing our reliance on MD5 |