From: | Andres Freund <andres(at)anarazel(dot)de> |
---|---|
To: | "Drouvot, Bertrand" <bertranddrouvot(dot)pg(at)gmail(dot)com> |
Cc: | Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, Jeff Davis <pgsql(at)j-davis(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Ibrar Ahmed <ibrar(dot)ahmad(at)gmail(dot)com>, Amit Khandekar <amitdkhan(dot)pg(at)gmail(dot)com>, fabriziomello(at)gmail(dot)com, tushar <tushar(dot)ahuja(at)enterprisedb(dot)com>, Rahila Syed <rahila(dot)syed(at)2ndquadrant(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Minimal logical decoding on standbys |
Date: | 2023-04-07 15:47:57 |
Message-ID: | 20230407154757.ywqnldz4nsycap3g@awork3.anarazel.de |
Lists: | pgsql-hackers |
Hi,

On 2023-04-07 17:13:13 +0200, Drouvot, Bertrand wrote:
> On 4/7/23 9:50 AM, Andres Freund wrote:
> > I added a check for !invalidated to
> > ReplicationSlotsComputeRequiredLSN() etc.
> >
>
> looked at 65-0001 and it looks good to me.
>
> > Added new patch moving checks for invalid logical slots into
> > CreateDecodingContext(). Otherwise we end up with 5 or so checks, which makes
> > no sense. As far as I can tell the old message in
> > pg_logical_slot_get_changes_guts() was bogus, one couldn't get there having
> > "never previously reserved WAL"
> >
>
> looked at 65-0002 and it looks good to me.
>
> > Split "Handle logical slot conflicts on standby." into two. I'm not sure that
> > should stay that way, but it made it easier to hack on
> > InvalidateObsoleteReplicationSlots.
> >
>
> looked at 65-0003 and the others.
Thanks for checking!
> > Todo:
> > - write a test that invalidated logical slots stay invalidated across a restart
>
> Done in 65-66-0008 attached.
Cool.
> > - write a test that invalidated logical slots do not lead to retaining WAL
>
> I'm not sure how to do that since pg_switch_wal() and friends can't be executed on
> a standby.
You can run it on the primary and wait for the standby to have replayed the
resulting records.
> > - Further evolve the API of InvalidateObsoleteReplicationSlots()
> > - pass in the ReplicationSlotInvalidationCause we're trying to conflict on?
> > - rename xid to snapshotConflictHorizon, that'd be more in line with the
> > ResolveRecoveryConflictWithSnapshot and easier to understand, I think
> >
>
> Done. The new API can be found in v65-66-InvalidateObsoleteReplicationSlots_API.patch
> attached. It propagates the cause to InvalidatePossiblyObsoleteSlot() where a switch/case
> can now be used.
Integrated. I moved the cause to the first argument; that makes more sense
to me.
> The "default" case does not emit an error since this code runs as part
> of checkpoint.
I made it an error: if that ever happens, it's a programming error, not some
data-level inconsistency.
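A minimal standalone sketch of that shape, not the patch itself: the enum, its
members, the function names and the message texts below are all made up for
illustration. The cause is the first argument, and the default branch is
treated as a bug in the caller. Outside the server there is no elog(), so the
sketch reports the unknown cause through a return code instead.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical stand-in for ReplicationSlotInvalidationCause. The member
 * names are invented for this sketch and are not taken from the patch.
 */
typedef enum DemoInvalidationCause
{
    DEMO_CAUSE_WAL_REMOVED,
    DEMO_CAUSE_SNAPSHOT_CONFLICT
} DemoInvalidationCause;

/*
 * Map a cause to a short description. Returns NULL for a value that is not
 * a known cause, so the caller can treat it as a programming error.
 */
const char *
demo_cause_message(DemoInvalidationCause cause)
{
    switch (cause)
    {
        case DEMO_CAUSE_WAL_REMOVED:
            return "required WAL was removed";
        case DEMO_CAUSE_SNAPSHOT_CONFLICT:
            return "conflict with snapshot horizon";
        default:
            /* not a data-level condition: the caller passed a bogus cause */
            return NULL;
    }
}

/*
 * Cause comes first, then the slot. Writes a log-style line into buf.
 * Returns 0 on success and -1 for an unknown cause; in the server the
 * latter would be a hard error rather than a return code.
 */
int
demo_report_invalidation(DemoInvalidationCause cause, const char *slot_name,
                         char *buf, size_t buflen)
{
    const char *msg;

    assert(slot_name != NULL && buf != NULL && buflen > 0);

    msg = demo_cause_message(cause);
    if (msg == NULL)
    {
        snprintf(buf, buflen, "unrecognized invalidation cause: %d",
                 (int) cause);
        return -1;
    }

    snprintf(buf, buflen, "invalidating slot \"%s\": %s", slot_name, msg);
    return 0;
}
```

Calling demo_report_invalidation() with a value outside the enum takes the
default branch and returns -1, which is the case that becomes an error in the
real code.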
> > - The test could stand a bit of cleanup and consolidation
> > - No need to start 4 psql processes to do 4 updates, just do it in one
> > safe_psql()
>
> Right, done in v65-66-0008-New-TAP-test-for-logical-decoding-on-standby.patch attached.
> > - the sequence of drop_logical_slots(), create_logical_slots(),
> > change_hot_standby_feedback_and_wait_for_xmins(), make_slot_active() is
> > repeated quite a few times
>
> grouped in reactive_slots_change_hfs_and_wait_for_xmins() in 65-66-0008 attached.
>
> > - the stats queries checking for specific conflict counts, including
> > preceding tests, is pretty painful. I suggest to reset the stats at the
> > end of the test instead (likely also do the drop_logical_slot() there).
>
> Good idea, done in 65-66-0008 attached.
>
> > - it's hard to correlate postgres log and the tap test, because the slots
> > are named the same across all tests. Perhaps they could have a per-test
> > prefix?
>
> Good point. Done in 65-66-0008 attached. Thanks to that and the stats reset the
> check for invalidation is now done in a single function "check_for_invalidation" that looks
> for invalidation messages in the logfile and in pg_stat_database_conflicts.
>
> Thanks for the suggestions: the TAP test is now easier to read/understand.
Integrated all of these.

I think pg_log_standby_snapshot() should be added in "Allow logical decoding
on standby", not the commit adding the tests.

Is this patchset sufficient to subscribe to a publication on a physical
standby, assuming the publication is created on the primary? If so, we should
have at least a minimal test. If not, we should note that restriction
explicitly.

Greetings,

Andres Freund