Quick Links

Re: Replication origins and timelines

From:	Craig Ringer <craig(at)2ndquadrant(dot)com>
To:	Andres Freund <andres(at)anarazel(dot)de>
Cc:	PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: Replication origins and timelines
Date:	2017-06-01 01:56:15
Message-ID:	CAMsr+YGVVqF2F6udpfxS9Xhfc+QXzHWEpQiZhXAUyo8QhMz4SQ@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Thread:
Lists:	pgsql-hackers

On 1 June 2017 at 09:23, Andres Freund <andres(at)anarazel(dot)de> wrote:

> Even if we decide this is necessary, I *strongly* suggest trying to get
> the existing standby decoding etc wrapped up before starting something
> nontrival afresh.

Speaking of such, I had a thought about how to sync logical slot state
to physical replicas without requiring all logical downstreams to know
about and be able to connect to all physical replicas. Interested in
your initial reaction. Basically, enlist the walreceiver's help.

Extend the walsender so in physical rep mode it periodically writes
the upstream's logical slot state into the stream as a message with
special lsn 0/0. Then the walreceiver uses that to make decoding calls
on the downstream to advance the downstream logical slots to the new
confirmed_flush_lsn, or hands the info off to a helper proc that does
it. It could set up a decoding context and do it via a proper decoding
session, discarding output, and later we could probably optimise that
decoding session to do even less work than ReorderBufferSkip()ing
xacts.

The alternative at this point since we nixed writing logical slot
state to WAL seems to be a bgworker on the upstream that periodically
writes logical slot state into generic WAL messages in a special
table, then another on the downstream that processes the table and
makes appropriate decoding calls to advance the downstream slot state.
(Safely, not via directly setting catalog_xmin etc). Which is pretty
damn ugly, but has the advantage that it'd work for PITR restores,
unlike the walsender/walreceiver based approach. Failover slots in
extension-space, basically.

I'm really, really not sold on all logical downstreams having to know
about and be able to connect to all physical standbys of the upstream
to maintain slots on them. Some kind of solution that runs entirely on
the standby will be needed. It's more a question of whether it's
something built-in, easy, and nice, or some out of tree extension.

--
Craig Ringer http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

In response to

Re: Replication origins and timelines at 2017-06-01 01:23:25 from Andres Freund

Browse pgsql-hackers by date

	From	Date	Subject
Next Message	Mark Dilger	2017-06-01 02:13:11	Re: pg_class.relpartbound definition overly brittle
Previous Message	Amit Langote	2017-06-01 01:53:23	Re: "create publication..all tables" ignore 'partition not supported' error