From: | Andres Freund <andres(at)anarazel(dot)de> |
---|---|
To: | Gurjeet Singh <gurjeet(at)singh(dot)im> |
Cc: | "Duran, Danilo" <danilod(at)amazon(dot)com>, "robertmhaas(at)gmail(dot)com" <robertmhaas(at)gmail(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: replication slot restart_lsn initialization |
Date: | 2015-06-10 15:07:38 |
Message-ID: | 20150610150738.GD10551@awork2.anarazel.de |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On 2015-06-10 08:00:28 -0700, Gurjeet Singh wrote:
> Attached is the patch that takes the former approach (initialize
> restart_lsn when the slot is created).
If it's an option that's imo a sane approach.
> pg_create_logical_replication_slot() prevents LSN from being
> recycled that by looping (worst case 2 times) until there's no
> conflict with the checkpointer recycling the segment. So I have used
> the same approach.
There's no need to change anything for logical slots? Or do you just
mean that you moved the existing code?
>
> /*
> + * Grab and save an LSN value to prevent WAL recycling past that point.
> + */
> +void
> +ReplicationSlotRegisterRestartLSN()
> +{
> + ReplicationSlot *slot = MyReplicationSlot;
> +
> + Assert(slot != NULL);
> + Assert(slot->data.restart_lsn == InvalidXLogRecPtr);
> +
> + /*
> + * The replication slot mechanism is used to prevent removal of required
> + * WAL. As there is no interlock between this and checkpoints required, WAL
> + * segment could be removed before ReplicationSlotsComputeRequiredLSN() has
> + * been called to prevent that. In the very unlikely case that this happens
> + * we'll just retry.
> + */
> + while (true)
> + {
> + XLogSegNo segno;
> +
> + /*
> + * Let's start with enough information if we can, so log a standby
> + * snapshot and start decoding at exactly that position.
> + */
> + if (!RecoveryInProgress())
> + {
> + XLogRecPtr flushptr;
> +
> + /* start at current insert position */
> + slot->data.restart_lsn = GetXLogInsertRecPtr();
> +
> + /* make sure we have enough information to start */
> + flushptr = LogStandbySnapshot();
> +
> + /* and make sure it's fsynced to disk */
> + XLogFlush(flushptr);
> + }
> + else
> + slot->data.restart_lsn = GetRedoRecPtr();
> +
> + /* prevent WAL removal as fast as possible */
> + ReplicationSlotsComputeRequiredLSN();
> +
> + /*
> + * If all required WAL is still there, great, otherwise retry. The
> + * slot should prevent further removal of WAL, unless there's a
> + * concurrent ReplicationSlotsComputeRequiredLSN() after we've written
> + * the new restart_lsn above, so normally we should never need to loop
> + * more than twice.
> + */
> + XLByteToSeg(slot->data.restart_lsn, segno);
> + if (XLogGetLastRemovedSegno() < segno)
> + break;
> + }
> +}
That doesn't look right to me. Why is this code logging a standby
snapshot for physical slots?
Greetings,
Andres Freund
From | Date | Subject | |
---|---|---|---|
Next Message | Jan Wieck | 2015-06-10 15:12:46 | Re: s_lock() seems too aggressive for machines with many sockets |
Previous Message | Nils Goroll | 2015-06-10 15:06:19 | Re: s_lock() seems too aggressive for machines with many sockets |