Re: Replication slot WAL reservation

From: Phillip Diffley <phillip6402(at)gmail(dot)com>
To: Christophe Pettus <xof(at)thebuild(dot)com>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: Replication slot WAL reservation
Date: 2025-03-26 03:56:38
Message-ID: CAGAwPgQ1MgX2s4UgEspCwfKN_3okudzwh7NzpkoJZSA1HnGqYQ@mail.gmail.com
Lists: pgsql-general

> You shouldn't need to manually advance the replication slot.
> The client is also expected to send back regular messages letting the
> publisher / primary know that it has successfully consumed up to a
> particular point

I was thinking of these as the same thing, but it sounds like they are
different. At the moment, the only method I know for letting the
publisher/primary know what has been successfully consumed is
pg_replication_slot_advance. I looked at the message formats
<https://www.postgresql.org/docs/current/protocol-message-formats.html#PROTOCOL-MESSAGE-FORMATS>
and logical replication message formats
<https://www.postgresql.org/docs/current/protocol-logicalrep-message-formats.html>
pages, but I did not see a message type for updating confirmed_flush_lsn or
otherwise letting the publisher/primary know what logs have been
successfully consumed. There is the flush
<https://www.postgresql.org/docs/current/protocol-message-formats.html#PROTOCOL-MESSAGE-FORMATS-FLUSH>
message, but it looks like it only carries a 4-byte int rather than the 8
bytes required for an LSN. Is there a message type that is used to confirm
which logs have been successfully consumed?

> You know that any WAL generated after `confirmed_flush_lsn` is available
> for replay.

The part I am uncertain about is what "after" means here, since LSNs are
not presented in order, and the order of data streamed over the replication
slot does not match the order of the data in the WAL.

I initially (and incorrectly) thought the confirmation order was based on
LSN. So if you confirmed an LSN "x" then all logs with LSN less than "x"
could be released by the publisher/primary. That can't work though since
LSNs are not presented in order by the replication slot. Is there a
monotonically increasing identifier that can be used to identify which logs
come "after" another? Or do you just keep track of the order the
replication slot delivers logs in and not confirm a log until it and all
the logs received before it are processed to the point of being crash-proof?
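
A minimal sketch of the second option described above (confirm a record only
once it and everything delivered before it have been processed). The class
and method names are hypothetical, LSNs are modeled as plain integers, and
each LSN is assumed to be delivered once. Whether the server interprets a
confirmed value this way is exactly the open question, so this only
illustrates the client-side bookkeeping:

```python
from collections import OrderedDict


class DeliveryOrderTracker:
    """Hypothetical helper: tracks records in delivery order and reports
    the last record whose whole delivered prefix has been processed."""

    def __init__(self):
        # Maps LSN -> processed flag, kept in the order records arrived.
        self._pending = OrderedDict()
        self.confirmable = None

    def received(self, lsn):
        """Note that a record with this LSN was delivered by the slot."""
        self._pending[lsn] = False

    def processed(self, lsn):
        """Mark a record as durably processed and return the LSN that is
        now safe to confirm (None if nothing is confirmable yet)."""
        self._pending[lsn] = True
        # Drop the leading run of processed records; the last one dropped
        # is the newest record with no unprocessed predecessor.
        while self._pending:
            first_lsn, done = next(iter(self._pending.items()))
            if not done:
                break
            self._pending.popitem(last=False)
            self.confirmable = first_lsn
        return self.confirmable
```

With records delivered in the order 300, 100, 200, finishing 100 first
confirms nothing, because 300 arrived earlier and is still outstanding.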

On Tue, Mar 25, 2025 at 4:32 PM Christophe Pettus <xof(at)thebuild(dot)com> wrote:

>
>
> > On Mar 25, 2025, at 13:58, Phillip Diffley <phillip6402(at)gmail(dot)com>
> > wrote:
> >
> > Oh I see! I was conflating the data I see coming out of a replication
> > slot with the internal organization of the WAL. I think the more specific
> > question I am trying to answer is, as a consumer of a replication slot, how
> > do I reason about what replication records will be made unavailable when I
> > confirm an LSN? Here I am worried about situations where the replication
> > connection is interrupted or the program processing the records crashes,
> > and we need to replay records that may have been previously sent but were
> > not fully processed.
>
> It's up to the consuming client to keep track of where it is in the WAL
> (using an LSN). When the client connects, it specifies what LSN to start
> streaming at. If that LSN is no longer available, the publisher / primary
> returns an error.
>
> The client shouldn't confirm the flush of an LSN unless it is crash-proof
> to that point, since any WAL before that should be assumed to be
> unavailable.
>
> > For example, are the records sent by a replication slot always sent in
> > the same order such that if I advance the confirmed_flush_lsn of a slot to
> > the LSN of record "A", I will know that any records that had been streamed
> > after record "A" will be replayable?
>
> You know that any WAL generated after `confirmed_flush_lsn` is available
> for replay. That's the oldest LSN that the client can specify on
> connection (although it can specify a later one, if it exists). You
> shouldn't need to manually advance the replication slot. Instead, the
> client specifies where it wants to start when it connects. The client is
> also expected to send back regular messages letting the publisher / primary
> know that it has successfully consumed up to a particular point in the WAL,
> so the publisher / primary knows it can release that WAL information.
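
For reference, the "regular messages" mentioned above appear to correspond to
the Standby Status Update message described on the streaming replication
protocol page of the PostgreSQL documentation (not on the two message-format
pages linked earlier). The sketch below packs its payload as I understand the
layout: a byte 'r', three 8-byte LSNs (written, flushed, applied), the client
clock in microseconds since 2000-01-01 UTC, and a reply-requested byte. The
function names are hypothetical, the payload still has to be wrapped in a
CopyData message by the client library, and the assumption that the flushed
position is what moves confirmed_flush_lsn for a logical slot should be
checked against the documentation:

```python
import struct
from datetime import datetime, timedelta, timezone

# PostgreSQL timestamps count microseconds from this epoch.
PG_EPOCH = datetime(2000, 1, 1, tzinfo=timezone.utc)


def parse_lsn(text):
    """Convert the textual form 'X/Y' (two hex halves) to a 64-bit int."""
    hi, lo = text.split("/")
    return (int(hi, 16) << 32) | int(lo, 16)


def standby_status_update(written, flushed, applied, now=None,
                          reply_requested=False):
    """Build the payload of a Standby Status Update message.

    All three positions are 64-bit LSNs. Assumption: for a logical slot
    the flushed position is the one that matters for WAL retention.
    """
    if now is None:
        now = datetime.now(timezone.utc)
    micros = (now - PG_EPOCH) // timedelta(microseconds=1)
    # '!' selects network byte order with no padding: 1 + 8*4 + 1 = 34 bytes.
    return struct.pack("!cQQQqB", b"r", written, flushed, applied,
                       micros, int(reply_requested))
```

A client would only pass a flushed value once everything up to that position
is crash-proof on its side, as described in the quoted reply.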
