Question about StartLogicalReplication() error path

From: Jeff Davis <pgsql(at)j-davis(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Cc: Robert Haas <rhaas(at)postgresql(dot)org>
Subject: Question about StartLogicalReplication() error path
Date: 2021-06-11 02:08:10
Message-ID: c5c861d576f2511732f8002c76245da587110b1c.camel@j-davis.com
Lists: pgsql-hackers

StartLogicalReplication() calls CreateDecodingContext(), which says:

    else if (start_lsn < slot->data.confirmed_flush)
    {
        /*
         * It might seem like we should error out in this case, but it's
         * pretty common for a client to acknowledge a LSN it doesn't have to
         * do anything for, and thus didn't store persistently, because the
         * xlog records didn't result in anything relevant for logical
         * decoding. Clients have to be able to do that to support synchronous
         * replication.
         */
        ...
        start_lsn = slot->data.confirmed_flush;
    }

But what about LSNs that are way in the past? Physical replication will
throw an error in that case (e.g. "requested WAL segment %s has already
been removed"), but StartLogicalReplication() ends up just starting
from confirmed_flush, which doesn't seem right.
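
To make the contrast concrete, here is a minimal standalone sketch of the two behaviors as I read them. It is not PostgreSQL code: the function names and the oldest_available parameter are hypothetical, and XLogRecPtr is modeled as a plain 64-bit integer.

```c
#include <stdint.h>

/* Stand-in for PostgreSQL's XLogRecPtr; an assumption for this sketch. */
typedef uint64_t XLogRecPtr;

/*
 * Hypothetical model of the logical path: a start point older than
 * confirmed_flush is silently moved forward to confirmed_flush.
 */
XLogRecPtr
logical_effective_start(XLogRecPtr start_lsn, XLogRecPtr confirmed_flush)
{
    if (start_lsn < confirmed_flush)
        return confirmed_flush;
    return start_lsn;
}

/*
 * Hypothetical model of the physical path: a start point older than the
 * oldest WAL still available is rejected. Returns 0 if the request can be
 * served, -1 if it would fail (the "already been removed" case).
 */
int
physical_check_start(XLogRecPtr start_lsn, XLogRecPtr oldest_available)
{
    if (start_lsn < oldest_available)
        return -1;
    return 0;
}
```

In this model, a request for an LSN far behind the slot returns confirmed_flush from logical_effective_start() with no indication to the caller, while the analogous physical request returns -1.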

I'm not sure I understand the comment overall. Why would the client
request something that it has already acknowledged, and why would the
server override that and just advance to confirmed_flush?

Regards,
Jeff Davis
