Re: logical replication: restart_lsn can go backwards (and more), seems broken since 9.4

From: Tomas Vondra <tomas(at)vondra(dot)me>
To: Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>
Cc: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: logical replication: restart_lsn can go backwards (and more), seems broken since 9.4
Date: 2024-11-13 11:55:01
Message-ID: 7c41cd48-901e-49b7-851e-e7bdbcf84f4b@vondra.me
Lists: pgsql-hackers

On 11/13/24 05:38, Ashutosh Bapat wrote:
> ...
>
> Here's a way we can fix SnapBuildProcessRunningXacts(), similar to
> DecodeCommit(). DecodeCommit() uses SnapBuildXactNeedsSkip() to decide
> whether a given transaction should be decoded or not.
>
> /*
>  * Should the contents of transaction ending at 'ptr' be decoded?
>  */
> bool
> SnapBuildXactNeedsSkip(SnapBuild *builder, XLogRecPtr ptr)
> {
>     return ptr < builder->start_decoding_at;
> }
>
> Similar to SnapBuild::start_decoding_at, we could maintain a field
> SnapBuild::start_reading_at holding the LSN from which the WAL sender
> would start reading WAL. If the candidate_restart_lsn produced by a
> running-transactions WAL record is less than SnapBuild::start_reading_at,
> SnapBuildProcessRunningXacts() won't call
> LogicalIncreaseRestartDecodingForSlot() with that candidate LSN. We
> won't access the slot here, and the solution will be in line with
> DecodeCommit(), which skips such transactions.
>

Could you maybe write a patch doing this? That would allow proper
testing etc.
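
For illustration only, the proposed check could look roughly like the
standalone sketch below. It is a toy model, not a patch and not code from
the PostgreSQL tree: the start_reading_at field, the
SnapBuildRestartCandidateNeedsSkip() helper, the reduced SnapBuild struct,
and the mock standing in for LogicalIncreaseRestartDecodingForSlot() are
all assumptions made up here, so that the behaviour can be compiled and
checked in isolation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for PostgreSQL's XLogRecPtr (a 64-bit WAL position). */
typedef uint64_t XLogRecPtr;

/* Reduced model of SnapBuild; only the two fields discussed in the thread. */
typedef struct SnapBuild
{
	XLogRecPtr	start_decoding_at;	/* existing field */
	XLogRecPtr	start_reading_at;	/* ASSUMPTION: proposed field, does not exist yet */
} SnapBuild;

/* Existing check, as quoted in the proposal. */
bool
SnapBuildXactNeedsSkip(SnapBuild *builder, XLogRecPtr ptr)
{
	return ptr < builder->start_decoding_at;
}

/*
 * HYPOTHETICAL helper mirroring SnapBuildXactNeedsSkip(): should a restart
 * LSN candidate be ignored because it lies before the point where the WAL
 * sender started reading?
 */
bool
SnapBuildRestartCandidateNeedsSkip(SnapBuild *builder, XLogRecPtr candidate)
{
	assert(builder != NULL);
	return candidate < builder->start_reading_at;
}

/* Mock slot state, so the sketch runs without a replication slot. */
static XLogRecPtr mock_candidate_restart_lsn = 0;
static int	mock_calls = 0;

/* Mock standing in for LogicalIncreaseRestartDecodingForSlot(). */
void
mock_increase_restart_decoding(XLogRecPtr restart_lsn)
{
	mock_candidate_restart_lsn = restart_lsn;
	mock_calls++;
}

/*
 * Sketch of the relevant step in SnapBuildProcessRunningXacts(): candidates
 * older than start_reading_at never reach the slot, so they cannot move
 * restart_lsn backwards.
 */
void
process_running_xacts_sketch(SnapBuild *builder, XLogRecPtr candidate)
{
	if (SnapBuildRestartCandidateNeedsSkip(builder, candidate))
		return;

	mock_increase_restart_decoding(candidate);
}
```

With start_reading_at set to 100, a candidate of 50 is dropped without
touching the (mock) slot, while candidates of 100 and above are passed
through. Whether the comparison is the right one at the boundary, and where
start_reading_at would be initialized, are exactly the things a real patch
and tests would have to settle.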

regards

--
Tomas Vondra
