Taking into account syncrep position in flush_lsn reported by apply worker

From: Arseny Sher <ars(at)neon(dot)tech>
To: pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Taking into account syncrep position in flush_lsn reported by apply worker
Date: 2024-08-12 09:41:55
Message-ID: 2e874ff7-3aa8-47b4-9a3f-890ab47da581@neon.tech
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hey. Currently synchronous_commit is by default disabled for logical
apply worker on the
ground that reported flush_lsn includes only locally flushed data so slot
(publisher) preserves everything higher than this, and so in case of
subscriber
restart no data is lost. However, imagine that subscriber is made highly
available by standby to which synchronous replication is enabled. Then
reported
flush_lsn is ignorant of this synchronous replication progress, and in
case of
failover data loss may occur if subscriber managed to ack flush_lsn ahead of
syncrep. Moreover, it is almost silent due to this

    else if (start_lsn < slot->data.confirmed_flush)
    {
        /*
         * It might seem like we should error out in this case, but it's
         * pretty common for a client to acknowledge a LSN it doesn't
have to
         * do anything for, and thus didn't store persistently, because the
         * xlog records didn't result in anything relevant for logical
         * decoding. Clients have to be able to do that to support
synchronous
         * replication.
         *
         * Starting at a different LSN than requested might not catch
certain
         * kinds of client errors; so the client may wish to check that
         * confirmed_flush_lsn matches its expectations.
         */
        elog(LOG, "%X/%X has been already streamed, forwarding to %X/%X",
             LSN_FORMAT_ARGS(start_lsn),
             LSN_FORMAT_ARGS(slot->data.confirmed_flush));

        start_lsn = slot->data.confirmed_flush;
    }

in logical.c

Attached draft patch fixes this by taking into account syncrep progress in

worker reporting.

-- cheers, arseny

Attachment Content-Type Size
apply_worker_report_syncrep_pos.patch text/x-patch 1.0 KB

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Heikki Linnakangas 2024-08-12 09:55:00 Re: Refactoring postmaster's code to cleanup after child exit
Previous Message jian he 2024-08-12 09:20:00 Re: Detailed release notes