pgsql: Fix consistency issues with replication slot copy

From: Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>
To: pgsql-committers(at)lists(dot)postgresql(dot)org
Subject: pgsql: Fix consistency issues with replication slot copy
Date: 2020-03-17 19:22:17
Message-ID: E1jEHn7-0000pM-SZ@gemulon.postgresql.org
Lists: pgsql-committers

Fix consistency issues with replication slot copy

Commit 9f06d79ef831's replication slot copying failed to
properly reserve the WAL that the slot is expecting to see
during DecodingContextFindStartpoint (to set the confirmed_flush
LSN), so concurrent activity could remove that WAL and cause the
copy process to error out. But it doesn't actually *need* that
WAL anyway: instead of running decode to find confirmed_flush, it
can be copied from the source slot. Fix this by rearranging things
to avoid DecodingContextFindStartpoint() (leaving the target slot's
confirmed_flush_lsn invalid), and set it afterwards by copying the
source slot's value.

Also ensure the source slot's confirmed_flush_lsn is valid.
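
As a rough illustration of the ordering described above, here is a toy
model in Python. It is not the code in slotfuncs.c: the Slot class, the
copy_logical_slot() function and the INVALID_LSN constant are
hypothetical names invented for this sketch, and the locking, WAL
reservation and persistence done by the real implementation are left
out. It only shows the two points the message makes: the source slot's
confirmed_flush must be valid, and the target gets its confirmed_flush
by copying rather than by decoding.

```python
from dataclasses import dataclass

# Hypothetical stand-in for an invalid LSN; an assumption of this sketch.
INVALID_LSN = 0


@dataclass
class Slot:
    """Toy model of a logical replication slot (illustrative only)."""
    name: str
    restart_lsn: int = INVALID_LSN
    confirmed_flush: int = INVALID_LSN


def copy_logical_slot(src: Slot, dst_name: str) -> Slot:
    """Copy a slot without decoding any WAL to find a start point."""
    # The source slot must already have a valid confirmed_flush.
    if src.confirmed_flush == INVALID_LSN:
        raise ValueError(f"slot {src.name!r} has no valid confirmed_flush")

    # Create the target with confirmed_flush left invalid: no decoding
    # step runs here, so no extra WAL has to be kept around for it.
    dst = Slot(name=dst_name)

    # Set the positions afterwards by copying the source slot's values.
    dst.restart_lsn = src.restart_lsn
    dst.confirmed_flush = src.confirmed_flush
    return dst
```

For example, copying a slot whose restart_lsn is 100 and whose
confirmed_flush is 150 yields a target with the same two values, while a
source whose confirmed_flush is still invalid is rejected.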

Reported-by: Arseny Sher
Author: Masahiko Sawada, Arseny Sher
Discussion: https://postgr.es/m/871rr3ohbo.fsf@ars-thinkpad

Branch
------
master

Details
-------
https://git.postgresql.org/pg/commitdiff/bcd1c3630095e48bc3b1eb0fc8e8c8a7c851eba1

Modified Files
--------------
src/backend/replication/logical/logical.c | 2 ++
src/backend/replication/slotfuncs.c | 46 ++++++++++++++++++++++++++-----
2 files changed, 41 insertions(+), 7 deletions(-)
