From: | Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> |
---|---|
To: | Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> |
Cc: | Shlok Kyal <shlok(dot)kyal(dot)oss(at)gmail(dot)com>, "Zhijie Hou (Fujitsu)" <houzj(dot)fnst(at)fujitsu(dot)com>, vignesh C <vignesh21(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
Subject: | Re: Restrict copying of invalidated replication slots |
Date: | 2025-02-25 10:36:32 |
Message-ID: | CAA4eK1JjPx2vVDrEBQFKgMmq6uK20rUVevSeyJ=UVmHvRrEEjw@mail.gmail.com |
Lists: | pgsql-hackers |
On Tue, Feb 25, 2025 at 1:03 AM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
>
> I've checked if this issue exists also on v15 or older, but IIUC it
> doesn't exist, fortunately. Here is the summary:
>
> Scenario-1: the source gets invalidated before the copy function
> fetches its contents for the first time. In this case, since the
> source slot's restart_lsn is already an invalid LSN it raises an error
> appropriately. In v15, we have only one slot invalidation reason, WAL
> removal, therefore we always reset the slot's restart_lsn to
> InvalidXLogRecPtr.
>
> Scenario-2: the source gets invalidated before the copied slot is
> created (i.e., between first content copy and
> create_logical/physical_replication_slot()). In this case, the copied
> slot could have a valid restart_lsn value that nevertheless points to
> a WAL segment that has already been removed. However, since
> copy_restart_lsn will be an invalid LSN (=0), it's caught by the
> following if condition:
>
>     if (copy_restart_lsn < src_restart_lsn ||
>         src_islogical != copy_islogical ||
>         strcmp(copy_name, NameStr(*src_name)) != 0)
>         ereport(ERROR,
>                 (errmsg("could not copy replication slot \"%s\"",
>                         NameStr(*src_name)),
>                  errdetail("The source replication slot was modified incompatibly during the copy operation.")));
>
> Scenario-3: the source gets invalidated after creating the copied slot
> (i.e. after create_logical/physical_replication_slot()). In this case,
> since the newly copied slot has the same restart_lsn as the source
> slot, both slots are invalidated.
>
Which part of the code will cover Scenario-3? Shouldn't we raise an
ERROR for Scenario-3 as well?
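
To make the question concrete, here is a minimal standalone sketch of the quoted condition. It is not PostgreSQL code: the XLogRecPtr typedef is a stand-in, and copy_check_fails() is a hypothetical helper that only mirrors the three comparisons. It assumes, as the summary describes, that copy_restart_lsn has become 0 in Scenario-2, and that in Scenario-3 the invalidation happens only after the source has been re-read, so both values are still equal at the time of the check.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Stand-ins for the PostgreSQL definitions, for illustration only. */
typedef uint64_t XLogRecPtr;
#define InvalidXLogRecPtr ((XLogRecPtr) 0)

/*
 * Hypothetical helper: returns true when the quoted "if" condition
 * would be satisfied, i.e. when the ERROR would be raised.
 */
static bool
copy_check_fails(XLogRecPtr src_restart_lsn, XLogRecPtr copy_restart_lsn,
                 bool src_islogical, bool copy_islogical,
                 const char *src_name, const char *copy_name)
{
    return copy_restart_lsn < src_restart_lsn ||
           src_islogical != copy_islogical ||
           strcmp(copy_name, src_name) != 0;
}
```

Under those assumptions, the helper returns true for Scenario-2 (0 is less than any valid src_restart_lsn) and false for Scenario-3 (equal LSNs, same slot type, same name), which suggests this condition alone would not report Scenario-3.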
--
With Regards,
Amit Kapila.