Re: BUG #18789: logical replication slots are deleted after failovers

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
Cc: sachinkonde3(at)gmail(dot)com, pgsql-bugs(at)lists(dot)postgresql(dot)org
Subject: Re: BUG #18789: logical replication slots are deleted after failovers
Date: 2025-02-01 05:35:44
Message-ID: CAA4eK1LB5a-923n7NGd5FcQaEe=0WmxLvKNU+PdiiskZoqAm-A@mail.gmail.com
Lists: pgsql-bugs

On Thu, Jan 30, 2025 at 12:44 AM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
>
> On Wed, Jan 29, 2025 at 7:01 AM PG Bug reporting form
> <noreply(at)postgresql(dot)org> wrote:
> >
> > The following bug has been logged on the website:
> >
> > Bug reference: 18789
> > Logged by: Sachin Konde-Deshmukh
> > Email address: sachinkonde3(at)gmail(dot)com
> > PostgreSQL version: 17.2
> > Operating system: Oracle Linux 8.9
> > Description:
> >
> > We are using 2 node PostgreSQL 17 HA setup using Patroni 4.0.4.
> > When I do failover 2nd or third time or more than once, it fails to transfer
> > or move logical replication slot to new Primary.
> > postgres=# select slot_name,slot_type, failover,
> > synced,confirmed_flush_lsn,active from pg_replication_slots;
> >      slot_name      | slot_type | failover | synced | confirmed_flush_lsn | active
> > --------------------+-----------+----------+--------+---------------------+--------
> >  psoel89pgcluster01 | physical  | f        | f      |                     | t
> >  mysub              | logical   | t        | t      | 0/4000AB8           | t
> > (2 rows)
>
> I guess that this is the list of slots on the primary.
>
> > After First Failover -->
> > postgres=# select slot_name,slot_type, failover,
> > synced,confirmed_flush_lsn,active from pg_replication_slots;
> >      slot_name      | slot_type | failover | synced | confirmed_flush_lsn | active
> > --------------------+-----------+----------+--------+---------------------+--------
> >  psoel89pgcluster02 | physical  | f        | f      |                     | t
> >  mysub              | logical   | f        | f      | 0/50001E0           | t
> > (2 rows)
>
> I guess that this is the list of slots on the new primary after a
> failover.
>

This data looks suspicious to me because we shouldn't be changing
the 'failover' property of the slot, yet for 'mysub' it goes from 't'
to 'f' between the two results. Also, note that the physical slot
names differ in the two results above (psoel89pgcluster01 vs.
psoel89pgcluster02). Either the user has changed the 'failover'
property of the slot, or the data comes from two completely different
clusters: one where the 'failover' property of the slot is enabled and
another where it is not.
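
To tell these two cases apart, queries along the following lines could
be run on each node and on the subscriber. This is only a sketch: it
assumes the PostgreSQL 17 catalogs, and the subscription name 'mysub' is
an assumption based on the slot name in the report.

```sql
-- On each PostgreSQL node: is it currently a standby, and which cluster
-- does it belong to? A primary and its physical standbys report the
-- same system_identifier.
SELECT pg_is_in_recovery() AS is_standby,
       system_identifier
FROM pg_control_system();

-- On each PostgreSQL node: slot properties, as in the report.
SELECT slot_name, slot_type, failover, synced,
       confirmed_flush_lsn, active
FROM pg_replication_slots;

-- On the subscriber: was the subscription created with failover enabled?
-- ('mysub' is assumed to be the subscription name.)
SELECT subname, subslotname, subfailover
FROM pg_subscription
WHERE subname = 'mysub';
```

If system_identifier differs between the two outputs, they come from
different clusters; if it matches, the change in 'failover' would need
another explanation.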

--
With Regards,
Amit Kapila.
