Re: Catalog_xmin is not advanced when a logical slot is lost

From: sirisha chamarthi <sirichamarthi22(at)gmail(dot)com>
To: Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>
Cc: Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>, pgsql-hackers(at)lists(dot)postgresql(dot)org, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>
Subject: Re: Catalog_xmin is not advanced when a logical slot is lost
Date: 2022-11-21 18:48:55
Message-ID: CAKrAKeUCcrTqdapRcZ=fO5sYTej9XWopvXHQUZNRjJNQQTQxBg@mail.gmail.com
Lists: pgsql-hackers

On Mon, Nov 21, 2022 at 10:40 AM sirisha chamarthi <
sirichamarthi22(at)gmail(dot)com> wrote:

>
>
> On Mon, Nov 21, 2022 at 10:11 AM Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>
> wrote:
>
>> On 2022-Nov-21, sirisha chamarthi wrote:
>>
>> > It appears to be. wal_sender is setting restart_lsn to a valid LSN even
>> > when the slot is invalidated.
>>
>> > postgres(at)pgvm:~$ /usr/local/pgsql/bin/pg_receivewal -S s1 -D .
>> > pg_receivewal: error: unexpected termination of replication stream: ERROR: requested WAL segment 0000000100000000000000EB has already been removed
>> > pg_receivewal: disconnected; waiting 5 seconds to try again
>> > ^Cpostgres(at)pgvm:~$ /usr/local/pgsql/bin/psql
>> > psql (16devel)
>> > Type "help" for help.
>> >
>> > postgres=# select * from pg_replication_slots;
>> > server closed the connection unexpectedly
>> > This probably means the server terminated abnormally
>> > before or while processing the request.
>>
>> Whoa, I cannot reproduce this :-(
>>
>
> I have an old .partial file in the data directory to reproduce this.
>
> postgres=# select * from pg_replication_slots;
>  slot_name | plugin | slot_type | datoid | database | temporary | active | active_pid | xmin | catalog_xmin | restart_lsn | confirmed_flush_lsn | wal_status | safe_wal_size | two_phase
> -----------+--------+-----------+--------+----------+-----------+--------+------------+------+--------------+-------------+---------------------+------------+---------------+-----------
>  s2        |        | physical  |        |          | f         | f      |            |      |              | 2/DC000000  |                     | lost       |               | f
> (1 row)
>
> postgres=# \q
> postgres(at)pgvm:~$ ls
> 0000000100000002000000D8 0000000100000002000000D9
> 0000000100000002000000DA 0000000100000002000000DB
> 0000000100000002000000DC.partial
>

Just to be clear, it was hitting the assert I added in slotfuncs.c, not
the code you mentioned. Apologies for the confusion. Also, it appears that
in the case above the slot has not been invalidated yet, because the
checkpointer has not run, even though wal_status reports it as lost.
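
For anyone matching the restart_lsn above (2/DC000000) against the files in
the directory listing, here is a minimal sketch of the LSN-to-segment-name
arithmetic. It assumes the default 16MB wal_segment_size and timeline 1;
wal_file_name is a hypothetical helper written for illustration, not a
PostgreSQL API.

```python
# Assumption: default wal_segment_size of 16MB; adjust if the cluster
# was initialized with a different --wal-segsize.
WAL_SEGMENT_SIZE = 16 * 1024 * 1024


def wal_file_name(lsn: str, timeline: int = 1,
                  seg_size: int = WAL_SEGMENT_SIZE) -> str:
    """Return the WAL segment file name that contains the given LSN.

    lsn is in the textual "hi/lo" hex form shown by pg_replication_slots.
    """
    hi, lo = lsn.split("/")
    pos = (int(hi, 16) << 32) | int(lo, 16)   # 64-bit byte position
    segno = pos // seg_size                   # absolute segment number
    segs_per_id = 0x100000000 // seg_size     # segments per 4GB "log" id
    return "%08X%08X%08X" % (timeline,
                             segno // segs_per_id,
                             segno % segs_per_id)


print(wal_file_name("2/DC000000"))
```

Under those assumptions, 2/DC000000 falls in segment
0000000100000002000000DC, which is the .partial file in the listing.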

>
>
>>
>> --
>> Álvaro Herrera Breisgau, Deutschland —
>> https://www.EnterpriseDB.com/
>> "Java is clearly an example of money oriented programming" (A. Stepanov)
>>
>
