Re: Excessive number of replication slots for 12->14 logical replication

From: vignesh C <vignesh21(at)gmail(dot)com>
To: Bowen Shi <zxwsbg12138(at)gmail(dot)com>
Cc: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, "houzj(dot)fnst(at)fujitsu(dot)com" <houzj(dot)fnst(at)fujitsu(dot)com>, Ajin Cherian <itsajin(at)gmail(dot)com>, Peter Smith <smithpb2250(at)gmail(dot)com>, Hubert Lubaczewski <depesz(at)depesz(dot)com>, PostgreSQL mailing lists <pgsql-bugs(at)lists(dot)postgresql(dot)org>
Subject: Re: Excessive number of replication slots for 12->14 logical replication
Date: 2024-01-19 16:40:06
Message-ID: CALDaNm07QRB+F2aq5hvb-MoHrBEWih0HdR0S1-n6PjP2e2SR8w@mail.gmail.com
Lists: pgsql-bugs

On Thu, 18 Jan 2024 at 13:00, Bowen Shi <zxwsbg12138(at)gmail(dot)com> wrote:
>
> Dears,
>
> I encountered a similar problem when I used logical replication to replicate databases from pg 16 to pg 16.
>
> I started 3 subscriptions in parallel, and the subscriber's postgresql.conf is as follows:
> max_replication_slots = 10
> max_sync_workers_per_subscription = 2
>
> However, after 3 minutes, I found three COPY errors on the subscriber:
> "error while shutting down streaming COPY: ERROR: could not find record while sending logically-decoded data: missing contrecord at xxxx/xxxxxxxxx"
> Then, the subscriber began to print a large number of errors: "could not find free replication state slot for replication origin with ID 11, Increase max_replication_slots and try again."
>
> And the publisher was full of pg_xxx_sync_xxxxxxx slots, printing lots of "all replication slots are in use, Free one or increase max_replication_slots."
>
> This issue is very similar to https://www.postgresql.org/message-id/flat/20220714115155.GA5439%40depesz.com . When a table sync worker encounters an error and exits while copying a table, its replication origin is not deleted. New table sync workers then create sync slots on the publisher and exit without dropping them.

I tried various tests with the suggested configuration, but I could
not reproduce this scenario. I was able to simulate the problem with
a lower max_replication_slots value, but the behavior is as expected
in that case.
If you have a test case or logs for this, could you please share
them? That would make it easier to reconstruct the sequence of events
and get a clear picture of what is happening.
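
When collecting that information, it may help to summarize how many
table sync slots each subscription has left on the publisher. The
following is only a minimal sketch, not something taken from the
report: it assumes the slot names follow the pg_xxx_sync_xxxxxxx shape
quoted above, read here as pg_<subscription OID>_sync_<table
OID>_<system identifier>, and the function names count_sync_slots and
over_limit are hypothetical. The input would be the slot_name column
of pg_replication_slots on the publisher.

```python
import re
from collections import Counter

# Assumed name shape (not confirmed by the report):
#   pg_<subscription oid>_sync_<table oid>_<system identifier>
SYNC_SLOT = re.compile(r"^pg_(\d+)_sync_(\d+)_(\d+)$")


def count_sync_slots(slot_names):
    """Return {subscription_oid: number of slots matching the sync pattern}."""
    counts = Counter()
    for name in slot_names:
        match = SYNC_SLOT.match(name)
        if match:
            counts[int(match.group(1))] += 1
    return dict(counts)


def over_limit(slot_names, max_sync_workers_per_subscription):
    """Return subscriptions holding more sync slots than the worker limit.

    More sync slots than concurrent sync workers hints that some slots
    outlived the worker that created them; it is a heuristic, not proof.
    """
    counts = count_sync_slots(slot_names)
    return {
        sub: n
        for sub, n in counts.items()
        if n > max_sync_workers_per_subscription
    }
```

With max_sync_workers_per_subscription = 2 as in the reported
configuration, any subscription returned by over_limit(names, 2) would
be a candidate for the leftover-slot behavior described above.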

Regards,
Vignesh
