Re: Tables getting stuck at 's' state during logical replication

From: Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: Padmavathi G <padma9(dot)9(dot)1999(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Tables getting stuck at 's' state during logical replication
Date: 2023-05-10 13:33:59
Message-ID: CALj2ACVG7bjgEmFiAs+8ZEz4wBm64VGMVYSyLGwv5zKvXcSEmg@mail.gmail.com
Lists: pgsql-hackers

On Tue, May 9, 2023 at 4:38 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
>
> On Fri, May 5, 2023 at 7:27 PM Padmavathi G <padma9(dot)9(dot)1999(at)gmail(dot)com> wrote:
> >
> > Some background on the setup on which I am trying to carry out the upgrade:
> >
> > We have a pod in a Kubernetes cluster which contains the Postgres 11 image. We are following the logical replication process for the upgrade.
> >
> > Steps followed for logical replication:
> >
> > 1. Created a new pod in the same kubernetes cluster with the latest postgres 15 image
> > 2. Created a publication (say publication 1) in the old pod including all tables in a database
> > 3. Created a subscription (say subscription 1) in the new pod for the above mentioned publication
> > 4. When monitoring the subscription via pg_subscription_rel in the subscriber, I noticed that out of 45 tables, 20 were in the 'r' state and 25 were in the 's' state. They remained in the same state for almost 2 days, with no progress. However, the logs showed that the tables in the 's' state also had "synchronization workers for <table_name> finished".
> >
>
> I think the problem happened in this step where some of the tables
> remained in 's' state. It is not clear why this could happen because
> the apply worker should eventually update these relations' state to 'r'. If
> this is reproducible, we can try two things to further investigate the
> issue: (a) Disable and Enable the subscription once and see if that
> helps; (b) try increasing the LOG level to DEBUG2 and see if we get
> any useful LOG message.

It looks like the respective apply workers for the tables whose state
was 's' are not able to get to UpdateSubscriptionRelState with
SUBREL_STATE_READY. In addition to what Amit suggested above, is it
possible for you to reproduce this problem on upstream code (ideally
on HEAD, but PG 15 is enough)? If yes, you can add custom debug
messages to see what's happening.
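
While trying to reproduce, it may help to track which tables are
still short of 'r'. The following is only a minimal sketch, not
something taken from the reported setup: the helper names
(summarize_states, not_ready) and the sample rows are hypothetical,
and the rows are assumed to come from running STATE_QUERY on the
subscriber with whatever client you prefer. The state codes are the
ones documented for pg_subscription_rel.srsubstate.

```python
from collections import defaultdict

# State codes documented for pg_subscription_rel.srsubstate.
SUBREL_STATES = {
    "i": "initialize",
    "d": "data is being copied",
    "f": "finished table copy",
    "s": "synchronized",
    "r": "ready",
}

# Query to run on the subscriber; each result row is (relname, srsubstate).
STATE_QUERY = """
SELECT srrelid::regclass::text AS relname, srsubstate
FROM pg_subscription_rel
ORDER BY srsubstate, relname;
"""


def summarize_states(rows):
    """Group relation names by their srsubstate code (hypothetical helper)."""
    by_state = defaultdict(list)
    for relname, state in rows:
        by_state[state].append(relname)
    return dict(by_state)


def not_ready(rows):
    """Return the sorted names of relations whose state is not 'r'."""
    return sorted(relname for relname, state in rows if state != "r")


if __name__ == "__main__":
    # Hypothetical sample rows standing in for real query output.
    sample = [("orders", "r"), ("items", "s"), ("users", "s")]
    for state, rels in summarize_states(sample).items():
        label = SUBREL_STATES.get(state, "unknown")
        print(f"{state} ({label}): {', '.join(rels)}")
    print("not ready:", not_ready(sample))
```

Running this periodically, before and after disabling and enabling
the subscription, would show whether any table moves from 's' to 'r'.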

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
