Re: Issue with logical replication

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Kacey Holston <kacey(dot)holston(at)pgexperts(dot)com>
Cc: PostgreSQL mailing lists <pgsql-bugs(at)lists(dot)postgresql(dot)org>
Subject: Re: Issue with logical replication
Date: 2021-08-09 08:58:12
Message-ID: CAA4eK1KA-Ld5n+6ir_iC7Su7z2J001iyOvzZyOmWbqj=c+Fm6A@mail.gmail.com
Lists: pgsql-bugs

On Mon, Aug 9, 2021 at 1:58 AM Kacey Holston
<kacey(dot)holston(at)pgexperts(dot)com> wrote:
>
> I ran into an issue of missing rows during an upgrade from PostgreSQL 11 to PostgreSQL 13 using in-core logical replication.
>
> ## Procedure Used:
>
> After copying over all the users and schema I ran this on the publisher:
>
> "CREATE PUBLICATION upgrade11to13 FOR ALL TABLES;"
>
> Then ran this on the subscriber:
>
> "CREATE SUBSCRIPTION server13 CONNECTION 'dbname=<DBNAME redacted> hostaddr=<ADDRESS redacted > user=<USER redacted > password=<PASSWORD redacted > port=5432' PUBLICATION upgrade11to13;"
>
> The initial copy and replication appeared to run as expected. I saw all the tables in the ready state in pg_subscription_rel. The pg_stat_replication and pg_stat_subscription views looked as expected as well.
>
> We then reversed the direction of replication by running this on the PostgreSQL 13 server:
>
> "DROP SUBSCRIPTION server13;"
>
> And this on the PostgreSQL 11 server:
>
> "DROP PUBLICATION upgrade11to13;"
>
> Back on the PostgreSQL 13 server:
>
> "CREATE PUBLICATION reverse_to_11 FOR ALL TABLES;"
>
> And on the PostgreSQL 11 server:
>
> "CREATE SUBSCRIPTION oldserver11 CONNECTION 'dbname=<DBNAME redacted> hostaddr=<ADDRESS redacted > user=<USER redacted > password=<PASSWORD redacted > port=5432' PUBLICATION reverse_to_11 with (copy_data=false);"
>
..
>
> We were able to note that the data loss appeared to begin at approximately the start of the table’s copy, but it ended before the table finished copying. The table took about 12 hours to copy, while the data loss covered only about three hours.
>

It is not clear from the above description whether the data was missing
on the PG-11 server or on the PG-13 server. If it is missing from PG-11,
then how is it related to the initial copy? You specified
(copy_data=false) while creating the subscription on 11, so it should
not have performed an initial sync.
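
To narrow this down, it may help to look at the per-table sync state on
the subscriber. The following is only a minimal sketch, assuming it is
run on the subscriber in question (for example, the PG-11 server after
the reversal) by a user who can read pg_subscription; the column alias
table_name is just illustrative.

```sql
-- Per-table synchronization state for every subscription on this node.
-- srsubstate: i = initialize, d = data is being copied,
--             s = synchronized, r = ready (normal replication).
SELECT s.subname,
       sr.srrelid::regclass AS table_name,
       sr.srsubstate,
       sr.srsublsn
FROM pg_subscription_rel AS sr
JOIN pg_subscription AS s ON s.oid = sr.srsubid
ORDER BY s.subname, sr.srrelid;
```

With copy_data=false, I would expect the tables to be listed as 'r'
right away, without passing through 'd'.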

--
With Regards,
Amit Kapila.
