Re: BUG #18267: Logical replication bug: data is not synchronized after Alter Publication.

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: sytoptimisprime(at)163(dot)com, pgsql-bugs(at)lists(dot)postgresql(dot)org
Subject: Re: BUG #18267: Logical replication bug: data is not synchronized after Alter Publication.
Date: 2024-01-04 06:27:42
Message-ID: CAA4eK1LDhzMy69s-ZaAMMenZNHsdzCfgOEN=VY9enGFRPu10Xg@mail.gmail.com
Lists: pgsql-bugs

On Wed, Jan 3, 2024 at 9:51 PM PG Bug reporting form
<noreply(at)postgresql(dot)org> wrote:
>
> The following bug has been logged on the website:
>
> Bug reference: 18267
> Logged by: song yutao
> Email address: sytoptimisprime(at)163(dot)com
> PostgreSQL version: 15.5
> Operating system: Linux
> Description:
>
> Hi hackers, I found that when a large amount of data is being inserted into
> a table and the table is added to a publication (through ALTER PUBLICATION)
> at the same time, it is likely that the incremental data is not synchronized
> to the subscriber. Here is my test method:
>
> 1. On publisher and subscriber, create table for test:
> CREATE TABLE tab_1 (a int);
>
> 2. Setup logical replication:
> on publisher:
> SELECT pg_create_logical_replication_slot('slot1', 'pgoutput', false, false);
> CREATE PUBLICATION tap_pub;
> on subscriber:
> CREATE SUBSCRIPTION tap_sub CONNECTION '$publisher_connstr' PUBLICATION
> tap_pub WITH (enabled = true, create_slot = false, slot_name = 'slot1');
>
> 3. Perform Insert:
> for (my $i = 1; $i <= 1000; $i++) {
>     $node_publisher->safe_psql('postgres',
>         "INSERT INTO tab_1 SELECT generate_series(1, 1000)");
> }
> Each transaction contains 1000 insertions, and there are 1000 transactions
> in total.
>
> 4. While step 3 is running, add table tab_1 to the publication:
> on publisher:
> ALTER PUBLICATION tap_pub ADD TABLE tab_1;
> on subscriber:
> ALTER SUBSCRIPTION tap_sub REFRESH PUBLICATION;
>
> The root cause of the problem is as follows:
> pgoutput relies on the invalidation mechanism to validate publications. When
> the walsender decodes an ALTER PUBLICATION transaction, catalog caches are
> invalidated at once. Furthermore, since pg_publication_rel is modified,
> snapshot changes are added to all transactions currently being decoded. For
> those other transactions, the catalog caches have already been invalidated,
> but it is likely that their snapshot changes have not yet been decoded. In
> the pgoutput implementation, these transactions query the system table
> pg_publication_rel to determine whether to publish the changes they contain.
> In this case, the catalog tuples are not found because the snapshot has not
> been updated. As a result, the changes in these transactions are considered
> not to be published, and subsequent data is not synchronized.
>
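
A toy model may make the sequence described above easier to follow. It is
only a sketch of the reporter's description, not PostgreSQL code: ToyDecoder,
its fields, and run() are hypothetical names, and the real walsender, reorder
buffer, and pgoutput caches are far more involved. It assumes that a
per-relation "publish?" answer is cached until the next invalidation.

```python
# Toy model only: every name here is hypothetical, not a PostgreSQL internal.
from dataclasses import dataclass, field


@dataclass
class ToyDecoder:
    published_old: set = field(default_factory=set)  # catalog seen by the old snapshot
    published_new: set = field(default_factory=set)  # catalog after ALTER PUBLICATION
    snapshot_updated: bool = False                   # queued snapshot change reached yet?
    cache: dict = field(default_factory=dict)        # per-relation "publish?" answers

    def invalidate(self):
        """Drop all cached answers (stands in for a catalog cache invalidation)."""
        self.cache.clear()

    def apply_snapshot_change(self):
        """The transaction reaches the snapshot change that was queued for it."""
        self.snapshot_updated = True

    def should_publish(self, rel):
        """Answer from the cache, rebuilding it from whichever catalog is visible."""
        if rel not in self.cache:
            catalog = self.published_new if self.snapshot_updated else self.published_old
            self.cache[rel] = rel in catalog
        return self.cache[rel]


def run(reinvalidate_after_snapshot):
    d = ToyDecoder(published_new={"tab_1"})
    d.invalidate()                           # ALTER PUBLICATION decoded: caches dropped at once
    results = [d.should_publish("tab_1")]    # change handled before the snapshot change
    d.apply_snapshot_change()
    if reinvalidate_after_snapshot:
        d.invalidate()                       # extra invalidation after the snapshot change
    results.append(d.should_publish("tab_1"))  # change handled after the snapshot change
    return results
```

In this model, run(False) caches a stale "not published" answer before the
snapshot change is reached and never recomputes it, which matches the
reported symptom. run(True) adds an invalidation after the snapshot change,
which corresponds loosely to what the report goes on to propose.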

As per my understanding, we distribute the snapshot to other transactions
at commit time (LSN), which in your case means at the commit of
"ALTER PUBLICATION tap_pub ADD TABLE tab_1". So any changes after that
point should see the new contents of pg_publication_rel.
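
To illustrate the expectation stated above, here is a minimal sketch that
treats the visible publication contents as a function of the change's LSN
relative to the ALTER PUBLICATION commit LSN. The function name, the data
layout, and the LSN values are all made up for illustration; this is not how
the snapshot builder is implemented.

```python
# Illustrative sketch only: names, data layout, and LSN values are hypothetical.
from bisect import bisect_right


def visible_tables(alter_commits, change_lsn):
    """Return the published tables a change at change_lsn would see.

    alter_commits is a list of (commit_lsn, frozenset_of_tables) sorted by
    commit_lsn. A change sees the result of the latest commit that is not
    after it, or an empty publication if there is no such commit.
    """
    commit_lsns = [lsn for lsn, _ in alter_commits]
    i = bisect_right(commit_lsns, change_lsn)
    return alter_commits[i - 1][1] if i else frozenset()


# One ALTER PUBLICATION ... ADD TABLE tab_1 committing at a made-up LSN of 500.
commits = [(500, frozenset({"tab_1"}))]
before = visible_tables(commits, 400)  # change before the commit LSN
after = visible_tables(commits, 600)   # change after the commit LSN
```

Here "before" is empty and "after" contains tab_1, which is the behaviour
expected for changes on either side of the commit.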

> I think it's necessary to add invalidations to other transactions after
> adding a snapshot change to them.
> Therefore, I submitted a patch for this bug.
>

Sorry, I didn't understand your proposal and I don't see any patch
attached as you are claiming in the last sentence.

--
With Regards,
Amit Kapila.
