Re: long-standing data loss bug in initial sync of logical replication

From: vignesh C <vignesh21(at)gmail(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: Nitin Motiani <nitinmotiani(at)google(dot)com>, Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: long-standing data loss bug in initial sync of logical replication
Date: 2024-07-15 18:12:45
Message-ID: CALDaNm1uEdsdVRQdGPn--gbpzjqCduwe4cVs8t-Azj7bra5rDA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, 15 Jul 2024 at 15:31, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
>
> On Thu, Jul 11, 2024 at 6:19 PM Nitin Motiani <nitinmotiani(at)google(dot)com> wrote:
> >
> > On Wed, Jul 10, 2024 at 11:22 PM Nitin Motiani <nitinmotiani(at)google(dot)com> wrote:
> > >
> > > On Wed, Jul 10, 2024 at 10:39 PM vignesh C <vignesh21(at)gmail(dot)com> wrote:
> > > >
> > > > On Wed, 10 Jul 2024 at 12:28, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> > > > > The patch missed to use the ShareRowExclusiveLock for partitions, see
> > > > > attached. I haven't tested it but they should also face the same
> > > > > problem. Apart from that, I have changed the comments in a few places
> > > > > in the patch.
> > > >
> > > > I could not hit the updated ShareRowExclusiveLock changes through the
> > > > partition table, instead I could verify it using the inheritance
> > > > table. Added a test for the same and also attaching the backbranch
> > > > patch.
> > > >
> > >
> > > Hi,
> > >
> > > I tested alternative-experimental-fix-lock.patch provided by Tomas
> > > (replaces SUE with SRE in OpenTableList). I believe there are a couple
> > > of scenarios the patch does not cover.
> > >
> > > 1. It doesn't handle the case of "ALTER PUBLICATION <pub> ADD TABLES
> > > IN SCHEMA <schema>".
> > >
> > > I took crash-test.sh provided by Tomas and modified it to add all
> > > tables in the schema to publication using the following command :
> > >
> > > ALTER PUBLICATION p ADD TABLES IN SCHEMA public
> > >
> > > The modified script is attached (crash-test-with-schema.sh). With this
> > > script, I can reproduce the issue even with the patch applied. This is
> > > because the code path to add a schema to the publication doesn't go
> > > through OpenTableList.
> > >
> > > I have also attached a script run-test-with-schema.sh to run
> > > crash-test-with-schema.sh in a loop with randomly generated parameters
> > > (modified from run.sh provided by Tomas).
> > >
> > > 2. The second issue is a deadlock which happens when the alter
> > > publication command is run for a comma separated list of tables.
> > >
> > > I created another script create-test-tables-order-reverse.sh. This
> > > script runs a command like the following :
> > >
> > > ALTER PUBLICATION p ADD TABLE test_2,test_1
> > >
> > > Running the above script, I was able to get a deadlock error (the
> > > output is attached in deadlock.txt). In the alter publication command,
> > > I added the tables in the reverse order to increase the probability of
> > > the deadlock. But it should happen with any order of tables.
> > >
> > > I am not sure if the deadlock is a major issue because detecting the
> > > deadlock is better than data loss.
> > >
>
> The deadlock reported in this case is an expected behavior. This is no
> different that locking tables or rows in reverse order.
>
> >
> > I looked further into the scenario of adding the tables in schema to
> > the publication. Since in that case, the entry is added to
> > pg_publication_namespace instead of pg_publication_rel, the codepaths
> > for 'add table' and 'add tables in schema' are different. And in the
> > 'add tables in schema' scenario, the OpenTableList function is not
> > called to get the relation ids. Therefore even with the proposed
> > patch, the data loss issue still persists in that case.
> >
> > To validate this idea, I tried locking all the affected tables in the
> > schema just before the invalidation for those relations (in
> > ShareRowExclusiveLock mode).
> >
>
> This sounds like a reasonable approach to fix the issue. However, we
> should check SET publication_object as well, especially the drop part
> in it. It should not happen that we miss sending the data for ADD but
> for DROP, we send data when we shouldn't have sent it.

There were few other scenarios, similar to the one you mentioned,
where the issue occurred. For example: a) When specifying a subset of
existing tables in the ALTER PUBLICATION ... SET TABLE command, the
tables that were supposed to be removed from the publication were not
locked in ShareRowExclusiveLock mode. b) The ALTER PUBLICATION ...
DROP TABLES IN SCHEMA command did not lock the relations that will be
removed from the publication in ShareRowExclusiveLock mode. Both of
these scenarios resulted in data inconsistency due to inadequate
locking. The attached patch addresses these issues.

Regards,
Vignesh

Attachment Content-Type Size
v5-0001-Fix-data-loss-during-initial-sync-in-logical-repl.patch text/x-patch 23.4 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Robert Haas 2024-07-15 18:19:08 Re: Parent/child context relation in pg_get_backend_memory_contexts()
Previous Message Robert Haas 2024-07-15 17:56:05 Re: Parent/child context relation in pg_get_backend_memory_contexts()