Re: long-standing data loss bug in initial sync of logical replication

From: vignesh C <vignesh21(at)gmail(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: Shlok Kyal <shlok(dot)kyal(dot)oss(at)gmail(dot)com>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Nitin Motiani <nitinmotiani(at)google(dot)com>, Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: long-standing data loss bug in initial sync of logical replication
Date: 2024-08-20 12:19:25
Message-ID: CALDaNm3JwoEMUf_Zz+6WB9xuHotSa1_eStGpZ02XMQSTRv+p=g@mail.gmail.com
Lists: pgsql-hackers

On Tue, 20 Aug 2024 at 16:10, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
>
> On Thu, Aug 15, 2024 at 9:31 PM vignesh C <vignesh21(at)gmail(dot)com> wrote:
> >
> > On Thu, 8 Aug 2024 at 16:24, Shlok Kyal <shlok(dot)kyal(dot)oss(at)gmail(dot)com> wrote:
> > >
> > > On Wed, 31 Jul 2024 at 11:17, Shlok Kyal <shlok(dot)kyal(dot)oss(at)gmail(dot)com> wrote:
> > > >
> > >
> > > Created a patch for distributing invalidations.
> > > Here we collect the invalidation messages for the current transaction
> > > and distribute them to all the in-progress transactions whenever we
> > > distribute the snapshots. Thoughts?
> >
> > Since we are applying invalidations to all in-progress transactions,
> > the publisher will replicate only the part of the transaction that
> > precedes the invalidation, while the remainder will not be
> > replicated.
> > Ex:
> > Session1:
> > BEGIN;
> > INSERT INTO tab_conc VALUES (1);
> >
> > Session2:
> > ALTER PUBLICATION regress_pub1 DROP TABLE tab_conc;
> >
> > Session1:
> > INSERT INTO tab_conc VALUES (2);
> > INSERT INTO tab_conc VALUES (3);
> > COMMIT;
> >
> > After the above the subscriber data looks like:
> > postgres=# select * from tab_conc ;
> > a
> > ---
> > 1
> > (1 row)
> >
> > You can reproduce the issue using the attached test.
> > I'm not sure if this behavior is ok. At present, we’ve replicated the
> > first record within the same transaction, but the second and third
> > records are being skipped.
> >
>
> This can happen even without a concurrent DDL if some of the tables in
> the database are part of the publication and others are not. In such a
> case, inserts for published tables will be replicated but other
> inserts won't. Sending the partial data of the transaction isn't a
> problem to me. Do you have any other concerns that I am missing?

My main concern was that only part of a transaction's changes to a
table is sent while the rest is left out. However, since partial
transaction data can already be sent in other cases, I'm okay with it.
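
To make the two cases concrete, here is a minimal Python sketch of the
behavior discussed above. It is only a toy model, not PostgreSQL code:
ToyDecoder, tab_pub and tab_other are hypothetical names, and the model
simply assumes that publication membership is checked for each change
in order and that a publication change takes effect for in-progress
transactions immediately.

```python
from dataclasses import dataclass, field


@dataclass
class ToyDecoder:
    """Toy stand-in for publisher-side filtering; not PostgreSQL code."""
    published: set                              # tables currently in the publication
    sent: list = field(default_factory=list)    # (table, value) pairs sent downstream

    def insert(self, table, value):
        # Assumption: membership is checked per change, in order.
        if table in self.published:
            self.sent.append((table, value))

    def drop_table(self, table):
        # Models ALTER PUBLICATION ... DROP TABLE whose invalidation is
        # applied to in-progress transactions right away (assumption).
        self.published.discard(table)


# Case 1: concurrent DDL in the middle of one transaction.
d = ToyDecoder(published={"tab_conc"})
d.insert("tab_conc", 1)
d.drop_table("tab_conc")          # Session2's ALTER PUBLICATION
d.insert("tab_conc", 2)
d.insert("tab_conc", 3)
concurrent_ddl = [value for _, value in d.sent]

# Case 2: no DDL, but only one of the two tables is published.
m = ToyDecoder(published={"tab_pub"})
m.insert("tab_pub", 1)
m.insert("tab_other", 2)
m.insert("tab_pub", 3)
mixed = list(m.sent)

print(concurrent_ddl)   # values that reach the subscriber in case 1
print(mixed)            # changes that reach the subscriber in case 2
```

In both cases the subscriber receives only a subset of what the
transaction wrote, which is why the concurrent-DDL case is not a new
kind of behavior.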

Regards,
Vignesh
