Re: long-standing data loss bug in initial sync of logical replication

From: Nitin Motiani <nitinmotiani(at)google(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: vignesh C <vignesh21(at)gmail(dot)com>, Shlok Kyal <shlok(dot)kyal(dot)oss(at)gmail(dot)com>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: long-standing data loss bug in initial sync of logical replication
Date: 2024-09-02 15:49:41
Message-ID: CAH5HC96Uxx+1Mgr2Q4Qwt9QBU3D5Yp_izhVtyLGrvCAcmfXy-g@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Aug 20, 2024 at 4:10 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
>
> On Thu, Aug 15, 2024 at 9:31 PM vignesh C <vignesh21(at)gmail(dot)com> wrote:
> > Since we are applying invalidations to all in-progress transactions,
> > the publisher will only replicate half of the transaction data up to
> > the point of invalidation, while the remaining half will not be
> > replicated.
> > Ex:
> > Session1:
> > BEGIN;
> > INSERT INTO tab_conc VALUES (1);
> >
> > Session2:
> > ALTER PUBLICATION regress_pub1 DROP TABLE tab_conc;
> >
> > Session1:
> > INSERT INTO tab_conc VALUES (2);
> > INSERT INTO tab_conc VALUES (3);
> > COMMIT;
> >
> > After the above the subscriber data looks like:
> > postgres=# select * from tab_conc ;
> > a
> > ---
> > 1
> > (1 row)
> >
> > You can reproduce the issue using the attached test.
> > I'm not sure if this behavior is ok. At present, we’ve replicated the
> > first record within the same transaction, but the second and third
> > records are being skipped.
> >
>
> This can happen even without a concurrent DDL if some of the tables in
> the database are part of the publication and others are not. In such a
> case inserts for publicized tables will be replicated but other
> inserts won't. Sending the partial data of the transaction isn't a
> problem to me. Do you have any other concerns that I am missing?
>

Hi,

I think that the partial data replication for one table is a bigger
issue than the case of data being sent for a subset of the tables in
the transaction. This can lead to inconsistent data if the same row is
updated multiple times or deleted in the same transaction. In such a
case if only the partial updates from the transaction are sent to the
subscriber, it might end up with the data which was never visible on
the publisher side.

Here is an example I tried with the patch v8-001 :

I created following 2 tables on the publisher and the subscriber :

CREATE TABLE delete_test(id int primary key, name varchar(100));
CREATE TABLE update_test(id int primary key, name varchar(100));

I added both the tables to the publication p on the publisher and
created a subscription s on the subscriber.

I run 2 sessions on the publisher and do the following :

Session 1 :
BEGIN;
INSERT INTO delete_test VALUES(0, 'Nitin');

Session 2 :
ALTER PUBLICATION p DROP TABLE delete_test;

Session 1 :
DELETE FROM delete_test WHERE id=0;
COMMIT;

After the commit there should be no new row created on the publisher.
But because the partial data was replicated, this is what the select
on the subscriber shows :

SELECT * FROM delete_test;
id | name
----+-----------
0 | Nitin
(1 row)

I don't think the above is a common use case. But this is still an
issue because the subscriber has the data which never existed on the
publisher.

Similar issue can be seen with an update command.

Session 1 :
BEGIN;
INSERT INTO update_test VALUES(1, 'Chiranjiv');

Session 2 :
ALTER PUBLICATION p DROP TABLE update_test;

Session 1:
UPDATE update_test SET name='Eeshan' where id=1;
COMMIT;

After the commit, this is the state on the publisher :
SELECT * FROM update_test;
1 | Eeshan
(1 row)

While this is the state on the subscriber :
SELECT * FROM update_test;
1 | Chiranjiv
(1 row)

I think the update during a transaction scenario might be more common
than deletion right after insertion. But both of these seem like real
issues to consider. Please let me know if I'm missing something.

Thanks & Regards
Nitin Motiani
Google

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tomas Vondra 2024-09-02 17:46:21 Re: scalability bottlenecks with (many) partitions (and more)
Previous Message Nazir Bilal Yavuz 2024-09-02 15:02:27 Re: PG_TEST_EXTRA and meson