Re: long-standing data loss bug in initial sync of logical replication

From: Shlok Kyal <shlok(dot)kyal(dot)oss(at)gmail(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, vignesh C <vignesh21(at)gmail(dot)com>, Nitin Motiani <nitinmotiani(at)google(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: long-standing data loss bug in initial sync of logical replication
Date: 2024-09-30 04:58:17
Message-ID: CANhcyEVCnY=-Yc8LABADYGK=sJyT7C89ZMWdhTMxsh4vW-39HQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, 26 Sept 2024 at 11:39, Shlok Kyal <shlok(dot)kyal(dot)oss(at)gmail(dot)com> wrote:
>
> > In the v7 patch, I am looping through the reorder buffer of the
> > current committed transaction and storing all invalidation messages in
> > a list. Then I am distributing those invalidations.
> > But I found that for a transaction we already store all the
> > invalidation messages (see [1]). So we don't need to loop through the
> > reorder buffer and store the invalidations.
> >
> > I have modified the patch accordingly and attached the same.
> >
> > [1]: https://github.com/postgres/postgres/blob/7da1bdc2c2f17038f2ae1900be90a0d7b5e361e0/src/include/replication/reorderbuffer.h#L384
>
> Hi,
>
> I tried to add changes to selectively invalidate the cache to reduce
> the performance degradation during the distribution of invalidations.
>
> Here is the analysis for selective invalidation.
> Observation:
> Currently when there is a change in a publication, cache related to
> all the tables is invalidated including the ones that are not part of
> any publication and even tables of different publications. For
> example, suppose pub1 includes tables t1 to t1000, while pub2 contains
> just table t1001. If pub2 is altered, even though it only has t1001,
> this change will also invalidate all the tables t1 through t1000 in
> pub1.
> Similarly for a namespace, whenever we alter a schema or we add/drop a
> schema to the publication, cache related to all the tables is
> invalidated including the ones that are on of different schema. For
> example, suppose pub1 includes tables t1 to t1000 in schema sc1, while
> pub2 contains just table t1001 in schema sc2. If schema ‘sc2’ is
> changed or if it is dropped from publication ‘pub2’ even though it
> only has t1001, this change will invalidate all the tables t1 through
> t1000 in schema sc1.
> ‘rel_sync_cache_publication_cb’ function is called during the
> execution of invalidation in both above cases. And
> ‘rel_sync_cache_publication_cb’ invalidates all the tables in the
> cache.
>
> Solution:
> 1. When we alter a publication using commands like ‘ALTER PUBLICATION
> pub_name DROP TABLE table_name’, first all tables in the publications
> are invalidated using the function ‘rel_sync_cache_relation_cb’. Then
> again ‘rel_sync_cache_publication_cb’ function is called which
> invalidates all the tables. This happens because of the following
> callback registered:
> CacheRegisterSyscacheCallback(PUBLICATIONRELMAP,
> rel_sync_cache_publication_cb, (Datum) 0);
>
> So, I feel this second function call can be avoided. And I have
> included changes for the same in the patch. Now the behavior will be
> as:
> suppose pub1 includes tables t1 to t1000, while pub2 contains just
> table t1001. If pub2 is altered, it will only invalidate t1001.
>
> 2. When we add/drop a schema to/from a publication using command like
> ‘ALTER PUBLICATION pub_name ADD TABLES in SCHEMA schema_name’, first
> all tables in that schema are invalidated using
> ‘rel_sync_cache_relation_cb’ and then again
> ‘rel_sync_cache_publication_cb’ function is called which invalidates
> all the tables. This happens because of the following callback
> registered:
> CacheRegisterSyscacheCallback(PUBLICATIONNAMESPACEMAP,
> rel_sync_cache_publication_cb, (Datum) 0);
>
> So, I feel this second function call can be avoided. And I have
> included changes for the same in the patch. Now the behavior will be
> as:
> suppose pub1 includes tables t1 to t1000 in schema sc1, while pub2
> contains just table t1001 in schema sc2. If schema ‘sc2’ dropped from
> publication ‘pub2’, it will only invalidate table t1001.
>
> 3. When we alter a namespace using command like ‘ALTER SCHEMA
> schema_name RENAME to new_schema_name’ all the table in cache are
> invalidated as ‘rel_sync_cache_publication_cb’ is called due to the
> following registered callback:
> CacheRegisterSyscacheCallback(NAMESPACEOID,
> rel_sync_cache_publication_cb, (Datum) 0);
>
> So, we added a new callback function ‘rel_sync_cache_namespacerel_cb’
> will be called instead of function ‘rel_sync_cache_publication_cb’ ,
> which invalidates only the cache of the tables which are part of that
> particular namespace. For the new function the ‘namespace id’ is added
> in the Invalidation message.
>
> For example, if namespace ‘sc1’ has table t1 and t2 and a namespace
> ‘sc2’ has table t3. Then if we rename namespace ‘sc1’ to ‘sc_new’.
> Only tables in sc1 i.e. tables t1 and table t2 are invalidated.
>
>
> Performance Comparison:
> I have run the same tests as shared in [1] and observed a significant
> decrease in the degradation with the new changes. With selective
> invalidation degradation is around ~5%. This results are an average of
> 3 runs.
>
> count | Head (sec) | Fix (sec) | Degradation (%)
> -----------------------------------------------------------------------------------------
> 10000 | 0.38842567 | 0.405057 | 4.281727827
> 50000 | 7.22018834 | 7.605011334 | 5.329819333
> 75000 | 15.627181 | 16.38659034 | 4.859541462
> 100000 | 27.37910867 | 28.8636873 | 5.422304458
>
> I have attached the patch for the same
> v9-0001 : distribute invalidation to inprogress transaction
> v9-0002: Selective invalidation
>
> [1]:https://www.postgresql.org/message-id/CANhcyEW4pq6%2BPO_eFn2q%3D23sgV1budN3y4SxpYBaKMJNADSDuA%40mail.gmail.com
>

I have also prepared a bar chart for performance comparison between
HEAD, 0001 patch and (0001+0002) patch and attached here.

Thanks and Regards,
Shlok Kyal

Attachment Content-Type Size
image/png 29.1 KB

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Junwang Zhao 2024-09-30 05:01:25 Re: general purpose array_sort
Previous Message shveta malik 2024-09-30 04:26:58 Re: Conflict Detection and Resolution