Re: speed up a logical replica setup

From: Noah Misch <noah(at)leadboat(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: Peter Eisentraut <peter(at)eisentraut(dot)org>, Euler Taveira <euler(at)eulerto(dot)com>, Shlok Kyal <shlok(dot)kyal(dot)oss(at)gmail(dot)com>, "kuroda(dot)hayato(at)fujitsu(dot)com" <kuroda(dot)hayato(at)fujitsu(dot)com>, Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>, "pgsql-hackers(at)lists(dot)postgresql(dot)org" <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Michael Paquier <michael(at)paquier(dot)xyz>, Andres Freund <andres(at)anarazel(dot)de>, Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>, Fabrízio de Royes Mello <fabriziomello(at)gmail(dot)com>, vignesh C <vignesh21(at)gmail(dot)com>
Subject: Re: speed up a logical replica setup
Date: 2024-06-24 22:08:45
Message-ID: 20240624220845.ba.nmisch@google.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, Jun 24, 2024 at 05:20:21PM +0530, Amit Kapila wrote:
> On Sun, Jun 23, 2024 at 11:52 AM Noah Misch <noah(at)leadboat(dot)com> wrote:
> >
> > > +static void
> > > +create_publication(PGconn *conn, struct LogicalRepInfo *dbinfo)
> > > +{
> >
> > > + appendPQExpBuffer(str, "CREATE PUBLICATION %s FOR ALL TABLES",
> > > + ipubname_esc);
> >
> > This tool's documentation says it "guarantees that no transaction will be
> > lost." I tried to determine whether achieving that will require something
> > like the fix from
> > https://postgr.es/m/flat/de52b282-1166-1180-45a2-8d8917ca74c6(at)enterprisedb(dot)com(dot)
> > (Not exactly the fix from that thread, since that thread has not discussed the
> > FOR ALL TABLES version of its race condition.) I don't know. On the one
> > hand, pg_createsubscriber benefits from creating a logical slot after creating
> > the publication. That snapbuild.c process will wait for running XIDs. On the
> > other hand, an INSERT/UPDATE/DELETE acquires its RowExclusiveLock and builds
> > its relcache entry before assigning an XID, so perhaps the snapbuild.c process

Correction: it doesn't matter how the original INSERT/UPDATE/DELETE builds its
relcache entry, just how pgoutput of the change builds the relcache entry from
the historic snapshot.

> > isn't enough to prevent that thread's race condition. What do you think?
>
> I am not able to imagine how the race condition discussed in the
> thread you quoted can impact this patch. The problem discussed is
> mainly the interaction when we are processing the changes in logical
> decoding w.r.t concurrent DDL (Alter Publication ... Add Table). The
> problem happens because we use the old cache state.

Right. Taking the example from
http://postgr.es/m/20231119021830.d6t6aaxtrkpn743y@awork3.anarazel.de, LSNs
between what that mail calls 4) and 5) are not safely usable as start points.
pg_createsubscriber evades that thread's problem if the consistent_lsn it
passes to pg_replication_origin_advance() can't be in a bad-start-point LSN
span. I cautiously bet the snapbuild.c process achieves that:

> I am missing your
> point about the race condition mentioned in the thread you quoted with
> snapbuild.c. Can you please elaborate a bit more?

When pg_createsubscriber calls pg_create_logical_replication_slot(), the key
part starts at:

/*
* If caller needs us to determine the decoding start point, do so now.
* This might take a while.
*/
if (find_startpoint)
DecodingContextFindStartpoint(ctx);

Two factors protect pg_createsubscriber. First, (a) CREATE PUBLICATION
committed before pg_create_logical_replication_slot() started. Second, (b)
DecodingContextFindStartpoint() waits for running XIDs to complete, via the
process described at the snapbuild.c "starting up in several stages" diagram.
Hence, the consistent_lsn is not in a bad-start-point LSN span. It's fine
even if the original INSERT populated all caches before CREATE PUBLICATION
started and managed to assign an XID only after consistent_lsn. From the
pgoutput perspective, that's indistinguishable from the transaction starting
at its first WAL record, after consistent_lsn. The linked "long-standing data
loss bug in initial sync of logical replication" thread doesn't have (a),
hence its bug. How close is that to accurate?

Thanks,
nm

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Joel Jacobson 2024-06-24 22:20:20 Re: [PATCH] Add ACL (Access Control List) acronym
Previous Message Stan Hu 2024-06-24 21:58:47 [PATCH] Fix type redefinition build errors with macOS SDK 15.0