Re: Initial Schema Sync for Logical Replication

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: "Kumar, Sachin" <ssetiya(at)amazon(dot)com>
Cc: "pgsql-hackers(at)lists(dot)postgresql(dot)org" <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Initial Schema Sync for Logical Replication
Date: 2023-03-24 04:00:03
Message-ID: CAA4eK1KWzAamnfB=YymAi3d_QuL=02dKXoG4QMdmYP-gpabR0g@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, Mar 23, 2023 at 9:24 PM Kumar, Sachin <ssetiya(at)amazon(dot)com> wrote:
>
> > From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
> > IIUC, this is possible only if tablesync process uses a snapshot different than the
> > snapshot we have used to perform the initial schema sync, otherwise, this
> > shouldn't be a problem. Let me try to explain my understanding with an example
> > (the LSNs used are just explain the
> > problem):
> >
> > 1. Create Table t1(c1, c2); --LSN: 90
> > 2. Insert t1 (1, 1); --LSN 100
> > 3. Insert t1 (2, 2); --LSN 110
> > 4. Alter t1 Add Column c3; --LSN 120
> > 5. Insert t1 (3, 3, 3); --LSN 130
> >
> > Now, say before starting tablesync worker, apply process performs initial
> > schema sync and uses a snapshot corresponding to LSN 100. Then it starts
> > tablesync process to allow the initial copy of data in t1.
> > Here, if the table sync process tries to establish a new snapshot, it may get data
> > till LSN 130 and when it will try to copy the same in subscriber it will fail. Is my
> > understanding correct about the problem you described?
> Right
> > If so, can't we allow
> > tablesync process to use the same exported snapshot as we used for the initial
> > schema sync and won't that solve the problem you described?
> I think we won't be able to use same snapshot because the transaction will be committed.
> In CreateSubscription() we can use the transaction snapshot from walrcv_create_slot()
> till walrcv_disconnect() is called.(I am not sure about this part maybe walrcv_disconnect() calls
> the commits internally ?).
> So somehow we need to keep this snapshot alive, even after transaction is committed(or delay committing
> the transaction , but we can have CREATE SUBSCRIPTION with ENABLED=FALSE, so we can have a restart before
> tableSync is able to use the same snapshot.)
>

Can we think of getting the table data as well along with schema via
pg_dump? Won't then both schema and initial data will correspond to
the same snapshot?

> > > Refresh publication :-
> > >
> > > In refresh publication, subscriber does create a new replication slot
> Typo-> subscriber does not
> > > hence , we can’t run
> > >
> > > pg_dump with a snapshot which starts from origin(maybe this is not an
> > > issue at all). In this case
> > >
> > > it makes more sense for tableSync worker to do schema sync.
> > >
> >
> > Can you please explain this problem with some examples?
> I think we can have same issues as you mentioned
> New table t1 is added to the publication , User does a refresh publication.
> pg_dump / pg_restore restores the table definition. But before tableSync
> can start, steps from 2 to 5 happen on the publisher.
> > 1. Create Table t1(c1, c2); --LSN: 90
> > 2. Insert t1 (1, 1); --LSN 100
> > 3. Insert t1 (2, 2); --LSN 110
> > 4. Alter t1 Add Column c3; --LSN 120
> > 5. Insert t1 (3, 3, 3); --LSN 130
> And table sync errors out
> There can be one more issue , since we took the pg_dump without snapshot (wrt to replication slot).
>

To avoid both the problems mentioned for Refresh Publication, we can
do one of the following: (a) create a new slot along with a snapshot
for this operation and drop it afterward; or (b) using the existing
slot, establish a new snapshot using a technique proposed in email
[1].

Note - Please keep one empty line before and after your inline
responses, otherwise, it is slightly difficult to understand your
response.

[1] - https://www.postgresql.org/message-id/CAGPVpCRWEVhXa7ovrhuSQofx4to7o22oU9iKtrOgAOtz_%3DY6vg%40mail.gmail.com

--
With Regards,
Amit Kapila.

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Etsuro Fujita 2023-03-24 04:05:38 Re: Doc: Improve note about copying into postgres_fdw foreign tables in batch
Previous Message Greg Stark 2023-03-24 03:52:31 Re: add log messages when replication slots become active and inactive (was Re: Is it worth adding ReplicationSlot active_pid to ReplicationSlotPersistentData?)