From: | Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> |
---|---|
To: | "Kumar, Sachin" <ssetiya(at)amazon(dot)com> |
Cc: | Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, "pgsql-hackers(at)lists(dot)postgresql(dot)org" <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
Subject: | Re: Initial Schema Sync for Logical Replication |
Date: | 2023-03-27 02:47:01 |
Message-ID: | CAD21AoAzibaj_vyogYk5c9wCkP5T1n8s-iqPMbrTdxQ3JZ54Rg@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Fri, Mar 24, 2023 at 11:51 PM Kumar, Sachin <ssetiya(at)amazon(dot)com> wrote:
>
> > From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
> > > I think we won't be able to use same snapshot because the transaction will
> > > be committed.
> > > In CreateSubscription() we can use the transaction snapshot from
> > > walrcv_create_slot() till walrcv_disconnect() is called.(I am not sure
> > > about this part maybe walrcv_disconnect() calls the commits internally ?).
> > > So somehow we need to keep this snapshot alive, even after transaction
> > > is committed(or delay committing the transaction , but we can have
> > > CREATE SUBSCRIPTION with ENABLED=FALSE, so we can have a restart
> > > before tableSync is able to use the same snapshot.)
> > >
> >
> > Can we think of getting the table data as well along with schema via
> > pg_dump? Won't then both schema and initial data will correspond to the
> > same snapshot?
>
> Right , that will work, Thanks!
While it works, we cannot get the initial data in parallel, no?
>
> > > I think we can have same issues as you mentioned New table t1 is added
> > > to the publication , User does a refresh publication.
> > > pg_dump / pg_restore restores the table definition. But before
> > > tableSync can start, steps from 2 to 5 happen on the publisher.
> > > > 1. Create Table t1(c1, c2); --LSN: 90 2. Insert t1 (1, 1); --LSN 100
> > > > 3. Insert t1 (2, 2); --LSN 110 4. Alter t1 Add Column c3; --LSN 120
> > > > 5. Insert t1 (3, 3, 3); --LSN 130
> > > And table sync errors out
> > > There can be one more issue , since we took the pg_dump without
> > snapshot (wrt to replication slot).
> > >
> >
> > To avoid both the problems mentioned for Refresh Publication, we can do
> > one of the following: (a) create a new slot along with a snapshot for this
> > operation and drop it afterward; or (b) using the existing slot, establish a
> > new snapshot using a technique proposed in email [1].
> >
>
> Thanks, I think option (b) will be perfect, since we don’t have to create a new slot.
Regarding (b), does it mean that apply worker stops streaming,
requests to create a snapshot, and then resumes the streaming?
Regards,
--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com
From | Date | Subject | |
---|---|---|---|
Next Message | Richard Guo | 2023-03-27 02:57:59 | Re: About the constant-TRUE clause in reconsider_outer_join_clauses |
Previous Message | Peter Geoghegan | 2023-03-27 02:34:49 | Re: Amcheck verification of GiST and GIN |