Re: Logical Replication of sequences

From: shveta malik <shveta(dot)malik(at)gmail(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: vignesh C <vignesh21(at)gmail(dot)com>, Peter Smith <smithpb2250(at)gmail(dot)com>, Shlok Kyal <shlok(dot)kyal(dot)oss(at)gmail(dot)com>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Peter Eisentraut <peter(at)eisentraut(dot)org>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>, Euler Taveira <euler(at)eulerto(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>, "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>, Hou, Zhijie/侯 志杰 <houzj(dot)fnst(at)fujitsu(dot)com>, "Jonathan S(dot) Katz" <jkatz(at)postgresql(dot)org>, shveta malik <shveta(dot)malik(at)gmail(dot)com>
Subject: Re: Logical Replication of sequences
Date: 2024-08-06 03:19:23
Message-ID: CAJpy0uC-PDYf8nvng3LPi2DJYFURZVDg_21WmESc22QrqF6jNw@mail.gmail.com
Lists: pgsql-hackers

On Mon, Aug 5, 2024 at 5:28 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
>
> On Mon, Aug 5, 2024 at 2:36 PM vignesh C <vignesh21(at)gmail(dot)com> wrote:
> >
> > On Fri, 2 Aug 2024 at 14:24, shveta malik <shveta(dot)malik(at)gmail(dot)com> wrote:
> > >
> > > On Thu, Aug 1, 2024 at 9:26 AM shveta malik <shveta(dot)malik(at)gmail(dot)com> wrote:
> > > >
> > > > On Mon, Jul 29, 2024 at 4:17 PM vignesh C <vignesh21(at)gmail(dot)com> wrote:
> > > > >
> > > > > Thanks for reporting this, these issues are fixed in the attached
> > > > > v20240730_2 version patch.
> > > > >
> > >
> > > I was reviewing the design of patch003, and I have a query. Do we need
> > > to even start an apply worker and create replication slot when
> > > subscription created is for 'sequences only'? IIUC, currently logical
> > > replication apply worker is the one launching sequence-sync worker
> > > whenever needed. I think it should be the launcher doing this job and
> > > thus apply worker may even not be needed for current functionality of
> > > sequence sync?
> >
>
> But that would lead to maintaining all sequence-sync of each
> subscription by launcher. Say there are 100 sequences per subscription
> and some of them from each subscription are failing due to some
> reasons then the launcher will be responsible for ensuring all the
> sequences are synced. I think it would be better to handle
> per-subscription work by the apply worker.

I thought we could give that task to the sequence-sync worker. Once the
sequence-sync worker is started by the launcher, it keeps syncing until
all the sequences are synced (including the failed ones) and exits only
then, instead of the apply worker starting it multiple times for
failed sequences. The launcher would start the sequence-sync worker
when signaled by 'alter-sub refresh seq'.
But after going through the details given by Vignesh in [1], I also see
the benefits of using the apply worker for this task. Since the apply
worker is already looping and doing this for table-sync, we can reuse
the same code for sequence sync, and maintenance will be easier. So it
looks okay to go with the existing apply worker design.
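
To make the comparison concrete, here is a toy Python model of the two
options discussed above. It is only a sketch and not PostgreSQL code:
every function name in it is hypothetical, and it assumes that a
sequence-sync worker started by the apply worker does one pass and
exits, while a launcher-started worker would stay alive and retry.

```python
# Toy model only: all names here are hypothetical and do not correspond
# to functions in the patch or in PostgreSQL.

def sync_pass(pending, try_sync):
    """Do one pass over the pending sequences; return those that failed."""
    return [seq for seq in pending if not try_sync(seq)]


def launcher_started_worker(sequences, try_sync, max_passes=10):
    """Alternative design: one worker, started once, retries until done."""
    pending = list(sequences)
    passes = 0
    while pending and passes < max_passes:
        pending = sync_pass(pending, try_sync)
        passes += 1
    return {"worker_starts": 1, "passes": passes, "pending": pending}


def apply_worker_loop(sequences, try_sync, max_iterations=10):
    """Existing design: the apply worker's loop starts a short-lived
    sequence-sync worker whenever unsynced sequences remain."""
    pending = list(sequences)
    worker_starts = 0
    for _ in range(max_iterations):
        if not pending:
            break
        worker_starts += 1                      # apply worker starts a worker
        pending = sync_pass(pending, try_sync)  # worker does one pass, exits
    return {"worker_starts": worker_starts,
            "passes": worker_starts,
            "pending": pending}


def make_flaky_sync(fail_once):
    """Build a try_sync callable in which each listed sequence fails once."""
    remaining = set(fail_once)

    def try_sync(seq):
        if seq in remaining:
            remaining.discard(seq)
            return False
        return True

    return try_sync
```

In this model both options end with every sequence synced; the only
difference is which process owns the retry loop and how many times a
worker has to be started.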

[1]: https://www.postgresql.org/message-id/CALDaNm1KO8f3Fj%2BRHHXM%3DUSGwOcW242M1jHee%3DX_chn2ToiCpw%40mail.gmail.com

>
> >
> > > Going forward when we implement incremental sync of
> > > sequences, then we may have apply worker started but now it is not
> > > needed.
> >
> > I believe the current method of having the apply worker initiate the
> > sequence sync worker is advantageous for several reasons:
> > a) Reduces Launcher Load: This approach prevents overloading the
> > launcher, which must handle various other subscription requests.
> > b) Facilitates Incremental Sync: It provides a more straightforward
> > path to extend support for incremental sequence synchronization.
> > c) Reuses Existing Code: It leverages the existing tablesync worker
> > code for starting the tablesync process, avoiding the need to
> > duplicate code in the launcher.
> > d) Simplified Code Maintenance: Centralizing sequence synchronization
> > logic within the apply worker can simplify code maintenance and
> > updates, as changes will only need to be made in one place rather than
> > across multiple components.
> > e) Better Monitoring and Debugging: With sequence synchronization
> > being handled by the apply worker, you can more effectively monitor
> > and debug synchronization processes since all related operations are
> > managed by a single component.
> >
> > Also, I noticed that even when a publication has no tables, we create
> > replication slot and start apply worker.
> >
>
> As far as I understand slots and origins are primarily required for
> incremental sync. Would it be used only for sequence-sync cases? If
> not then we can avoid creating those. I agree that it would add some
> complexity to the code with sequence-specific checks, so we can create
> a top-up patch for this if required and evaluate its complexity versus
> the benefit it produces.
>
> --
> With Regards,
> Amit Kapila.
