Re: Logical Replication of sequences

From: vignesh C <vignesh21(at)gmail(dot)com>
To: shveta malik <shveta(dot)malik(at)gmail(dot)com>
Cc: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Peter Smith <smithpb2250(at)gmail(dot)com>, Shlok Kyal <shlok(dot)kyal(dot)oss(at)gmail(dot)com>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Peter Eisentraut <peter(at)eisentraut(dot)org>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>, Euler Taveira <euler(at)eulerto(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>, "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>, Hou, Zhijie/侯 志杰 <houzj(dot)fnst(at)fujitsu(dot)com>, "Jonathan S(dot) Katz" <jkatz(at)postgresql(dot)org>
Subject: Re: Logical Replication of sequences
Date: 2024-08-06 07:14:17
Message-ID: CALDaNm36xdZ0_Q2LqUVdJTRnm2YFNQ6ettzD3p0d54B6_VHLTQ@mail.gmail.com
Lists: pgsql-hackers

On Tue, 6 Aug 2024 at 10:24, shveta malik <shveta(dot)malik(at)gmail(dot)com> wrote:
>
> On Tue, Aug 6, 2024 at 9:54 AM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> >
> > On Tue, Aug 6, 2024 at 8:49 AM shveta malik <shveta(dot)malik(at)gmail(dot)com> wrote:
> > >
> > > > > > I was reviewing the design of patch003, and I have a query. Do we need
> > > > > > to even start an apply worker and create replication slot when
> > > > > > subscription created is for 'sequences only'? IIUC, currently logical
> > > > > > replication apply worker is the one launching sequence-sync worker
> > > > > > whenever needed. I think it should be the launcher doing this job and
> > > > > > thus apply worker may even not be needed for current functionality of
> > > > > > sequence sync?
> > > > >
> > > >
> > > > But that would lead to maintaining all sequence-sync of each
> > > > subscription by launcher. Say there are 100 sequences per subscription
> > > > and some of them from each subscription are failing due to some
> > > > reasons then the launcher will be responsible for ensuring all the
> > > > sequences are synced. I think it would be better to handle
> > > > per-subscription work by the apply worker.
> > >
> > > I thought we can give that task to sequence-sync worker. Once sequence
> > > sync worker is started by launcher, it keeps on syncing until all the
> > > sequences are synced (even failed ones) and then exits only after all
> > > are synced; instead of apply worker starting it multiple times for
> > > failed sequences. Launcher to start sequence sync worker when signaled
> > > by 'alter-sub refresh seq'.
> > > But after going through details given by Vignesh in [1], I also see
> > > the benefits of using apply worker for this task. Since apply worker
> > > is already looping and doing that for table-sync, we can reuse the
> > > same code for sequence sync and maintenance will be easy. So looks
> > > okay if we go with existing apply worker design.
> > >
> >
> > Fair enough. However, I was wondering whether apply_worker should exit
> > after syncing all sequences for a sequence-only subscription
>
> If apply worker exits, then on next sequence-refresh, we need a way to
> wake-up launcher to start apply worker which then will start
> table-sync worker. Instead, won't it be better if the launcher starts
> table-sync worker directly without the need of apply worker being
> present (which I stated earlier).

I favour the current design because it keeps the system extensible for
future incremental sequence synchronization. If the launcher were
responsible for starting the sequence sync worker, it would take on
extra load that could hinder its ability to service other
subscriptions, and it would complicate the design for supporting
incremental sync of sequences. The current design also offers the
other benefits mentioned in [1].

> > or should
> > it be there for future commands that can refresh the subscription and
> > add additional tables or sequences?
>
> If we stick with apply worker starting table sync worker when needed
> by continuously checking seq-sync states ('i'/'r'), then IMO, it is
> better that apply-worker stays. But if we want apply-worker to exit
> and start only when needed, then why not to start sequence-sync worker
> directly for seq-only subscriptions?

Sequence synchronization can fail, for example if the sequence value
from the publisher falls outside the sequence's defined
minvalue/maxvalue range. The apply worker must stay active so that it
can decide whether to start the sequence sync worker again once
wal_retrieve_retry_interval has elapsed. Publications consisting
solely of sequences are likely to be uncommon. However, a user who
wishes to use such a publication can disable the subscription if
necessary and re-enable it when a sequence refresh is needed.
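
To make the retry behaviour above concrete, here is a minimal Python
sketch. It is not code from the patch: SubscriberSequence,
sync_sequence and should_start_sync_worker are hypothetical names, the
'i'/'r' states follow the init/ready states mentioned earlier in the
thread, and timestamps are plain millisecond integers chosen for the
illustration.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class SubscriberSequence:
    """Hypothetical stand-in for a sequence tracked by a subscription."""
    name: str
    minvalue: int
    maxvalue: int
    last_value: Optional[int] = None
    state: str = "i"  # 'i' = init (still needs sync), 'r' = ready (synced)


def sync_sequence(seq: SubscriberSequence, publisher_value: int) -> bool:
    """Try to apply the publisher's value; fail if it is out of range."""
    if not (seq.minvalue <= publisher_value <= seq.maxvalue):
        # Sync fails and the sequence stays in 'i', so it is retried later.
        return False
    seq.last_value = publisher_value
    seq.state = "r"
    return True


def should_start_sync_worker(
    sequences: Iterable[SubscriberSequence],
    last_start_ms: Optional[int],
    now_ms: int,
    retry_interval_ms: int,
) -> bool:
    """Decision an apply worker could make on each loop iteration."""
    if not any(s.state == "i" for s in sequences):
        return False  # nothing left to sync
    if last_start_ms is None:
        return True  # no sync worker has been started yet
    # Wait at least the retry interval before starting the worker again.
    return now_ms - last_start_ms >= retry_interval_ms
```

In this sketch a failed sync leaves the sequence in state 'i', so
something has to keep checking and relaunch the sync worker after the
interval. That is the role the apply worker plays in the current
design.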

[1] - https://www.postgresql.org/message-id/CALDaNm1KO8f3Fj%2BRHHXM%3DUSGwOcW242M1jHee%3DX_chn2ToiCpw%40mail.gmail.com

Regards,
Vignesh
