Re: Synchronizing slots from primary to standby

From: Dilip Kumar <dilipbalaut(at)gmail(dot)com>
To: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
Cc: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, shveta malik <shveta(dot)malik(at)gmail(dot)com>, Ajin Cherian <itsajin(at)gmail(dot)com>, "Zhijie Hou (Fujitsu)" <houzj(dot)fnst(at)fujitsu(dot)com>, Peter Smith <smithpb2250(at)gmail(dot)com>, Bertrand Drouvot <bertranddrouvot(dot)pg(at)gmail(dot)com>, Nisha Moond <nisha(dot)moond412(at)gmail(dot)com>, "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>, Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>, Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>, Bruce Momjian <bruce(at)momjian(dot)us>, Ashutosh Sharma <ashu(dot)coek88(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>
Subject: Re: Synchronizing slots from primary to standby
Date: 2024-02-06 09:53:06
Message-ID: CAFiTN-tnZ98oGF6iTL4Yzgi4bSD9oRPgM+MjnjGF5UKGvWk=Nw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Feb 6, 2024 at 1:09 PM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
>
> On Tue, Feb 6, 2024 at 3:19 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> >
> > On Mon, Feb 5, 2024 at 7:56 PM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
> > >
> > > ---
> > > Since Two processes (e.g. the slotsync worker and
> > > pg_sync_replication_slots()) concurrently fetch and update the slot
> > > information, there is a race condition where slot's
> > > confirmed_flush_lsn goes backward.
> > >
> >
> > Right, this is possible, though there shouldn't be a problem because
> > anyway, slotsync is an async process. Till we hold restart_lsn, the
> > required WAL won't be removed. Having said that, I can think of two
> > ways to avoid it: (a) We can have some flag in shared memory using
> > which we can detect whether any other process is doing slot
> > syncronization and then either error out at that time or simply wait
> > or may take nowait kind of parameter from user to decide what to do?
> > If this is feasible, we can simply error out for the first version and
> > extend it later if we see any use cases for the same (b) similar to
> > restart_lsn, if confirmed_flush_lsn is getting moved back, raise an
> > error, this is good for now but in future we may still have another
> > similar issue, so I would prefer (a) among these but I am fine if you
> > prefer (b) or have some other ideas like just note down in comments
> > that this is a harmless case and can happen only very rarely.
>
> Thank you for sharing the ideas. I would prefer (a). For (b), the same
> issue still happens for other fields.

I agree that (a) looks better. On a separate note, while looking at
this API pg_sync_replication_slots(PG_FUNCTION_ARGS) shouldn't there
be an optional parameter to give one slot or multiple slots or all
slots as default, that will give better control to the user no?

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Amit Kapila 2024-02-06 10:03:50 Re: Synchronizing slots from primary to standby
Previous Message Daniel Gustafsson 2024-02-06 09:40:27 Re: Memory leak fix in rmtree.c