| From: | "Zhijie Hou (Fujitsu)" <houzj(dot)fnst(at)fujitsu(dot)com> | 
|---|---|
| To: | shveta malik <shveta(dot)malik(at)gmail(dot)com>, Bertrand Drouvot <bertranddrouvot(dot)pg(at)gmail(dot)com> | 
| Cc: | Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org> | 
| Subject: | RE: promotion related handling in pg_sync_replication_slots() | 
| Date: | 2024-04-22 00:31:19 | 
| Message-ID: | OS0PR01MB5716A763FBFFED894ED8CEED94122@OS0PR01MB5716.jpnprd01.prod.outlook.com | 
| Views: | Whole Thread | Raw Message | Download mbox | Resend email | 
| Thread: | |
| Lists: | pgsql-hackers | 
On Friday, April 19, 2024 4:22 PM shveta malik <shveta(dot)malik(at)gmail(dot)com> wrote:
> On Fri, Apr 19, 2024 at 11:37 AM shveta malik <shveta(dot)malik(at)gmail(dot)com> wrote:
> >
> > On Fri, Apr 19, 2024 at 10:53 AM Bertrand Drouvot
> > <bertranddrouvot(dot)pg(at)gmail(dot)com> wrote:
> > >
> > > Hi,
> > >
> > > On Thu, Apr 18, 2024 at 05:36:05PM +0530, shveta malik wrote:
> > > > Please find v8 attached. Changes are:
> > >
> > > Thanks!
> > >
> > > A few comments:
> >
> > Thanks for reviewing.
> >
> > > 1 ===
> > >
> > > @@ -1440,7 +1461,7 @@ ReplSlotSyncWorkerMain(char *startup_data,
> size_t startup_data_len)
> > >          * slotsync_worker_onexit() but that will need the connection to be
> made
> > >          * global and we want to avoid introducing global for this purpose.
> > >          */
> > > -       before_shmem_exit(slotsync_failure_callback,
> PointerGetDatum(wrconn));
> > > +       before_shmem_exit(slotsync_worker_disconnect,
> > > + PointerGetDatum(wrconn));
> > >
> > > The comment above this change still states "Register the failure
> > > callback once we have the connection", I think it has to be reworded
> > > a bit now that v8 is making use of slotsync_worker_disconnect().
> > >
> > > 2 ===
> > >
> > > +        * Register slotsync_worker_onexit() before we register
> > > +        * ReplicationSlotShmemExit() in BaseInit(), to ensure that during
> exit of
> > > +        * slot sync worker, ReplicationSlotShmemExit() is called first,
> followed
> > > +        * by slotsync_worker_onexit(). Startup process during
> > > + promotion waits for
> > >
> > > Worth to mention in shmem_exit() (where it "while
> (--before_shmem_exit_index >= 0)"
> > > or before the shmem_exit() definition) that ReplSlotSyncWorkerMain()
> > > relies on this LIFO behavior? (not sure if there is other "strong"
> > > LIFO requirement in other part of the code).
> >
> > I see other modules as well relying on LIFO behavior.
> > Please see applyparallelworker.c where
> > 'before_shmem_exit(pa_shutdown)' is needed to be done after
> > 'before_shmem_exit(logicalrep_worker_onexit)' (commit id 3d144c6).
> > Also in postinit.c, I see such comments atop
> > 'before_shmem_exit(ShutdownPostgres, 0)'.
> > I feel we can skip adding this specific comment about
> > ReplSlotSyncWorkerMain() in ipc.c, as none of the other modules has
> > also not added any. I will address the rest of your comments in the
> > next version.
> >
> > > 3 ===
> > >
> > > +        * Startup process during promotion waits for slot sync to finish
> and it
> > > +        * does that by checking the 'syncing' flag.
> > >
> > > worth to mention ShutDownSlotSync()?
> > >
> > > 4 ===
> > >
> > > I did a few tests manually (launching ShutDownSlotSync() through gdb
> > > / with and without sync worker and with / without
> > > pg_sync_replication_slots() running
> > > concurrently) and it looks like it works as designed.
> >
> > Thanks for testing it.
> >
> > > Having said that, the logic that is in place to take care of the
> > > corner cases described up-thread seems reasonable to me.
> 
> Please find v9 with the above comments addressed.
Thanks, the patch looks good to me. I also tested a few concurrent
promotion/function execution cases and didn't find issues.
Best Regards,
Hou zj
| From | Date | Subject | |
|---|---|---|---|
| Next Message | David Steele | 2024-04-22 00:47:10 | Re: pg_combinebackup does not detect missing files | 
| Previous Message | Zhijie Hou (Fujitsu) | 2024-04-22 00:27:42 | RE: Disallow changing slot's failover option in transaction block |