Re: promotion related handling in pg_sync_replication_slots()

From: Bertrand Drouvot <bertranddrouvot(dot)pg(at)gmail(dot)com>
To: shveta malik <shveta(dot)malik(at)gmail(dot)com>
Cc: "Zhijie Hou (Fujitsu)" <houzj(dot)fnst(at)fujitsu(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: promotion related handling in pg_sync_replication_slots()
Date: 2024-04-19 05:23:08
Message-ID: ZiH/vE46X27IxGfJ@ip-10-97-1-34.eu-west-3.compute.internal
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

On Thu, Apr 18, 2024 at 05:36:05PM +0530, shveta malik wrote:
> Please find v8 attached. Changes are:

Thanks!

A few comments:

1 ===

@@ -1440,7 +1461,7 @@ ReplSlotSyncWorkerMain(char *startup_data, size_t startup_data_len)
* slotsync_worker_onexit() but that will need the connection to be made
* global and we want to avoid introducing global for this purpose.
*/
- before_shmem_exit(slotsync_failure_callback, PointerGetDatum(wrconn));
+ before_shmem_exit(slotsync_worker_disconnect, PointerGetDatum(wrconn));

The comment above this change still states "Register the failure callback once
we have the connection", I think it has to be reworded a bit now that v8 is
making use of slotsync_worker_disconnect().

2 ===

+ * Register slotsync_worker_onexit() before we register
+ * ReplicationSlotShmemExit() in BaseInit(), to ensure that during exit of
+ * slot sync worker, ReplicationSlotShmemExit() is called first, followed
+ * by slotsync_worker_onexit(). Startup process during promotion waits for

Worth to mention in shmem_exit() (where it "while (--before_shmem_exit_index >= 0)"
or before the shmem_exit() definition) that ReplSlotSyncWorkerMain() relies on
this LIFO behavior? (not sure if there is other "strong" LIFO requirement in
other part of the code).

3 ===

+ * Startup process during promotion waits for slot sync to finish and it
+ * does that by checking the 'syncing' flag.

worth to mention ShutDownSlotSync()?

4 ===

I did a few tests manually (launching ShutDownSlotSync() through gdb / with and
without sync worker and with / without pg_sync_replication_slots() running
concurrently) and it looks like it works as designed.

Having said that, the logic that is in place to take care of the corner cases
described up-thread seems reasonable to me.

Regards,

--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Japin Li 2024-04-19 05:25:05 Re: Cannot find a working 64-bit integer type on Illumos
Previous Message Michael Paquier 2024-04-19 05:06:04 Direct SSL connection with ALPN and HBA rules