Re: Synchronizing slots from primary to standby

From: Bertrand Drouvot <bertranddrouvot(dot)pg(at)gmail(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: shveta malik <shveta(dot)malik(at)gmail(dot)com>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, "Zhijie Hou (Fujitsu)" <houzj(dot)fnst(at)fujitsu(dot)com>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>, Peter Smith <smithpb2250(at)gmail(dot)com>, Nisha Moond <nisha(dot)moond412(at)gmail(dot)com>, "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>, Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>, Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>, Bruce Momjian <bruce(at)momjian(dot)us>, Ashutosh Sharma <ashu(dot)coek88(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Ajin Cherian <itsajin(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>
Subject: Re: Synchronizing slots from primary to standby
Date: 2024-01-22 07:41:19
Message-ID: Za4cH83vTyZYNcYK@ip-10-97-1-34.eu-west-3.compute.internal
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

On Fri, Jan 19, 2024 at 05:23:53PM +0530, Amit Kapila wrote:
> On Wed, Jan 17, 2024 at 4:00 PM shveta malik <shveta(dot)malik(at)gmail(dot)com> wrote:
> >
>
> Now, the concerns related to this could be that users would probably
> need to change existing mechanisms/tools to update priamry_conninfo

Yeah, for the ones that want the sync slot feature.

> and one of the alternatives proposed is to have an additional GUC like
> slot_sync_dbname. Users won't be able to drop the database this worker
> is connected to aka whatever is specified in slot_sync_dbname but as
> the user herself sets up the configuration it shouldn't be a big deal.

Same point of view here.

> Then we also discussed whether extending libpqwalreceiver's connect
> API is a good idea and whether we need to further extend it in the
> future. As far as I can see, slotsync worker's primary requirement is
> to execute SQL queries which the current API is sufficient, and don't
> see something that needs any drastic change in this API. Note that
> tablesync worker that executes SQL also uses these APIs, so we may
> need something in the future for either of those. Then finally we need
> a slotsync worker to also connect to a database to use SQL and fetch
> results.
>

On my side the nits concerns about using the libpqrcv_connect / walrcv_connect are:

- cosmetic: the "rcv" do not really align with the sync slot worker
- we're using a WalReceiverConn, while a PGconn should suffice. From what I can
see the "overhead" is (1 byte + 7 bytes hole + 8 bytes). I don't think that's a
big deal even if we switch to a multi sync slot worker design later on.

Those have already been discussed in [1] and I'm fine with them.

> Now, let us consider if we extend the replication commands like
> READ_REPLICATION_SLOT and or introduce a new set of replication
> commands to fetch the required information then we don't need a DB
> connection with primary or a connection in slotsync worker. As per my
> current understanding, it is quite doable but I think we will slowly
> go in the direction of making replication commands something like SQL
> because today we need to extend it to fetch all slots info that have
> failover marked as true, the existence of a particular replication,
> etc. Then tomorrow, if we want to extend this work to have multiple
> slotsync workers say workers perdb then we have to extend the
> replication command to fetch per-database failover marked slots. To
> me, it sounds more like we are slowly adding SQL-like features to
> replication commands.

Agree. Also it seems to me that extending the replication commands is more like
a one-way door change.

> Apart from this when we are reading per-db replication slots without
> connecting to a database, we probably need some additional protection
> mechanism so that the database won't get dropped.
>
> Considering all this it seems that for now probably extending
> replication commands can simplify a few things like mentioned above
> but using SQL's with db-connection is more extendable.

I'd vote for using a SQL db-connection (like we are doing currently).
It seems more extendable and more a two-way door (as compared to extending the
replication commands): I think it still gives us the flexibility to switch to
extending the replication commands if we want to in the future.

[1]: https://www.postgresql.org/message-id/ZZe6sok7IWmhKReU%40ip-10-97-1-34.eu-west-3.compute.internal

Regards,

--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Michael Paquier 2024-01-22 07:52:52 Re: introduce dynamic shared memory registry
Previous Message Michael Paquier 2024-01-22 07:36:27 Re: Use of backup_label not noted in log