From: | Bertrand Drouvot <bertranddrouvot(dot)pg(at)gmail(dot)com> |
---|---|
To: | Nathan Bossart <nathandbossart(at)gmail(dot)com> |
Cc: | John H <johnhyvr(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: Allow logical failover slots to wait on synchronous replication |
Date: | 2024-06-17 15:19:50 |
Message-ID: | ZnBUFvbkFdF3qSOB@ip-10-97-1-34.eu-west-3.compute.internal |
Lists: | pgsql-hackers |
Hi,
On Tue, Jun 11, 2024 at 10:00:46AM +0000, Bertrand Drouvot wrote:
> Hi,
>
> On Mon, Jun 10, 2024 at 09:25:10PM -0500, Nathan Bossart wrote:
> > On Mon, Jun 10, 2024 at 03:51:05PM -0700, John H wrote:
> > > The existing 'standby_slot_names' isn't great for users who are running
> > > clusters with quorum-based synchronous replicas. For instance, if
> > > the user has synchronous_standby_names = 'ANY 3 (A,B,C,D,E)' it's a
> > > bit tedious to have to reconfigure the standby_slot_names to set it to
> > > the most updated 3 sync replicas whenever different sync replicas start
> > > lagging. In the event that both GUCs are set, 'standby_slot_names' takes
> > > precedence.
> >
> > Hm. IIUC you'd essentially need to set standby_slot_names to "A,B,C,D,E"
> > to get the desired behavior today. That might ordinarily be okay, but it
> > could cause logical replication to be held back unnecessarily if one of the
> > replicas falls behind for whatever reason. A way to tie standby_slot_names
> > to synchronous replication instead does seem like it would be useful in
> > this case.
>
> FWIW, I have the same understanding and also think your proposal would be
> useful in this case.
A few random comments about v1:
1 ====
+ int mode = SyncRepWaitMode;
It's set to SyncRepWaitMode and then never changes. Is it worth getting rid of "mode"?
2 ====
+ static XLogRecPtr lsn[NUM_SYNC_REP_WAIT_MODE] = {InvalidXLogRecPtr};
I did some testing and saw that the lsn[] values were not always set to
InvalidXLogRecPtr right after the declaration. It looks like, in that case, we
should avoid relying on the compile-time initializer for the lsn[] values. What about:
1. remove the "static".
or
2. keep the static but set the lsn[] values after its declaration.
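To illustrate the concern, here is a minimal standalone sketch, not the patch itself. It assumes the behavior seen in testing comes from the usual C rule that a static local is initialized once and then keeps whatever was stored in it between calls. The typedef and macros are stand-ins for the PostgreSQL definitions, and both function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-ins for the PostgreSQL definitions (assumptions for this sketch). */
typedef uint64_t XLogRecPtr;
#define InvalidXLogRecPtr ((XLogRecPtr) 0)
#define NUM_SYNC_REP_WAIT_MODE 3

/*
 * Hypothetical: static array with only the declaration initializer.
 * The initializer is applied once, so a later call sees the value stored
 * by the previous call. Returns the value found on entry.
 */
XLogRecPtr
static_without_reset(int mode, XLogRecPtr new_lsn)
{
    static XLogRecPtr lsn[NUM_SYNC_REP_WAIT_MODE] = {InvalidXLogRecPtr};
    XLogRecPtr  seen_on_entry;

    assert(mode >= 0 && mode < NUM_SYNC_REP_WAIT_MODE);
    seen_on_entry = lsn[mode];
    lsn[mode] = new_lsn;
    return seen_on_entry;
}

/*
 * Hypothetical: option 2, keep the static but set the values after the
 * declaration, so every call starts from InvalidXLogRecPtr.
 */
XLogRecPtr
static_with_reset(int mode, XLogRecPtr new_lsn)
{
    static XLogRecPtr lsn[NUM_SYNC_REP_WAIT_MODE];
    XLogRecPtr  seen_on_entry;
    int         i;

    assert(mode >= 0 && mode < NUM_SYNC_REP_WAIT_MODE);
    for (i = 0; i < NUM_SYNC_REP_WAIT_MODE; i++)
        lsn[i] = InvalidXLogRecPtr;

    seen_on_entry = lsn[mode];
    lsn[mode] = new_lsn;
    return seen_on_entry;
}
```

Calling static_without_reset() twice with the same mode returns the first call's LSN on the second call, while static_with_reset() returns InvalidXLogRecPtr both times. Option 1 (removing the "static") would behave like the second function, provided the array is still initialized in its declaration.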
3 ====
- /*
- * Return false if not all the standbys have caught up to the specified
- * WAL location.
- */
- if (caught_up_slot_num != standby_slot_names_config->nslotnames)
- return false;
+ if (!XLogRecPtrIsInvalid(lsn[mode]) && lsn[mode] >= wait_for_lsn)
+ return true;
The lsn[] values have (or should have, see 2 above) just been initialized to
InvalidXLogRecPtr, so XLogRecPtrIsInvalid(lsn[mode]) will always return
true. I think this check is not needed then.
4 ====
> > > I did some very brief pgbench runs to compare the latency. Client instance
> > > was running pgbench and 10 logical clients while the Postgres box hosted
> > > the writer and 5 synchronous replicas.
> > > There's a hit to TPS
Out of curiosity, did you compare with standby_slot_names_from_syncrep set to off
and standby_slot_names not empty?
Regards,
--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com