Re: Allow logical failover slots to wait on synchronous replication

From: shveta malik <shveta(dot)malik(at)gmail(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: John H <johnhyvr(at)gmail(dot)com>, Bertrand Drouvot <bertranddrouvot(dot)pg(at)gmail(dot)com>, Nathan Bossart <nathandbossart(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org, shveta malik <shveta(dot)malik(at)gmail(dot)com>
Date: 2024-07-26 09:58:08
Message-ID: CAJpy0uBV9ufvo+PHFWK59HgnMq_ywf8BOvCVBkjue0ZiRSC0ZA@mail.gmail.com
Lists: pgsql-hackers

On Tue, Jul 23, 2024 at 10:35 AM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
>
> On Tue, Jul 9, 2024 at 12:39 AM John H <johnhyvr(at)gmail(dot)com> wrote:
> >
> > > Out of curiosity, did you compare with standby_slot_names_from_syncrep set to off
> > > and standby_slot_names not empty?
> >
> > I didn't think 'standby_slot_names' would impact TPS as much since
> > it's not grabbing the SyncRepLock but here's a quick test.
> > Writer with 5 synchronous replicas, 10 pg_recvlogical clients and
> > pgbench all running from the same server.
> >
> > Command: pgbench -c 4 -j 4 -T 600 -U "ec2-user" -d postgres -r -P 5
> >
> > Result with: standby_slot_names =
> > 'replica_1,replica_2,replica_3,replica_4,replica_5'
> >
> > latency average = 5.600 ms
> > latency stddev = 2.854 ms
> > initial connection time = 5.503 ms
> > tps = 714.148263 (without initial connection time)
> >
> > Result with: standby_slot_names_from_syncrep = 'true',
> > synchronous_standby_names = 'ANY 3 (A,B,C,D,E)'
> >
> > latency average = 5.740 ms
> > latency stddev = 2.543 ms
> > initial connection time = 4.093 ms
> > tps = 696.776249 (without initial connection time)
> >
> > Result with nothing set:
> >
> > latency average = 5.090 ms
> > latency stddev = 3.467 ms
> > initial connection time = 4.989 ms
> > tps = 785.665963 (without initial connection time)
> >
> > Again I think it's possible to improve the synchronous numbers if we
> > cache but I'll try that out in a bit.
> >
>
> Okay, so the tests done till now conclude that we won't get the
> benefit by using 'standby_slot_names_from_syncrep'. Now, if we
> increase the number of standby's in both lists and still keep ANY 3 in
> synchronous_standby_names then the results may vary. We should try to
> find out if there is a performance benefit with the use of
> synchronous_standby_names in the normal configurations like the one
> you used in the above tests to prove the value of this patch.
>

I didn't fully understand the parameters mentioned above, specifically
what 'latency stddev' and 'latency average' represent. But shouldn't
we see the benefit/value of this patch in a setup where a particular
standby is slow in sending its response back to the primary (due to
network lag or other reasons), by measuring the latency in receiving
changes on failover-enabled logical subscribers? We can perform this
test with each of the settings below and, say, make D and E slow in
sending responses:
1) synchronous_standby_names = 'ANY 3 (A,B,C,D,E)'
2) standby_slot_names = 'A_slot, B_slot, C_slot, D_slot, E_slot'
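
To make the expectation concrete, here is a rough sketch in Python. It
is not part of the patch: the per-standby acknowledgement delays are
made-up numbers, and the function names are hypothetical. It assumes
that with 'ANY 3' the logical walsender can proceed once the three
fastest standbys have confirmed, whereas with standby_slot_names it
has to wait for every listed slot:

```python
# Hypothetical model: how long a failover-enabled logical walsender
# waits before it may send a change, given per-standby ack delays (ms).

def quorum_wait_ms(ack_ms, quorum):
    """Wait under 'ANY <quorum> (...)': the quorum-th fastest ack."""
    return sorted(ack_ms.values())[quorum - 1]

def all_slots_wait_ms(ack_ms):
    """Wait when every listed slot must confirm: the slowest ack."""
    return max(ack_ms.values())

# Made-up delays; D and E are the deliberately slow standbys.
ack_ms = {"A": 2.0, "B": 3.0, "C": 4.0, "D": 250.0, "E": 400.0}

print("ANY 3 quorum wait (ms):", quorum_wait_ms(ack_ms, 3))
print("all slots wait (ms):   ", all_slots_wait_ms(ack_ms))
```

If the real behaviour matches this model, setting (1) should hide the
delay of D and E from the logical subscribers while setting (2) should
expose it, which is the difference the test would need to measure.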

thanks
Shveta
