Re: Allow logical failover slots to wait on synchronous replication

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: John H <johnhyvr(at)gmail(dot)com>
Cc: shveta malik <shveta(dot)malik(at)gmail(dot)com>, Bertrand Drouvot <bertranddrouvot(dot)pg(at)gmail(dot)com>, Nathan Bossart <nathandbossart(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Allow logical failover slots to wait on synchronous replication
Date: 2024-08-27 06:00:08
Message-ID: CAA4eK1KXhK3QeNv9h5=Nrf8o0bx0y1ZwJYr6O+KBRmdkcRZr1A@mail.gmail.com
Lists: pgsql-hackers

On Tue, Aug 27, 2024 at 12:58 AM John H <johnhyvr(at)gmail(dot)com> wrote:
>
> For instance, in Shveta's suggestion of
>
> > > > We can perform this test with both of the below settings and say make
> > > > D and E slow in sending responses:
> > > > 1) synchronous_standby_names = 'ANY 3 (A,B,C,D,E)'
> > > > 2) standby_slot_names = A_slot, B_slot, C_slot, D_slot, E_slot.
>
> if the server associated with E_slot is just down or undergoing
> some sort of maintenance, then all logical consumers would start lagging until
> the server is back up. I could also mimic a network lag of 20 seconds
> and it's guaranteed
> that this patch will perform better.
>

I wanted a simple test where in the first case you use
synchronous_standby_names = 'ANY 3 (A,B,C,D,E)' and in the second case
use standby_slot_names = A_slot, B_slot, C_slot, D_slot, E_slot. You
can try some variations of it as well. The idea is that even if the
performance is lower for the synchronous_standby_names configuration,
we should be able to document it. This will help users decide what is
best for them.
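For clarity, the two configurations being compared could look like the
following in postgresql.conf (the standby and slot names here are
placeholders from the example above; note that standby_slot_names was
renamed to synchronized_standby_slots during the PG17 cycle):

```
# Case 1: logical failover slots wait on synchronous replication
synchronous_standby_names = 'ANY 3 (A,B,C,D,E)'

# Case 2: logical failover slots wait on the listed physical slots
synchronized_standby_slots = 'A_slot, B_slot, C_slot, D_slot, E_slot'
```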

> I re-ran the benchmarks with a longer run time of 3 hours, and testing
> a new shared cache
> for walsenders to check the value before obtaining the SyncRepLock.
>
> I also saw I was being throttled on storage in my previous benchmarks
> so I moved to a new setup.
> I benchmarked a new test case with an additional shared cache between
> all the walsenders to
> reduce potential contention on SyncRepLock, and have attached said patch.
>
> Database: Writer on its own disk, 5 RRs on the other disk together
> Client: 10 logical clients, pgbench running from here as well
>
> 'pgbench -c 32 -j 4 -T 10800 -U "ec2-user" -d postgres -r -P 1'
>
> # Test failover_slots with synchronized_standby_slots = 'rr_1, rr_2,
> rr_3, rr_4, rr_5'
> latency average = 10.683 ms
> latency stddev = 11.851 ms
> initial connection time = 145.876 ms
> tps = 2994.595673 (without initial connection time)
>
> # Test failover_slots waiting on sync_rep no new shared cache
> latency average = 10.684 ms
> latency stddev = 12.247 ms
> initial connection time = 142.561 ms
> tps = 2994.160136 (without initial connection time)
> statement latencies in milliseconds and failures:
>
> # Test failover slots with additional shared cache
> latency average = 10.674 ms
> latency stddev = 11.917 ms
> initial connection time = 142.486 ms
> tps = 2997.315874 (without initial connection time)
>
> The tps improvement between no cache and shared_cache seems marginal, but we do
> see the slight improvement in stddev which makes sense from a
> contention perspective.
>

What is the difference between "Test failover_slots with
synchronized_standby_slots = 'rr_1, rr_2, rr_3, rr_4, rr_5'" and "Test
failover_slots waiting on sync_rep no new shared cache"? I want to
know what configuration you used for synchronous_standby_names in the
latter case.

> I think the cache would demonstrate a lot more improvement if we had
> say 1000 logical slots
> and all of them are trying to obtain SyncRepLock for updating its values.
>
> I've attached the patch but don't feel particularly strongly about the
> new shared LSN values.
>

I am also not sure, especially as the test results didn't show much
improvement and the code also becomes a bit complicated. BTW, in the
0003 version, in the below code:
+ /* Cache values to reduce contention */
+ LWLockAcquire(SyncRepLock, LW_SHARED);
+ memcpy((XLogRecPtr *) walsndctl->cached_lsn, (XLogRecPtr *) walsndctl->lsn, sizeof(lsn));
+ LWLockRelease(SyncRepLock);

Which mode's lsn is being copied? I am not sure I understood this
part of the code.

In the 0002 version, in the following code [1], you are referring to
the LSN mode that is enabled for the logical walsender, irrespective
of the mode used by the physical walsender. It is possible that they
are always the same, but that is not evident from the code or comments
in the patch.
[1] :
+ /* Cache values to reduce contention on lock */
+ for (i = 0; i < NUM_SYNC_REP_WAIT_MODE; i++)
+ {
+ lsn[i] = walsndctl->lsn[i];
+ }

- ss_oldest_flush_lsn = min_restart_lsn;
+ LWLockRelease(SyncRepLock);

- return true;
+ if (lsn[mode] >= wait_for_lsn)
+ return true;

--
With Regards,
Amit Kapila.
