From: | Kartyshov Ivan <i(dot)kartyshov(at)postgrespro(dot)ru> |
---|---|
To: | Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com> |
Cc: | Alexander Korotkov <aekorotkov(at)gmail(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, dilipbalaut(at)gmail(dot)com, smithpb2250(at)gmail(dot)com, pgsql-hackers(at)lists(dot)postgresql(dot)org |
Subject: | Re: [HACKERS] make async slave to wait for lsn to be replayed |
Date: | 2024-03-19 22:34:51 |
Message-ID: | a1893fee413a6a2f387f419049b29a99@postgrespro.ru |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
Bharath Rupireddy, thank you for you review.
But here is some points.
On 2024-03-16 10:02, Bharath Rupireddy wrote:
> 4.1 With invalid LSN succeeds, shouldn't it error out? Or at least,
> add a fast path/quick exit to WaitForLSN()?
> BEGIN AFTER '0/0';
In postgresql '0/0' is Valid pg_lsn, but it is always reached.
> 4.2 With an unreasonably high future LSN, BEGIN command waits
> unboundedly, shouldn't we check if the specified LSN is more than
> pg_last_wal_receive_lsn() error out?
> BEGIN AFTER '0/FFFFFFFF';
> SELECT pg_last_wal_receive_lsn() + 1 AS future_receive_lsn \gset
> BEGIN AFTER :'future_receive_lsn';
This case will give ERROR cause '0/FFFFFFFF' + 1 is invalid pg_lsn
> 4.3 With an unreasonably high wait time, BEGIN command waits
> unboundedly, shouldn't we restrict the wait time to some max value,
> say a day or so?
> SELECT pg_last_wal_receive_lsn() + 1 AS future_receive_lsn \gset
> BEGIN AFTER :'future_receive_lsn' WITHIN 100000;
Good idea, I put it 1 day. But this limit we should to discuss.
> 6.
> +/* Set all latches in shared memory to signal that new LSN has been
> replayed */
> +void
> +WaitLSNSetLatches(XLogRecPtr curLSN)
> +{
>
> I see this patch is waking up all the waiters in the recovery path
> after applying every WAL record, which IMO is a hot path. Is the
> impact of this change on recovery measured, perhaps using
> https://github.com/macdice/redo-bench or similar tools?
>
> 7. In continuation to comment #6, why not use Conditional Variables
> instead of proc latches to sleep and wait for all the waiters in
> WaitLSNSetLatches?
Waiters are stored in the array sorted by LSN. This help us to wake
only PIDs with replayed LSN. This saves us from scanning of whole
array. So it`s not so hot path.
Add some fixes
1) make waiting timeont more simple (as pg_terminate_backend())
2) removed the 1 minute wait because INTERRUPTS don’t arrive for a
long time, changed it to 0.5 seconds
3) add more tests
4) added and expanded sections in the documentation
5) add default variant of timeout
pg_wait_lsn(trg_lsn pg_lsn, delay int8 DEFAULT 0)
example: pg_wait_lsn('0/31B1B60') equal pg_wait_lsn('0/31B1B60', 0)
6) now big timeout will be restricted to 1 day (86400000ms)
CALL pg_wait_lsn('0/34FB5A1',10000000000);
WARNING: Timeout for pg_wait_lsn() restricted to 1 day
--
Ivan Kartyshov
Postgres Professional: www.postgrespro.com
Attachment | Content-Type | Size |
---|---|---|
v13_0001-Procedure-wait-lsn.patch | text/x-diff | 17.4 KB |
From | Date | Subject | |
---|---|---|---|
Next Message | Tom Lane | 2024-03-19 22:35:27 | Re: Possibility to disable `ALTER SYSTEM` |
Previous Message | Andrew Dunstan | 2024-03-19 22:24:47 | Re: Possibility to disable `ALTER SYSTEM` |