From: | "Euler Taveira" <euler(at)eulerto(dot)com> |
---|---|
To: | "Amit Kapila" <amit(dot)kapila16(at)gmail(dot)com> |
Cc: | "Tomas Vondra" <tomas(dot)vondra(at)enterprisedb(dot)com>, "kuroda(dot)hayato(at)fujitsu(dot)com" <kuroda(dot)hayato(at)fujitsu(dot)com>, "Peter Eisentraut" <peter(at)eisentraut(dot)org>, "Bharath Rupireddy" <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>, "Shlok Kyal" <shlok(dot)kyal(dot)oss(at)gmail(dot)com>, "pgsql-hackers(at)lists(dot)postgresql(dot)org" <pgsql-hackers(at)lists(dot)postgresql(dot)org>, "Michael Paquier" <michael(at)paquier(dot)xyz>, "Andres Freund" <andres(at)anarazel(dot)de>, "Ashutosh Bapat" <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>, Fabrízio de Royes Mello <fabriziomello(at)gmail(dot)com>, "vignesh C" <vignesh21(at)gmail(dot)com> |
Subject: | Re: speed up a logical replica setup |
Date: | 2024-04-29 11:53:19 |
Message-ID: | 2e86cf53-c806-49d2-9a10-1a1257776d1f@app.fastmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Mon, Apr 29, 2024, at 6:56 AM, Amit Kapila wrote:
> On Wed, Mar 27, 2024 at 1:47 AM Euler Taveira <euler(at)eulerto(dot)com> wrote:
> >
> > On Tue, Mar 26, 2024, at 4:12 PM, Tomas Vondra wrote:
> >
> > Perhaps I'm missing something, but why is NUM_CONN_ATTEMPTS even needed?
> > Why isn't recovery_timeout enough to decide if wait_for_end_recovery()
> > waited long enough?
> >
> >
> > It was an attempt to decoupled a connection failure (that keeps streaming the
> > WAL) from recovery timeout. The NUM_CONN_ATTEMPTS guarantees that if the primary
> > is gone during the standby recovery process, there is a way to bail out.
> >
>
> I think we don't need to check primary if the WAL corresponding to
> consistent_lsn is already present on the standby. Shouldn't we first
> check that? Once we ensure that the required WAL is copied, just
> checking server_is_in_recovery() should be sufficient. I feel that
> will be a direct way of ensuring what is required rather than
> indirectly verifying the same (by checking pg_stat_wal_receiver) as we
> are doing currently.
How would you check it? WAL file? During recovery, you are not allowed to use
pg_current_wal_lsn.
Tomas suggested to me off-list that we should adopt a simple solution in
wait_for_end_recovery: wait for recovery_timeout without additional checks
(which means remove the pg_stat_wal_receiver logic). When we have additional
information that we can reliably use in this function, we can add it. Hence, it
is also easy to adjust the PG_TEST_TIMEOUT_DEFAULT to have stable tests.
--
Euler Taveira
EDB https://www.enterprisedb.com/
From | Date | Subject | |
---|---|---|---|
Next Message | Zhijie Hou (Fujitsu) | 2024-04-29 11:58:09 | RE: Synchronizing slots from primary to standby |
Previous Message | Amit Kapila | 2024-04-29 10:58:17 | Re: speed up a logical replica setup |