From: | Nathan Bossart <nathandbossart(at)gmail(dot)com> |
---|---|
To: | Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com> |
Cc: | Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>, cary(dot)huang(at)highgo(dot)ca, pgsql-hackers(at)lists(dot)postgresql(dot)org, satyanarlapuram(at)gmail(dot)com |
Subject: | Re: Switching XLog source from archive to streaming when primary available |
Date: | 2022-09-09 16:59:50 |
Message-ID: | 20220909165950.GB2254174@nathanxps13 |
Lists: | pgsql-hackers |
On Fri, Sep 09, 2022 at 12:14:25PM +0530, Bharath Rupireddy wrote:
> On Fri, Sep 9, 2022 at 10:57 AM Kyotaro Horiguchi
> <horikyota(dot)ntt(at)gmail(dot)com> wrote:
>> At Thu, 8 Sep 2022 10:53:56 -0700, Nathan Bossart <nathandbossart(at)gmail(dot)com> wrote in
>> > My general point is that we should probably offer some basic preventative
>> > measure against flipping back and forth between streaming and archive
>> > recovery while making zero progress. As I noted, maybe that's as simple as
>> > having WaitForWALToBecomeAvailable() attempt to restore a file from archive
>> > at least once before the new parameter forces us to switch to streaming
>> > replication. There might be other ways to handle this.
>>
>> +1.
>
> Hm. In that case, I think we can get rid of timeout based switching
> mechanism and have this behaviour - the standby can attempt to switch
> to streaming mode from archive, say, after fetching 1, 2 or a
> configurable number of WAL files. In fact, this is the original idea
> proposed by Satya in this thread.
IMO the timeout approach would be more intuitive for users. When it comes
to archive recovery, "WAL segment" isn't a standard unit of measure. WAL
segment size can differ between clusters, and WAL files can have different
amounts of data or take different amounts of time to replay. So I think it
would be difficult for the end user to decide on a value. However, even
the timeout approach has this sort of problem. If your parameter is set to
1 minute, but the current archived WAL file takes 5 minutes to recover, you
won't really be attempting streaming replication once a minute. That would
likely need to be documented.
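
To illustrate the timeout caveat, here is a rough sketch in Python. It is a toy model, not PostgreSQL code: the function name, its parameters, and the assumption that the timeout is only checked between segment restores (never in the middle of one) are all hypothetical and only meant to show the arithmetic.

```python
def simulate_switch_attempts(timeout_s, restore_times_s):
    """Return the times (in seconds) at which a standby would try streaming.

    Toy model with hypothetical names. Assumption: the timeout is only
    evaluated after a segment has been fully restored from the archive,
    so a slow restore delays the next attempt to switch.
    """
    attempts = []
    now = 0.0
    last_attempt = 0.0
    for restore_time in restore_times_s:
        # Restoring one archived segment is treated as uninterruptible.
        now += restore_time
        # Only after the restore finishes do we compare against the timeout.
        if now - last_attempt >= timeout_s:
            attempts.append(now)
            last_attempt = now
    return attempts


# Timeout of 1 minute, but each segment takes 5 minutes to recover:
slow = simulate_switch_attempts(60, [300, 300, 300])
print(slow)  # [300.0, 600.0, 900.0]

# Same timeout with segments that take 10 seconds each:
fast = simulate_switch_attempts(60, [10] * 12)
print(fast)  # [60.0, 120.0]
```

With slow segments the effective interval between attempts is 5 minutes even though the setting says 1 minute. Under this model at least one file is also restored from the archive before any switch is attempted, which is one way to avoid flipping between sources with zero progress.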
--
Nathan Bossart
Amazon Web Services: https://aws.amazon.com