From: | Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com> |
---|---|
To: | Nathan Bossart <nathandbossart(at)gmail(dot)com> |
Cc: | Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>, Cary Huang <cary(dot)huang(at)highgo(dot)ca>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, SATYANARAYANA NARLAPURAM <satyanarlapuram(at)gmail(dot)com> |
Subject: | Re: Switching XLog source from archive to streaming when primary available |
Date: | 2022-10-18 02:01:33 |
Message-ID: | CALj2ACUq_JebY-TSK=Z6xxj8u5oGDfO+5WE3rYYeaZRL_PbcaA@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Tue, Oct 11, 2022 at 8:40 AM Nathan Bossart <nathandbossart(at)gmail(dot)com> wrote:
>
> On Mon, Oct 10, 2022 at 11:33:57AM +0530, Bharath Rupireddy wrote:
> > On Mon, Oct 10, 2022 at 3:17 AM Nathan Bossart <nathandbossart(at)gmail(dot)com> wrote:
> >> I wonder if it would be better to simply remove this extra polling of
> >> pg_wal as a prerequisite to your patch. The existing commentary leads me
> >> to think there might not be a strong reason for this behavior, so it could
> >> be a nice way to simplify your patch.
> >
> > I don't think it's a good idea to remove that completely. As said
> > above, it might help someone, we never know.
>
> It would be great to hear whether anyone is using this functionality. If
> no one is aware of existing usage and there is no interest in keeping it
> around, I don't think it would be unreasonable to remove it in v16.
It seems like exhausting all the WAL in pg_wal before switching to
streaming after failing to fetch from archive is unremovable. I found
this after experimenting with it, here are my findings:
1. The standby has to recover initial WAL files in the pg_wal
directory even for the normal post-restart/first-time-start case, I
mean, in non-crash recovery case.
2. The standby received WAL files from primary (walreceiver just
writes and flushes the received WAL to WAL files under pg_wal)
pretty-fast and/or standby recovery is slow, say both the standby
connection to primary and archive connection are broken for whatever
reasons, then it has WAL files to recover in pg_wal directory.
I think the fundamental behaviour for the standy is that it has to
fully recover to the end of WAL under pg_wal no matter who copies WAL
files there. I fully understand the consequences of manually copying
WAL files into pg_wal, for that matter, manually copying/tinkering any
other files into/under the data directory is something we don't
recommend and encourage.
In summary, the standby state machine in WaitForWALToBecomeAvailable()
exhausts all the WAL in pg_wal before switching to streaming after
failing to fetch from archive. The v8 patch proposed upthread deviates
from this behaviour. Hence, attaching v9 patch that keeps the
behaviour as-is, that means, the standby exhausts all the WAL in
pg_wal before switching to streaming after fetching WAL from archive
for at least streaming_replication_retry_interval milliseconds.
Please review the v9 patch further.
--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
Attachment | Content-Type | Size |
---|---|---|
v9-0001-Allow-standby-to-switch-WAL-source-from-archive-t.patch | application/octet-stream | 20.7 KB |
From | Date | Subject | |
---|---|---|---|
Next Message | Peter Smith | 2022-10-18 02:35:47 | Re: Perform streaming logical transactions by background workers and parallel apply |
Previous Message | Michael Paquier | 2022-10-18 01:59:49 | Re: New "single-call SRF" APIs are very confusingly named |