Re: Switching XLog source from archive to streaming when primary available

From: John H <johnhyvr(at)gmail(dot)com>
To: Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>
Cc: Michael Paquier <michael(at)paquier(dot)xyz>, Nathan Bossart <nathandbossart(at)gmail(dot)com>, Japin Li <japinli(at)hotmail(dot)com>, Ian Lawrence Barwick <barwick(at)gmail(dot)com>, Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>, Cary Huang <cary(dot)huang(at)highgo(dot)ca>, SATYANARAYANA NARLAPURAM <satyanarlapuram(at)gmail(dot)com>, pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: Switching XLog source from archive to streaming when primary available
Date: 2024-08-29 20:58:31
Message-ID: CA+-JvFs3akgBD+hGN_paHbzVCj4qMj7QYKyQdsJekM7iAYt6yA@mail.gmail.com
Lists: pgsql-hackers

Hi,

On Thu, Aug 29, 2024 at 6:32 AM Bharath Rupireddy
<bharath(dot)rupireddyforpostgres(at)gmail(dot)com> wrote:
> In synchronous replication setup, until standby finishes fetching WAL
> from the archive, the commits on the primary have to wait which can
> increase the query latency. If the standby can connect to the primary
> as soon as the broken connection is restored, it can fetch the WAL
> soon and transaction commits can continue on the primary. Is my
> understanding correct? Is there anything more to this?
>

Yup, if you're running with synchronous_commit = 'on' and synchronous
standbys configured, the replica can keep streaming changes into pg_wal
faster than WAL replay consumes them, so commits may be unblocked sooner.

> I talked to Michael Paquier at PGConf.Dev 2024 and got some concerns
> about this feature for dealing with changing timelines. I can't think
> of them right now.

I'm not sure what the risk would be if the WAL/history files we sync
from streaming are the same as the ones we replay from archive.

> And, there were some cautions raised upthread -
> https://www.postgresql.org/message-id/20240305020452.GA3373526%40nathanxps13
> and https://www.postgresql.org/message-id/ZffaQt7UbM2Q9kYh%40paquier.xyz.

Yup agreed. I need to understand this area a lot better before I can
do a more in-depth review.

> Interesting. Yes, the restore script has to be smarter to detect the
> broken connections and distinguish whether the server is performing
> just the archive recovery/PITR or streaming from standby. Not doing it
> right, perhaps, can cause data loss (?).

I don't think there would be data loss; replay would only get stuck or
slow down. That is no different from what happens today if the restore
script returns a non-zero exit status.
The end user could configure their restore script to return a non-zero
status, based on some condition, to move to streaming.

> > > However,
> > > + * exhaust all the WAL present in pg_wal before switching. If successful,
> > > + * the state machine moves to XLOG_FROM_STREAM state, otherwise it falls
> > > + * back to XLOG_FROM_ARCHIVE state.
> >
> > I think I'm missing how this happens. Or what "successful" means. If I'm reading
> > it right, no matter what happens we will always move to
> > XLOG_FROM_STREAM based on how
> > the state machine works?
>
> Please have a look at some discussion upthread on exhausting pg_wal
> before switching -
> https://www.postgresql.org/message-id/20230119005014.GA3838170%40nathanxps13.
> Even today, the standby exhausts pg_wal before switching to streaming
> from the archive.
>

I'm getting caught on the word "successful". My rough understanding of
WaitForWALToBecomeAvailable is that once you're in XLOG_FROM_PG_WAL, if it
was unsuccessful for whatever reason, it will still transition to
XLOG_FROM_STREAM.
It does not loop back to XLOG_FROM_ARCHIVE if XLOG_FROM_PG_WAL fails.
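
To make my reading concrete, here is a toy model of the on-failure
transitions as I understand them. It is only a sketch: the enum reuses the
source names from the comment quoted above (XLOG_FROM_STREAM), the helper
next_source_after_failure() is hypothetical, and the real logic in
WaitForWALToBecomeAvailable has more conditions than this.

```c
/*
 * Toy model only, not the code in xlogrecovery.c.  The enum is a reduced
 * copy of the WAL source names used in this thread.
 */
typedef enum
{
    XLOG_FROM_ARCHIVE,
    XLOG_FROM_PG_WAL,
    XLOG_FROM_STREAM
} XLogSource;

/*
 * Hypothetical helper: given the source that just failed, return the
 * source the state machine tries next.
 */
static XLogSource
next_source_after_failure(XLogSource failed)
{
    switch (failed)
    {
        case XLOG_FROM_ARCHIVE:
        case XLOG_FROM_PG_WAL:
            /* Archive and pg_wal are exhausted together, then we stream. */
            return XLOG_FROM_STREAM;

        case XLOG_FROM_STREAM:
            /* A streaming failure is the only path back to the archive. */
            return XLOG_FROM_ARCHIVE;
    }
    return XLOG_FROM_ARCHIVE;   /* not reached; keeps compilers quiet */
}
```

In this model a failure in XLOG_FROM_PG_WAL never yields XLOG_FROM_ARCHIVE,
which is why the "otherwise it falls back to XLOG_FROM_ARCHIVE" wording
confuses me.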

> Nice catch. This is a problem. One idea is to disable
> streaming_replication_retry_interval feature for slot-less streaming
> replication - either when primary_slot_name isn't specified disallow
> the GUC to be set in assign_hook or when deciding to switch the wal
> source. Thoughts?

I don't think it's dependent on slot-less streaming. You would also run
into the issue if the WAL is no longer there on the primary, which can
occur with 'max_slot_wal_keep_size' as well.
IMO the guarantee we need to make is that when we transition from
XLOG_FROM_STREAM to XLOG_FROM_ARCHIVE for a "fresh start", we should
attempt to restore from archive at least once.
I think this means that wal_source_switch_state should be reset back to
SWITCH_TO_STREAMING_NONE whenever we transition to XLOG_FROM_ARCHIVE.
We've attempted the switch to streaming once, so let's not continually
retry if it failed.
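
Roughly what I have in mind, as a self-contained sketch rather than a
patch. Only wal_source_switch_state and SWITCH_TO_STREAMING_NONE come from
the patch; SWITCH_TO_STREAMING_PENDING, the RecoveryState struct and
set_current_source() are hypothetical names I made up for illustration.

```c
/* Reduced copy of the WAL source names used in this thread. */
typedef enum
{
    XLOG_FROM_ARCHIVE,
    XLOG_FROM_PG_WAL,
    XLOG_FROM_STREAM
} XLogSource;

typedef enum
{
    SWITCH_TO_STREAMING_NONE,       /* name taken from the patch */
    SWITCH_TO_STREAMING_PENDING     /* hypothetical "switch requested" value */
} WalSourceSwitchState;

/* Hypothetical container for the two pieces of state involved. */
typedef struct
{
    XLogSource            current_source;
    WalSourceSwitchState  wal_source_switch_state;
} RecoveryState;

/*
 * Hypothetical single choke point for changing the WAL source.  Any move
 * to XLOG_FROM_ARCHIVE clears the pending switch, so the archive is tried
 * at least once before another switch to streaming can be requested.
 */
static void
set_current_source(RecoveryState *state, XLogSource next)
{
    if (next == XLOG_FROM_ARCHIVE)
        state->wal_source_switch_state = SWITCH_TO_STREAMING_NONE;

    state->current_source = next;
}
```

Funnelling every source change through one function is just one way to
make sure the reset can't be missed; resetting at each place that assigns
XLOG_FROM_ARCHIVE would work too.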

Thanks,

--
John Hsu - Amazon Web Services
