Re: Replication failure, slave requesting old segments

From: Stephen Frost <sfrost(at)snowman(dot)net>
To: Phil Endecott <spam_from_pgsql_lists(at)chezphil(dot)org>
Cc: pgsql-general(at)lists(dot)postgresql(dot)org
Subject: Re: Replication failure, slave requesting old segments
Date: 2018-08-12 22:54:06
Message-ID: 20180812225406.GX3326@tamriel.snowman.net
Lists: pgsql-general

Greetings,

* Phil Endecott (spam_from_pgsql_lists(at)chezphil(dot)org) wrote:
> OK. I think this is perhaps a documentation bug, maybe a missing
> warning when the master reads its configuration, and maybe (as you say)
> a bad default value.

If we consider it to be an issue worthy of a change then we should
probably just change the default value, and maybe not even allow it to
be set lower than '1'.

> Specifically, section 26.2.5 of the docs says:
>
> "If you use streaming replication without file-based continuous archiving,
> the server might recycle old WAL segments before the standby has received
> them. If this occurs, the standby will need to be reinitialized from a new
> base backup. You can avoid this by setting wal_keep_segments to a value
> large enough to ensure that WAL segments are not recycled too early, or by
> configuring a replication slot for the standby. If you set up a WAL archive
> that's accessible from the standby, these solutions are not required, since
> the standby can always use the archive to catch up provided it retains enough
> segments."
>
> OR, maybe the WAL reader that processes the files that restore_command fetches
> could be smart enough to realise that it can skip over the gap at the end?

That strikes me as a whole lot more complication in something we'd
rather not introduce additional complications into without very good
reason. Then again, there was just a nearby discussion about how it'd
be nice if the backend could realize when a WAL file is complete, and I
do think that'll be more of an issue when users start configuring larger
WAL files, so perhaps we should figure out a way to handle this.

> Anyway. Do others agree that my issue was the result of wal_keep_segments=0 ?

Yeah, I've not spent much time actually looking at code around this, so
it'd be nice to get:

a) a reproducible case demonstrating it's happening
b) testing to see how long it's been this way
c) if setting wal_keep_segments=1 fixes it
d) perhaps some thought about whether we could address this some other way
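
As a starting point for (c), here is a minimal sketch that reports which
wal_keep_segments value a given postgresql.conf text would yield, so a test
setup can be confirmed to run with 0 or 1 before trying to reproduce the
failure. The function names are hypothetical. It assumes the default of 0
discussed above and that the last assignment in the file wins, and it
ignores include directives and anything set through ALTER SYSTEM.

```python
import re

# Assumption: the default discussed in this thread (wal_keep_segments = 0).
DEFAULT_WAL_KEEP_SEGMENTS = 0

# Matches "wal_keep_segments = 1", "wal_keep_segments 1" and quoted values,
# optionally followed by a trailing comment. Commented-out lines do not match.
_SETTING = re.compile(
    r"^\s*wal_keep_segments(?:\s*=\s*|\s+)'?(\d+)'?\s*(?:#.*)?$",
    re.IGNORECASE,
)


def effective_wal_keep_segments(conf_text):
    """Return the last wal_keep_segments value in conf_text, else the default."""
    value = DEFAULT_WAL_KEEP_SEGMENTS
    for line in conf_text.splitlines():
        match = _SETTING.match(line)
        if match:
            value = int(match.group(1))  # a later line overrides an earlier one
    return value


def uses_suspect_setting(conf_text):
    """True if the configuration keeps no extra segments (value below 1)."""
    return effective_wal_keep_segments(conf_text) < 1


if __name__ == "__main__":
    sample = "#wal_keep_segments = 0\nmax_wal_senders = 5\n"
    print(effective_wal_keep_segments(sample), uses_suspect_setting(sample))
```

Running SHOW wal_keep_segments on the live master remains the authoritative
check; the sketch only helps when comparing configuration files.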

Thanks!

Stephen
