Re: Fwd: standby stop replicating, then picked back up

From: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
To: Laurenz Albe <laurenz(dot)albe(at)cybertec(dot)at>
Cc: chris kim <chrisk(at)propaas(dot)com>, PostgreSQL mailing lists <pgsql-general(at)postgresql(dot)org>
Subject: Re: Fwd: standby stop replicating, then picked back up
Date: 2017-11-08 00:32:19
Message-ID: CAB7nPqQ+C7Aob7kFGmczZbA3SzhN6q8x+FQucMKB-hsEX+N-Dw@mail.gmail.com
Lists: pgsql-general pgsql-in-general

On Wed, Nov 8, 2017 at 5:17 AM, Laurenz Albe <laurenz(dot)albe(at)cybertec(dot)at> wrote:
> chris kim wrote:
>> I had a standby hang for a while, not replicating, but then it fixed
>> itself but I'm not sure why it happened in the first place. What would I
>> look into to see why this happened, or any insight into why is greatly
>> appreciated.
>
> You give us precious little information.
>
> If there is nothing suspicious in the log, and hot standby is enabled,
> and the standby is configured appropriately, it could be that a conflicting
> query on the standby blocked WAL application for a while.

My understanding of the question is this: if a standby is stopped for
a long time, can it catch up automatically? That mainly depends on
whether the WAL segments it needs have already been recycled on the
primary (or on the upstream standby, with cascading streaming). In
short, once the primary completes two checkpoints, it recycles
(renames) or removes the older WAL segments in pg_xlog that it no
longer needs for its own recovery, because it can already recover to a
consistent state without them.
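
To make that concrete, here is a minimal sketch of the check involved:
does the segment holding the standby's last replayed position still
exist on the primary? It assumes the default 16MB segment size and LSNs
in the usual "hi/lo" hexadecimal text form. The function names and the
sample values are hypothetical, not anything PostgreSQL provides.

```python
# Assumption: default WAL segment size of 16MB.
WAL_SEGMENT_SIZE = 16 * 1024 * 1024


def lsn_to_int(lsn: str) -> int:
    """Convert an LSN written as 'hi/lo' in hex into a byte position."""
    hi, lo = lsn.split("/")
    return (int(hi, 16) << 32) | int(lo, 16)


def lag_bytes(primary_lsn: str, standby_lsn: str) -> int:
    """Bytes of WAL the standby still has to replay."""
    return lsn_to_int(primary_lsn) - lsn_to_int(standby_lsn)


def segment_number(lsn: str) -> int:
    """Index of the WAL segment that contains the given LSN."""
    return lsn_to_int(lsn) // WAL_SEGMENT_SIZE


def can_catch_up(standby_lsn: str, oldest_segment_on_primary: int) -> bool:
    """True if the segment the standby needs next has not been recycled."""
    return segment_number(standby_lsn) >= oldest_segment_on_primary


if __name__ == "__main__":
    # Made-up positions, for illustration only.
    print(lag_bytes("0/5000000", "0/3000028"))
    print(can_catch_up("0/3000028", oldest_segment_on_primary=4))
```

If can_catch_up() returns False, streaming alone cannot bring the
standby back, and the missing segments have to come from somewhere
else.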

If the standby uses a replication slot, the primary retains the WAL
that the standby still needs, so the standby can reconnect later. The
catch is that the primary's pg_xlog must not get bloated too much: if
the partition where pg_xlog is located fills up, the primary goes down
because of space exhaustion. A WAL archive can be worthwhile if
standbys are taken down for a long time, though: with a proper
restore_command, or by copying the needed range of WAL segments, you
can let a standby recover from an earlier point. Which strategy to
adopt mainly depends on whether taking a full backup is more costly
than fetching and replaying a range of WAL segments, so the size of the
primary's data folder matters in the decision.
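
As a rough sketch of the two trade-offs above: how long a replication
slot can hold WAL before the pg_xlog partition fills, and whether
replaying the missing WAL range is cheaper than a fresh base backup.
Comparing raw sizes is a deliberate simplification (replay speed,
network and I/O are ignored), and all names and numbers here are
hypothetical.

```python
def seconds_until_disk_full(free_bytes: int, wal_bytes_per_second: float) -> float:
    """Time left before retained WAL exhausts the pg_xlog partition."""
    if wal_bytes_per_second <= 0:
        return float("inf")  # no WAL being generated, nothing accumulates
    return free_bytes / wal_bytes_per_second


def cheaper_resync(data_dir_bytes: int, missing_wal_bytes: int) -> str:
    """Pick the smaller transfer: the missing WAL range or a full backup."""
    if missing_wal_bytes < data_dir_bytes:
        return "wal_range"
    return "base_backup"


if __name__ == "__main__":
    gib = 1024 ** 3
    # Example: 50 GiB free, 2 MiB/s of WAL generated on the primary.
    print(seconds_until_disk_full(50 * gib, 2 * 1024 ** 2) / 3600, "hours")
    # Example: 200 GiB data folder, 30 GiB of WAL missing on the standby.
    print(cheaper_resync(200 * gib, 30 * gib))
```

With a small data folder and a standby that has been down for a long
time, the comparison tips towards a new base backup.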
--
Michael
